
ddl: add slow operations log to add indices, make troubleshooting easier #7060

Merged
merged 8 commits from logs_metrics_for_adding_indices into pingcap:master on Jul 17, 2018
Conversation

@winkyao (Contributor) commented Jul 16, 2018

What have you changed? (mandatory)

Log slow operations while DDL is adding indices. When an add-index job becomes slow, these logs help us find where the time is going.

What is the type of the changes? (mandatory)

  • Improvement (non-breaking change which is an improvement to an existing feature)
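
The diff excerpts below call a w.logSlowOperations helper whose body is not shown in this thread. The following is only a minimal sketch of what such a helper might look like, assuming the threshold is in milliseconds and a threshold argument of 0 means "use the configured ddl-slow-opr-threshold"; the names and details are illustrative, not the actual TiDB code:

```go
package main

import (
	"log"
	"time"
)

// ddlSlowOprThreshold stands in for the ddl-slow-opr-threshold config value,
// interpreted here as milliseconds (an assumption for this sketch).
var ddlSlowOprThreshold uint = 300

// logSlowOperations logs an operation that took longer than the given
// threshold (in milliseconds); 0 falls back to the configured default.
func logSlowOperations(elapsed time.Duration, slowMsg string, threshold uint) {
	if threshold == 0 {
		threshold = ddlSlowOprThreshold
	}
	if elapsed >= time.Duration(threshold)*time.Millisecond {
		log.Printf("[ddl] SLOW-OPERATIONS elapsed=%v msg=%q", elapsed, slowMsg)
	}
}

func main() {
	start := time.Now()
	time.Sleep(350 * time.Millisecond) // stand-in for a slow row scan
	logSlowOperations(time.Since(start), "iterateSnapshotRows in fetchRowColVals", 0)
}
```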

config/config.go Outdated
SlowThreshold: 300,
ExpensiveThreshold: 10000,
QueryLogMaxLen: 2048,
DDLSlowOprThreshold: 300,
Member: This is 300ms?

@@ -63,6 +63,8 @@ expensive-threshold = 10000
# Maximum query length recorded in log.
query-log-max-len = 2048

ddl-slow-opr-threshold = 300
Member: Add the time unit in the comment.

ddl/index.go Outdated
err := iterateSnapshotRows(w.sessCtx.GetStore(), w.table, txn.StartTS(), taskRange.startHandle,
func(handle int64, recordKey kv.Key, rawRow []byte) (bool, error) {
oprEndTime := time.Now()
w.logSlowOperations(oprEndTime.Sub(oprStartTime), "iterateSnapshotRows in fetchRowColVals is slow", 0)
Member: "is slow" is redundant with "SLOW-OPERATIONS".

Contributor: Ditto.

ddl/index.go Outdated
@@ -624,6 +642,7 @@ func (w *addIndexWorker) backfillIndexInTxn(handleRange reorgIndexTask) (nextHan
}
return nil
})
w.logSlowOperations(time.Since(oprStartTime), "backfillIndexInTxn is slow", 3000)
Member: Ditto.

Contributor: Could we define a constant for 3000?
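
As an illustration of that suggestion only (the constant name below is hypothetical, not an identifier from the PR), the change would amount to something like:

```go
package ddl // illustrative placement

// backfillTaskSlowThresholdMs is a hypothetical name for the suggested
// constant: log backfillIndexInTxn calls that take longer than 3 seconds.
const backfillTaskSlowThresholdMs uint = 3000

// The call site would then read roughly:
//
//	w.logSlowOperations(time.Since(oprStartTime), "backfillIndexInTxn", backfillTaskSlowThresholdMs)
```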

config/config.go Outdated
@@ -84,6 +84,9 @@ type Log struct {
SlowThreshold uint `toml:"slow-threshold" json:"slow-threshold"`
ExpensiveThreshold uint `toml:"expensive-threshold" json:"expensive-threshold"`
QueryLogMaxLen uint `toml:"query-log-max-len" json:"query-log-max-len"`

// DDLSlowOprThreshold is the threshold of ddl slow operations.
DDLSlowOprThreshold uint `toml:"ddl-slow-opr-threshold" json:"ddl-slow-opr-threshold"`
Member: I think if we make this configurable, it's better to use the HTTP API, which doesn't require a restart of the server.

Member: Then the config option can be removed.
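
A minimal sketch of that suggestion, assuming the threshold lives in an atomically updated package variable and is exposed through a hypothetical /settings/ddl-slow-opr-threshold endpoint (neither the endpoint nor the variable wiring is taken from TiDB's actual status API):

```go
package main

import (
	"fmt"
	"net/http"
	"strconv"
	"sync/atomic"
)

// ddlSlowOprThreshold is the threshold in milliseconds; readers would use
// atomic.LoadUint32 wherever slow DDL operations are logged.
var ddlSlowOprThreshold uint32 = 300

func main() {
	// Hypothetical endpoint for reading and updating the threshold at runtime.
	http.HandleFunc("/settings/ddl-slow-opr-threshold", func(w http.ResponseWriter, r *http.Request) {
		switch r.Method {
		case http.MethodGet:
			fmt.Fprintf(w, "%d\n", atomic.LoadUint32(&ddlSlowOprThreshold))
		case http.MethodPost:
			v, err := strconv.ParseUint(r.FormValue("value"), 10, 32)
			if err != nil {
				http.Error(w, "invalid value", http.StatusBadRequest)
				return
			}
			atomic.StoreUint32(&ddlSlowOprThreshold, uint32(v))
		default:
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
		}
	})
	http.ListenAndServe(":10080", nil) // illustrative port
}
```

With a runtime endpoint like this, the toml option could indeed be dropped, or kept only as the initial value.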

@winkyao (Contributor, Author) commented Jul 16, 2018

@shenli @coocood PTAL

ddl/index.go
@@ -624,6 +641,7 @@ func (w *addIndexWorker) backfillIndexInTxn(handleRange reorgIndexTask) (nextHan
}
return nil
})
w.logSlowOperations(time.Since(oprStartTime), "backfillIndexInTxn", 3000)
Member: This is a constant threshold?

Contributor (Author): I think 3s is large enough, so it doesn't need to be configurable.

@coocood (Member) commented Jul 16, 2018

LGTM

@coocood added the status/LGT1 label (Indicates that a PR has LGTM 1.) on Jul 16, 2018
@coocood (Member) commented Jul 16, 2018

/run-all-tests

@crazycs520 (Contributor) left a comment

LGTM

@crazycs520 added the status/LGT2 label and removed the status/LGT1 label on Jul 16, 2018
@coocood merged commit 43f0c1e into pingcap:master on Jul 17, 2018
@winkyao deleted the logs_metrics_for_adding_indices branch on July 17, 2018 at 02:37
Labels: status/LGT2 (Indicates that a PR has LGTM 2.)
6 participants