support resolve specified lock keys #10292
Conversation
Signed-off-by: zhangjinpeng1987 <[email protected]>
Signed-off-by: zhangjinpeng1987 <[email protected]>
Signed-off-by: zhangjinpeng1987 <[email protected]>
store/tikv/lock_resolver.go
Outdated
@@ -375,6 +381,7 @@ func (lr *LockResolver) getTxnStatus(bo *Backoffer, txnID uint64, primary []byte

func (lr *LockResolver) resolveLock(bo *Backoffer, l *Lock, status TxnStatus, cleanRegions map[RegionVerID]struct{}) error {
	tikvLockResolverCountWithResolveLocks.Inc()
	cleanWholeRegion := true
`cleanWholeRegion := l.TxnSize >= BigTxnThreshold`, no need to update it in the loop.
We need to distinguish between normal resolve and lite resolve in the metrics.
Good point.
I prefer to set `cleanWholeRegion = false` in the resolve-lock-lite code block; that makes the logic clearer.
Hmm, I don't think that makes the logic clearer. A variable that is never reassigned is easier to follow.
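To make the suggestion above concrete, here is a minimal sketch of the decision, not the PR's actual `resolveLock` code: `bigTxnThreshold`, the `lock` type, and the print statements are placeholders, and only the `TxnSize`-based comparison mirrors the review comments.

```go
package main

import "fmt"

// Placeholder threshold; the review below settles on a small value like 16.
const bigTxnThreshold = 16

// lock is a minimal stand-in for the resolver's Lock metadata.
type lock struct {
	key     string
	txnSize uint64 // how many keys the owning transaction wrote
}

// resolveLockSketch illustrates the branch under discussion: decide once,
// from the transaction size, whether the whole region can be marked clean,
// instead of mutating the flag inside the resolve loop. Separate counters
// (normal vs. lite) would let the metrics distinguish the two paths.
func resolveLockSketch(l lock) (cleanWholeRegion bool) {
	cleanWholeRegion = l.txnSize >= bigTxnThreshold
	if cleanWholeRegion {
		// Normal resolve: scan the region and resolve every lock of the txn.
		fmt.Printf("normal resolve for txn owning %q\n", l.key)
	} else {
		// Lite resolve: only this key is resolved, so the region cannot be
		// recorded as fully cleaned.
		fmt.Printf("lite resolve of key %q only\n", l.key)
	}
	return cleanWholeRegion
}

func main() {
	fmt.Println(resolveLockSketch(lock{key: "k1", txnSize: 3}))   // false: lite resolve
	fmt.Println(resolveLockSketch(lock{key: "k2", txnSize: 300})) // true: normal resolve
}
```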
store/tikv/lock_resolver.go
Outdated
@@ -33,6 +33,9 @@ import (
// ResolvedCacheSize is max number of cached txn status.
const ResolvedCacheSize = 2048

// A transaction that involves more keys than this threshold is treated as a `Big Transaction`.
const BigTxnThreshold = 256
I think it is too large. A cop request may need 200+ round trips to resolve the locks.
Yes, I changed txn_size to keys_in_region; it is more accurate.
Signed-off-by: zhangjinpeng1987 <[email protected]>
Signed-off-by: zhangjinpeng1987 <[email protected]>
Signed-off-by: zhangjinpeng1987 <[email protected]>
Signed-off-by: zhangjinpeng1987 <[email protected]>
Signed-off-by: zhangjinpeng1987 <[email protected]>
Signed-off-by: zhangjinpeng1987 <[email protected]>
Signed-off-by: zhangjinpeng1987 <[email protected]>
Signed-off-by: zhangjinpeng1987 <[email protected]>
Codecov Report
@@             Coverage Diff              @@
##            master     #10292      +/-  ##
================================================
+ Coverage   79.5906%   79.6371%   +0.0465%
================================================
  Files           415        415
  Lines         88131      88141        +10
================================================
+ Hits          70144      70193        +49
+ Misses        12801      12768        -33
+ Partials       5186       5180         -6
@disksing @tiancaiamao PTAL
Signed-off-by: zhangjinpeng1987 <[email protected]>
… into light-resolve-lock
@@ -477,6 +477,7 @@ func (c *twoPhaseCommitter) buildPrewriteRequest(batch batchKeys) *tikvrpc.Reque
		LockTtl:           c.lockTTL,
		IsPessimisticLock: isPessimisticLock,
		ForUpdateTs:       c.forUpdateTS,
		TxnSize:           uint64(len(batch.keys)),
I think using `len(c.keys)` is better. It is possible that many batches write to the same region, because we limit each batch to at most 16KB.
You should also consider the case of a batch write whose index keys are discrete, possibly one region per index key.
How about setting `bigTxnThreshold` to a smaller number like 16?
ok
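As a rough sketch of the reviewers' suggestion, rather than the final code of this PR, the prewrite request could report the whole transaction's key count instead of a single batch's. The struct and type names below are simplified placeholders; only `TxnSize` is taken from the diff above.

```go
package main

import "fmt"

// prewriteRequest is a simplified stand-in for the real prewrite request;
// only TxnSize mirrors the field added in the diff above.
type prewriteRequest struct {
	Mutations [][]byte
	TxnSize   uint64
}

// committer is a simplified stand-in for twoPhaseCommitter.
type committer struct {
	keys [][]byte // all keys written by the transaction
}

// buildPrewriteRequestSketch follows the review suggestion: report the whole
// transaction's key count (len(c.keys)) instead of the batch's key count,
// since one region may receive many small (<=16KB) batches or many discrete
// index keys.
func (c *committer) buildPrewriteRequestSketch(batch [][]byte) prewriteRequest {
	return prewriteRequest{
		Mutations: batch,
		TxnSize:   uint64(len(c.keys)),
	}
}

func main() {
	c := &committer{keys: [][]byte{[]byte("a"), []byte("b"), []byte("c")}}
	req := c.buildPrewriteRequestSketch([][]byte{[]byte("a")})
	fmt.Println(req.TxnSize) // 3: the whole txn's key count, not the batch's
}
```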
Signed-off-by: zhangjinpeng1987 <[email protected]>
Signed-off-by: zhangjinpeng1987 <[email protected]>
@tiancaiamao @lysu PTAL
LGTM
It will be useful to add a config option for the `bigTxnThreshold`.
LGTM |
/run-all-tests
only resolve specified keys for small txn
What problem does this PR solve?

Currently the `ResolveLock` command scans the whole range and tries to resolve every key the transaction involved. When read-write conflict is heavy in a hot region, `ResolveLock` is called very frequently, which puts the TiKV `scheduler` under very heavy load, and write requests are then blocked for a long while.

What is changed and how it works?
Add a `TxnSize` field to the `Prewrite` request to indicate how many keys this transaction involved, so that only the specified lock keys are resolved for a small transaction.

Should merge after pingcap/kvproto#386 and tikv/tikv#4589.
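Putting the pieces together, here is a hedged end-to-end sketch of how a small transaction's locks could be resolved by key rather than by scanning the region. The request shape and the `Keys` field below are assumptions about the accompanying kvproto change (pingcap/kvproto#386), not its confirmed API.

```go
package main

import "fmt"

// resolveLockRequest is a hypothetical stand-in for the ResolveLock request
// after pingcap/kvproto#386; the Keys field name is an assumption.
type resolveLockRequest struct {
	StartVersion  uint64
	CommitVersion uint64
	Keys          [][]byte // empty means "scan and resolve the whole region"
}

const bigTxnThreshold = 16 // placeholder value from the review discussion

// buildResolveLockRequest resolves only the specified key when the
// transaction is small, so a hot region is not repeatedly scanned end to end.
func buildResolveLockRequest(startTS, commitTS, txnSize uint64, lockedKey []byte) resolveLockRequest {
	req := resolveLockRequest{StartVersion: startTS, CommitVersion: commitTS}
	if txnSize < bigTxnThreshold {
		// Small transaction: lite resolve of just the conflicting key.
		req.Keys = [][]byte{lockedKey}
	}
	return req
}

func main() {
	small := buildResolveLockRequest(100, 101, 3, []byte("k"))
	big := buildResolveLockRequest(100, 101, 1000, []byte("k"))
	fmt.Println(len(small.Keys), len(big.Keys)) // 1 0
}
```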