tikv: fix infinite retry when kv region continuing to return StaleCommand error (#16481) #16529

Merged
merged 6 commits into from
Apr 23, 2020

sre-bot
Contributor

@sre-bot sre-bot commented Apr 17, 2020

cherry-pick #16481 to release-3.1


What problem does this PR solve?

Issue Number: close #16524

Problem Summary:

A TiKV region reports a StaleCommand error when it cannot catch up with the log, and expects TiDB to retry the request.

However, the StaleCommand path did not go through the backoff utility, so the retry loops forever when a KV region keeps returning StaleCommand.

It can be reproduced by TestOnRegionError.

As a result, SQL statements that hit those regions block forever.

What is changed and how it works?

What's Changed:

Back off briefly before retrying a stale command.

How it Works:

Each retry now goes through the backoff, so the retry loop can be broken by maxSleepTime or by a kill command.
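
The idea can be illustrated with a minimal, self-contained Go sketch. It is not the code from this PR: `Backoffer`, `sendWithRetry`, `ErrStaleCommand`, `ErrMaxSleepExceeded`, and `demo` are hypothetical names, the fixed 1 ms sleep and 5 ms budget are arbitrary, and context cancellation is assumed here as a stand-in for a kill command.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// ErrStaleCommand stands in for the StaleCommand region error (hypothetical name).
var ErrStaleCommand = errors.New("stale command")

// ErrMaxSleepExceeded reports that the total sleep budget is used up (hypothetical name).
var ErrMaxSleepExceeded = errors.New("backoff: max sleep time exceeded")

// Backoffer is an illustrative type, not TiDB's. It sleeps a fixed interval
// per call and tracks the total against a budget.
type Backoffer struct {
	ctx        context.Context
	base       time.Duration // sleep per backoff
	maxSleep   time.Duration // total sleep budget
	totalSleep time.Duration
}

// Backoff sleeps once, or returns an error if the caller was cancelled
// (assumed analogue of a kill command) or the budget would be exceeded.
func (b *Backoffer) Backoff() error {
	if err := b.ctx.Err(); err != nil {
		return err
	}
	if b.totalSleep+b.base > b.maxSleep {
		return ErrMaxSleepExceeded
	}
	select {
	case <-b.ctx.Done():
		return b.ctx.Err()
	case <-time.After(b.base):
	}
	b.totalSleep += b.base
	return nil
}

// sendWithRetry retries send while it returns ErrStaleCommand. Every retry
// goes through Backoff, which is what gives the loop a way to terminate.
func sendWithRetry(b *Backoffer, send func() error) (attempts int, err error) {
	for {
		attempts++
		err = send()
		if !errors.Is(err, ErrStaleCommand) {
			return attempts, err
		}
		if berr := b.Backoff(); berr != nil {
			return attempts, berr
		}
	}
}

// demo runs the loop against a fake region that answers StaleCommand for the
// first staleReplies calls and succeeds afterwards.
func demo(staleReplies int) (int, error) {
	calls := 0
	send := func() error {
		calls++
		if calls <= staleReplies {
			return ErrStaleCommand
		}
		return nil
	}
	b := &Backoffer{
		ctx:      context.Background(),
		base:     time.Millisecond,
		maxSleep: 5 * time.Millisecond,
	}
	return sendWithRetry(b, send)
}

func main() {
	fmt.Println(demo(2))    // region recovers after two stale replies
	fmt.Println(demo(1000)) // region stays stale; the budget ends the loop
}
```

Without the `Backoff` call inside the loop, the second case would spin indefinitely, which matches the behavior described in the problem summary.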

Related changes

  • Need to cherry-pick to the release branch (3.x needs a parser change)

Check List

Tests

  • Unit test

Side effects

  • n/a

Release note

Fix infinite retry when a KV region keeps returning the StaleCommand error



@sre-bot
Contributor Author

sre-bot commented Apr 17, 2020

/run-all-tests

@sre-bot
Contributor Author

sre-bot commented Apr 19, 2020

@coocood, @tiancaiamao, @jackysp, @crazycs520, PTAL.

@lysu
Contributor

lysu commented Apr 20, 2020

need merge pingcap/parser#816 first

@sre-bot
Contributor Author

sre-bot commented Apr 22, 2020

@coocood, @tiancaiamao, @jackysp, @crazycs520, PTAL.

@jackysp
Member

jackysp commented Apr 22, 2020

need merge pingcap/parser#816 first

It is merged. Please update go.mod

@lysu lysu force-pushed the release-3.1-14a4a4e91624 branch from 0807fe4 to b2e3692 Compare April 22, 2020 11:10
Member

@zz-jason zz-jason left a comment

LGTM

@zz-jason zz-jason added the status/LGT1 Indicates that a PR has LGTM 1. label Apr 22, 2020
@lysu
Contributor

lysu commented Apr 23, 2020

@coocood @jackysp PTAL this one too, thx

Member

@coocood coocood left a comment

LGTM

@coocood
Member

coocood commented Apr 23, 2020

/merge

@sre-bot sre-bot added the status/can-merge Indicates a PR has been approved by a committer. label Apr 23, 2020
@sre-bot
Contributor Author

sre-bot commented Apr 23, 2020

/run-all-tests

@sre-bot
Contributor Author

sre-bot commented Apr 23, 2020

@sre-bot merge failed.

@zz-jason zz-jason added the status/LGT2 (Indicates that a PR has LGTM 2.) label and removed the status/LGT1 (Indicates that a PR has LGTM 1.) and status/PTAL labels Apr 23, 2020
@lysu
Contributor

lysu commented Apr 23, 2020

/merge

@sre-bot
Contributor Author

sre-bot commented Apr 23, 2020

Sorry @lysu, you don't have permission to trigger auto merge event on this branch.

@lysu
Contributor

lysu commented Apr 23, 2020

/run-all-tests

@zz-jason
Member

/run-common-test

@lysu
Contributor

lysu commented Apr 23, 2020

/run-unit-test

@coocood coocood merged commit 96c08cf into pingcap:release-3.1 Apr 23, 2020
Labels
  • component/tikv
  • status/can-merge: Indicates a PR has been approved by a committer.
  • status/LGT2: Indicates that a PR has LGTM 2.
  • type/bugfix: This PR fixes a bug.
  • type/3.1-cherry-pick