Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ddl : support ADMIN REPAIR TABLE to override bad tableInfo in meta & supply a REPAIR MODE for safely restart. #12046

Merged
merged 49 commits into from
Dec 3, 2019

Conversation

AilinKid
Copy link
Contributor

@AilinKid AilinKid commented Sep 5, 2019

What problem does this PR solve?

This need the parser #547 merged first.

In some corner cases or encountering serious bugs, DDL operations update table structure error occurred, causing TiDB panic and failing to restart the service.

Admin Repair Table t Create Table Statement.

create new tableInfo from create table statement to override bad tableInfo in meta.

What is changed and how it works?

There are 2 steps to fix this:
1 : support special mode to start TiDB
2 : support a way to fix the bad tableInfo meta in KV store

Check List

Tests

  • Unit test
  • Integration test

Code changes

  • Has exported function/method change
  • Has exported variable/fields change
  • Has interface methods change

Related changes

  • Need to update the documentation

Release note

  • support ADMIN REPAIR TABLE to override bad tableInfo in meta & supply a REPAIR MODE for safely restart..

@AilinKid AilinKid changed the title ddl : supprot admin repair table to ddl : support ADMIN REPAIR TABLE to override bad tableInfo in meta & supply a REPAIR MODE for safely restart. Sep 5, 2019
@AilinKid AilinKid changed the title ddl : support ADMIN REPAIR TABLE to override bad tableInfo in meta & supply a REPAIR MODE for safely restart. ddl : support ADMIN REPAIR TABLE to override bad tableInfo in meta & supply a REPAIR MODE for safely restart.[DNM] Sep 5, 2019
@kennytm
Copy link
Contributor

kennytm commented Oct 21, 2019

What would happen with partitioned tables?

@AilinKid
Copy link
Contributor Author

What would happen with partitioned tables?

By now, partitioned tables are not supported to repair in this version. I will focus on it!

@codecov
Copy link

codecov bot commented Oct 22, 2019

Codecov Report

❗ No coverage uploaded for pull request base (master@5a589c9). Click here to learn what that means.
The diff coverage is 74.6478%.

@@             Coverage Diff             @@
##             master     #12046   +/-   ##
===========================================
  Coverage          ?   80.1314%           
===========================================
  Files             ?        475           
  Lines             ?     117477           
  Branches          ?          0           
===========================================
  Hits              ?      94136           
  Misses            ?      15888           
  Partials          ?       7453

@AilinKid AilinKid changed the title ddl : support ADMIN REPAIR TABLE to override bad tableInfo in meta & supply a REPAIR MODE for safely restart.[DNM] ddl : support ADMIN REPAIR TABLE to override bad tableInfo in meta & supply a REPAIR MODE for safely restart. Oct 22, 2019
@AilinKid AilinKid requested review from kennytm, crazycs520, zimulala, foreyes and bb7133 and removed request for foreyes October 23, 2019 02:43
@zimulala
Copy link
Contributor

zimulala commented Oct 24, 2019

Related Design Doc : fix broken table

It's the internal docs and we'd better remove it.

domain/domain.go Outdated
// clean the tables and set repaired table.
repairedDB.Tables = []*model.TableInfo{}
repairedDB.Tables = append(repairedDB.Tables, tbl)
RepairDBInfoMap[di.ID] = repairedDB
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to check this map when executing create table. The scenario is as follows:
There is a DDL operation that creates the table dbA_tblB before the operation of repairing dbA_tblB is started.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sense, I will check this.

Copy link
Contributor

@djshow832 djshow832 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any test cases?

@@ -724,6 +762,36 @@ func (p *preprocessor) handleTableName(tn *ast.TableName) {
tn.DBInfo = dbInfo
}

func (p *preprocessor) handleRepairName(tn *ast.TableName) {
dbMap := domain.GetDomain(p.ctx).GetTablesInRepair()
// tn only has the Schema here.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tn is a variable, addressed.

Copy link
Contributor

@zimulala zimulala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to consider how to cancel this job?

@AilinKid
Copy link
Contributor Author

AilinKid commented Dec 2, 2019

/run-all-tests

repairString = repairString[1 : len(repairString)-1]
}
return strings.FieldsFunc(repairString, func(r rune) bool {
if r == ',' || r == ' ' || r == '"' {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about return if r == ',' || r == ' ' || r == '"'?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed.

@@ -104,7 +104,7 @@ func IsJobRollbackable(job *model.Job) bool {
model.ActionTruncateTable, model.ActionAddForeignKey,
model.ActionDropForeignKey, model.ActionRenameTable,
model.ActionModifyTableCharsetAndCollate, model.ActionTruncateTablePartition,
model.ActionModifySchemaCharsetAndCollate:
model.ActionModifySchemaCharsetAndCollate, model.ActionRepairTable:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a test if we support canceling the job.

Copy link
Contributor Author

@AilinKid AilinKid Dec 3, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed.

@AilinKid
Copy link
Contributor Author

AilinKid commented Dec 3, 2019

/run-all-tests

return ErrRepairTableFail.GenWithStackByArgs("Column " + newOne.Name.L + " has lost")
}
if newOne.Tp != old.Tp {
return ErrRepairTableFail.GenWithStackByArgs("Column " + newOne.Name.L + " type should be the same")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we add the type of column to the error message?
Do we need to check the Flen ?

Copy link
Contributor Author

@AilinKid AilinKid Dec 3, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't assume the too much element in the old (broken) version is reasonable,providing a certain degree of freedom for more robust repairs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed.

Copy link
Contributor

@zimulala zimulala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@AilinKid AilinKid requested a review from a team as a code owner December 3, 2019 05:40
@ghost ghost requested review from eurekaka and alivxxx and removed request for a team December 3, 2019 05:40
@eurekaka eurekaka removed their request for review December 3, 2019 05:41
@AilinKid AilinKid removed the request for review from alivxxx December 3, 2019 05:43
@AilinKid
Copy link
Contributor Author

AilinKid commented Dec 3, 2019

/run-all-tests

@AilinKid AilinKid added status/LGT3 The PR has already had 3 LGTM. and removed status/LGT2 Indicates that a PR has LGTM 2. labels Dec 3, 2019
@AilinKid AilinKid merged commit 97a4fae into pingcap:master Dec 3, 2019
XiaTianliang pushed a commit to XiaTianliang/tidb that referenced this pull request Dec 21, 2019
@you06 you06 added the sig/sql-infra SIG: SQL Infra label Mar 4, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sig/sql-infra SIG: SQL Infra status/LGT3 The PR has already had 3 LGTM.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants