executor: change the evaluation order of columns in Update and Insert statements #57123
Hi @joechenrh. Thanks for your PR. PRs from untrusted users cannot be marked as trusted with I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #57123 +/- ##
================================================
- Coverage 73.2085% 73.0050% -0.2036%
================================================
Files 1679 1698 +19
Lines 462531 507757 +45226
================================================
+ Hits 338612 370688 +32076
- Misses 103136 115414 +12278
- Partials 20783 21655 +872
/retest |
@joechenrh: Cannot trigger testing until a trusted user reviews the PR and leaves an In response to this:
@@ -430,8 +443,15 @@ func (e *InsertExec) initEvalBuffer4Dup() {
 }

 // doDupRowUpdate updates the duplicate row.
-func (e *InsertExec) doDupRowUpdate(ctx context.Context, handle kv.Handle, oldRow []types.Datum, newRow []types.Datum,
-	extraCols []types.Datum, cols []*expression.Assignment, idxInBatch int, dupKeyMode table.DupKeyCheckMode, autoColIdx int) error {
+func (e *InsertExec) doDupRowUpdate(
It only fixes the INSERT .. ON DUPLICATE UPDATE .. case. As I have tested, the UPDATE path also has the same bug:
CREATE TABLE cache (
cache_key varchar(512) NOT NULL,
updated_at datetime NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
expired_at datetime GENERATED ALWAYS AS (if(expires > 0, date_add(updated_at, interval expires second), date_add(updated_at, interval 99 year))) VIRTUAL,
expires int(11),
PRIMARY KEY (cache_key) /*T![clustered_index] CLUSTERED */,
KEY idx_c_on_expired_at (expired_at)
);
INSERT INTO cache(cache_key, expires) VALUES ('2001-01-01 11:11:11', 60) ON DUPLICATE KEY UPDATE expires = expires + 1;
update cache set expires = expires + 1 where cache_key = '2001-01-01 11:11:11';
Then the following two queries return different results:
select /*+ force_index(test.cache, idx_c_on_expired_at) */ cache_key, expired_at from cache order by cache_key;
select /*+ ignore_index(test.cache, idx_c_on_expired_at) */ cache_key, expired_at from cache order by cache_key;
pkg/executor/insert.go
Outdated
if err != nil {
	return err

if _, err := updateRecord(
Is it possible to move the logic for handling generated columns into updateRecord? It seems that updateRecord also handles the ON UPDATE columns (and since we have already specified all ON UPDATE columns in this function, that logic is meaningless).
Another possible solution is to remove the code related to ON UPDATE columns from the updateRecord function. However, as the same logic will be used multiple times (in INSERT .. ON DUPLICATE UPDATE and in a normal UPDATE statement), I prefer to put the code related to generated columns in updateRecord to avoid repeating it.
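The suggestion above, one shared routine that both statement types call, can be sketched as follows. This is a hedged illustration rather than the actual updateRecord: the names record, generatedCol, errHandler, applyUpdate, and cacheGenerated are all hypothetical, timestamps are reduced to integers, and the error-handling callback is only an assumption about how a caller-specific handler (the PR description mentions an errorHandler function) might be wired in.

```go
package main

import (
	"errors"
	"fmt"
)

// record is a hypothetical simplified row: column name -> integer value.
type record map[string]int

// generatedCol is a hypothetical generated-column definition.
type generatedCol struct {
	name string
	eval func(record) (int, error)
}

// errHandler lets each caller decide whether an evaluation error aborts
// the statement (non-nil return) or is tolerated (nil return).
type errHandler func(col string, err error) error

// applyUpdate is a hypothetical shared helper. It refreshes the ON UPDATE
// columns first and only then evaluates generated columns, so callers
// cannot get the order wrong.
func applyUpdate(r record, onUpdate map[string]int, gens []generatedCol, handle errHandler) error {
	// Step 1: refresh ON UPDATE columns.
	for name, v := range onUpdate {
		r[name] = v
	}
	// Step 2: evaluate generated columns against the refreshed row.
	for _, g := range gens {
		v, err := g.eval(r)
		if err != nil {
			if err = handle(g.name, err); err != nil {
				return err
			}
			continue
		}
		r[g.name] = v
	}
	return nil
}

// cacheGenerated returns a toy version of the expired_at column.
func cacheGenerated() []generatedCol {
	return []generatedCol{{
		name: "expired_at",
		eval: func(r record) (int, error) {
			if r["expires"] < 0 {
				return 0, errors.New("negative expires")
			}
			return r["updated_at"] + r["expires"], nil
		},
	}}
}

func main() {
	// One caller treats evaluation errors as fatal.
	strict := func(col string, err error) error { return fmt.Errorf("%s: %w", col, err) }

	// Another caller records them as warnings and continues.
	var warnings []string
	lenient := func(col string, err error) error {
		warnings = append(warnings, col+": "+err.Error())
		return nil
	}

	row := record{"updated_at": 100, "expires": 61, "expired_at": 160}
	err := applyUpdate(row, map[string]int{"updated_at": 700}, cacheGenerated(), strict)
	fmt.Println(row["expired_at"], err)

	bad := record{"updated_at": 100, "expires": -1}
	err = applyUpdate(bad, map[string]int{"updated_at": 700}, cacheGenerated(), lenient)
	fmt.Println(len(warnings), err)
}
```

Keeping both steps inside one function means the ordering is fixed in a single place, and the callback keeps statement-specific error policy out of the shared code.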
/ok-to-test |
/retest |
/hold |
/unhold |
/retest |
/retest-all |
/retest-required |
/cherry-pick release-7.5 |
Signed-off-by: ti-chi-bot <[email protected]>
@joechenrh: new pull request created to branch In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository. |
/cherry-pick release-7.1 |
Signed-off-by: ti-chi-bot <[email protected]>
@joechenrh: new pull request created to branch In response to this:
/cherry-pick release-8.5 |
@joechenrh: new pull request created to branch In response to this:
In response to a cherrypick label: new pull request could not be created: failed to create pull request against pingcap/tidb#release-6.5 from head ti-chi-bot:cherry-pick-57123-to-release-6.5: the GitHub API request returns a 403 error: {"message":"You have exceeded a secondary rate limit and have been temporarily blocked from content creation. Please retry your request again later. If you reach out to GitHub Support for help, please include the request ID B9C8:1BF88E:FA678C:1F2CF15:678F16AC and timestamp 2025-01-21 03:38:21 UTC.","documentation_url":"https://docs.github.com/rest/overview/rate-limits-for-the-rest-api#about-secondary-rate-limits","status":"403"} |
Signed-off-by: ti-chi-bot <[email protected]>
What problem does this PR solve?
Issue Number: ref #56829
Problem Summary:
In the previous logic, when we use UPDATE or INSERT ON DUPLICATE, the new row will be generated in the following order:
UPDATE and INSERT, they are evaluated in composeGeneratedColumns and doDupRowUpdate respectively.
However, auto-generated columns may rely on the on-update-now columns to generate their values. For example, in #56829 (comment), expired_at is generated based on the latest timestamp value from updated_at, so we will get a wrong expired_at value. Even worse, expired_at is part of the index idx_c_on_expired_at, so querying expired_at using an index scan and a full table scan will return different results, since in a full table scan expired_at is calculated in real time.
This also explains (see #56829 (comment)) why changing VIRTUAL to STORED will not yield such an error, although the value itself is incorrect.
What changed and how does it work?
To address this problem, this PR refactors the logic of INSERT ON DUPLICATE and UPDATE. More specifically:
updateRecord.
updateRecord.
errorHandler function for UPDATE and INSERT to handle error/warning in updateRecord.
Check List
Tests
Side effects
Documentation
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.