[MERGE-ATTN](function) fix str_to_date default return type scale for nereids #24932

Merged
merged 3 commits into from
Oct 20, 2023

zclllyybb
Contributor

@zclllyybb zclllyybb commented Sep 26, 2023

If this PR is merged into another branch, it must be merged together with #25707.

Proposed changes

Issue Number: close #xxx

When the return type is DatetimeV2, its scale is now set to the maximum, which is 6.
This PR also fixes the wrong result of format-string parsing when %f is followed by other trailing content.
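
As an illustration of the second point, here is a minimal Python sketch (not Doris's actual C++ parser) of reading a %f field so that whatever follows it in the input is left for the rest of the format string. The helper name `parse_fraction` is hypothetical, and the PR text does not show the faulty code, so this only sketches the general idea under those assumptions.

```python
def parse_fraction(text: str, pos: int) -> tuple[int, int]:
    """Read up to 6 digits starting at pos; return (microseconds, new_pos).

    Hypothetical helper for illustration only.
    """
    end = pos
    # Consume at most 6 digits so trailing input is not swallowed.
    while end < len(text) and end - pos < 6 and text[end].isdigit():
        end += 1
    digits = text[pos:end]
    if not digits:
        raise ValueError("expected fractional-second digits")
    # Right-pad so that "5" means 500000 microseconds, not 5.
    return int(digits.ljust(6, "0")), end


# The fraction starts at index 9, right after the '.'.
micros, next_pos = parse_fraction("12:30:45.5 trailing", 9)
```

After the call, `next_pos` points at the first character after the fraction, so the remaining format items can still be matched against the rest of the input.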

Further comments

If this is a relatively large or complex change, kick off the discussion at [email protected] by explaining why you chose the solution you did and what alternatives you considered, etc...

@zclllyybb
Contributor Author

run buildall

@github-actions
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 46.06 seconds
stream load tsv: 554 seconds loaded 74807831229 Bytes, about 128 MB/s
stream load json: 20 seconds loaded 2358488459 Bytes, about 112 MB/s
stream load orc: 64 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 29.0 seconds inserted 10000000 Rows, about 344K ops/s
storage size: 17162088908 Bytes

@zclllyybb zclllyybb changed the title [fix](function) fix str_to_date default return type scale [fix](function) fix str_to_date default return type scale for nereids Oct 7, 2023
@zclllyybb
Contributor Author

run buildall

@github-actions
Contributor

github-actions bot commented Oct 7, 2023

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 36.29% (8149/22458)
Line Coverage: 28.42% (65163/229322)
Region Coverage: 27.34% (33752/123470)
Branch Coverage: 24.01% (17205/71646)
Coverage Report: http://coverage.selectdb-in.cc/coverage/55ddb1f626fa3bd09d94ecb0cdf0ea0ac51107f0_55ddb1f626fa3bd09d94ecb0cdf0ea0ac51107f0/report/index.html

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 46.03 seconds
stream load tsv: 565 seconds loaded 74807831229 Bytes, about 126 MB/s
stream load json: 21 seconds loaded 2358488459 Bytes, about 107 MB/s
stream load orc: 64 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 29.0 seconds inserted 10000000 Rows, about 344K ops/s
storage size: 17162525377 Bytes

@zclllyybb
Contributor Author

run buildall

@github-actions
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 46.74 seconds
stream load tsv: 559 seconds loaded 74807831229 Bytes, about 127 MB/s
stream load json: 21 seconds loaded 2358488459 Bytes, about 107 MB/s
stream load orc: 68 seconds loaded 1101869774 Bytes, about 15 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 35.0 seconds inserted 10000000 Rows, about 285K ops/s
storage size: 17162180271 Bytes

@zclllyybb
Contributor Author

run buildall

@github-actions
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 36.24% (8154/22500)
Line Coverage: 28.37% (65232/229973)
Region Coverage: 27.29% (33815/123909)
Branch Coverage: 23.99% (17228/71808)
Coverage Report: http://coverage.selectdb-in.cc/coverage/11f6b7e2c4955c4725326dac9d0fde710feca961_11f6b7e2c4955c4725326dac9d0fde710feca961/report/index.html

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 45.86 seconds
stream load tsv: 558 seconds loaded 74807831229 Bytes, about 127 MB/s
stream load json: 20 seconds loaded 2358488459 Bytes, about 112 MB/s
stream load orc: 64 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 28.8 seconds inserted 10000000 Rows, about 347K ops/s
storage size: 17162128675 Bytes

@zclllyybb
Contributor Author

run buildall

@github-actions
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 36.29% (8145/22446)
Line Coverage: 28.42% (65262/229601)
Region Coverage: 27.10% (33798/124737)
Branch Coverage: 23.89% (17250/72210)
Coverage Report: http://coverage.selectdb-in.cc/coverage/115e784c93ef12ea9c870a71c3a471ec8d34ead9_115e784c93ef12ea9c870a71c3a471ec8d34ead9/report/index.html

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 47.2 seconds
stream load tsv: 575 seconds loaded 74807831229 Bytes, about 124 MB/s
stream load json: 20 seconds loaded 2358488459 Bytes, about 112 MB/s
stream load orc: 64 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 33 seconds loaded 861443392 Bytes, about 24 MB/s
insert into select: 29.0 seconds inserted 10000000 Rows, about 344K ops/s
storage size: 17162212912 Bytes

@zclllyybb
Contributor Author

run buildall

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 36.29% (8145/22446)
Line Coverage: 28.42% (65261/229601)
Region Coverage: 27.10% (33806/124737)
Branch Coverage: 23.89% (17254/72210)
Coverage Report: http://coverage.selectdb-in.cc/coverage/115e784c93ef12ea9c870a71c3a471ec8d34ead9_115e784c93ef12ea9c870a71c3a471ec8d34ead9/report/index.html

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 46.77 seconds
stream load tsv: 575 seconds loaded 74807831229 Bytes, about 124 MB/s
stream load json: 21 seconds loaded 2358488459 Bytes, about 107 MB/s
stream load orc: 64 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 29.0 seconds inserted 10000000 Rows, about 344K ops/s
storage size: 17162243206 Bytes

@zclllyybb
Contributor Author

run buildall

Contributor

@github-actions github-actions bot left a comment

clang-tidy made some suggestions

@zclllyybb
Contributor Author

run buildall

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 36.69% (8228/22423)
Line Coverage: 28.85% (65951/228623)
Region Coverage: 27.55% (34204/124164)
Branch Coverage: 24.26% (17415/71784)
Coverage Report: http://coverage.selectdb-in.cc/coverage/ec430d79c7537af8d3a6523ea625a3ee039ab47a_ec430d79c7537af8d3a6523ea625a3ee039ab47a/report/index.html

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 45.12 seconds
stream load tsv: 570 seconds loaded 74807831229 Bytes, about 125 MB/s
stream load json: 20 seconds loaded 2358488459 Bytes, about 112 MB/s
stream load orc: 65 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 28.9 seconds inserted 10000000 Rows, about 346K ops/s
storage size: 17162518118 Bytes

Contributor

@HappenLee HappenLee left a comment

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Oct 16, 2023
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

PR approved by anyone and no changes requested.

@BiteTheDDDDt BiteTheDDDDt merged commit dc47087 into apache:master Oct 20, 2023
@zclllyybb zclllyybb deleted the str_to_date branch October 20, 2023 04:56
@zhangstar333 zhangstar333 added not-merge/2.0 do not merge into 2.0 branch dev/2.0.3 and removed dev/2.0.3 not-merge/2.0 do not merge into 2.0 branch labels Oct 20, 2023
@zclllyybb zclllyybb changed the title [fix](function) fix str_to_date default return type scale for nereids [MERGE-ATTN](function) fix str_to_date default return type scale for nereids Oct 20, 2023
xiaokang pushed a commit that referenced this pull request Oct 22, 2023
dutyu pushed a commit to dutyu/doris that referenced this pull request Oct 28, 2023
XuJianxu pushed a commit to XuJianxu/doris that referenced this pull request Dec 14, 2023
Yulei-Yang added a commit that referenced this pull request Jan 20, 2025
…d issue (#47129)

### What problem does this PR solve?

Issue Number: close #47105

Related PR: #24932

Problem Summary:

### Release note

str_to_date always returns the microsecond part for datetime even if the user
does not specify %f in the date format string. This is wrong.
mysql> select id,str_to_date(dt, '%Y-%m-%d %H:%i:%s') from test1 limit 1;
+------+--------------------------------------+
| id   | str_to_date(dt, '%Y-%m-%d %H:%i:%s') |
+------+--------------------------------------+
|    2 | 2024-12-28 10:11:12.000000           |
+------+--------------------------------------+

The constant-folding scenario is wrong too:
mysql> select cast(str_to_date('2025-01-17 11:59:30', '%Y-%m-%d %H:%i:%s') as string);
+-----------------------------------------------------------------------+
| cast(str_to_date('2025-01-17 11:59:30', '%Y-%m-%d %H:%i:%s') as TEXT) |
+-----------------------------------------------------------------------+
| 2025-01-17 11:59:30.000000                                            |
+-----------------------------------------------------------------------+

### Check List (For Author)

- Test <!-- At least one of them must be included. -->
    - [x] Regression test
    - [ ] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [ ] No need to test or manual test. Explain why:
        - [ ] This is a refactor/code format and no logic has been changed.
        - [ ] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason <!-- Add your reason?  -->

- Behavior changed:
    - [x] No.
    - [ ] Yes. <!-- Explain the behavior change -->

- Does this need documentation?
    - [x] No.
    - [ ] Yes. <!-- Add document PR link here. eg: apache/doris-website#1214 -->

### Check List (For Reviewer who merge this PR)

- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label <!-- Add branch pick label that this PR should merge into -->
github-actions bot pushed a commit that referenced this pull request Jan 21, 2025
…d issue (#47129)

github-actions bot pushed a commit that referenced this pull request Jan 24, 2025
…d issue (#47129)

lzyy2024 pushed a commit to lzyy2024/doris that referenced this pull request Feb 21, 2025
…d issue (apache#47129)

Labels
approved Indicates a PR has been approved by one committer. dev/2.0.3-merged reviewed
6 participants