Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bug: Datafusion doesn't respect case sensitive table references #8964

Merged
merged 6 commits into from
Jan 26, 2024
Merged

bug: Datafusion doesn't respect case sensitive table references #8964

merged 6 commits into from
Jan 26, 2024

Conversation

xhwhis
Copy link
Contributor

@xhwhis xhwhis commented Jan 23, 2024

Which issue does this PR close?

Closes #.

Rationale for this change

Table alias will process uppercase alias into lowercase. This PR will fix it.

What changes are included in this PR?

TableReference::parse_str method converts uppercase to lowercase. The alias here has already undergone normalize, so we can directly create a TableReference as a parameter.

Are these changes tested?

Yes

Are there any user-facing changes?

No

@github-actions github-actions bot added the sql SQL Planner label Jan 23, 2024
Copy link
Contributor

@comphead comphead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @xhwhis for your contribution
Would you mind adding more details on reason why this change is need, and would be great to cover the change with unit tests

@xhwhis xhwhis closed this Jan 24, 2024
@xhwhis xhwhis reopened this Jan 24, 2024
@xhwhis
Copy link
Contributor Author

xhwhis commented Jan 24, 2024

Thanks @xhwhis for your contribution Would you mind adding more details on reason why this change is need, and would be great to cover the change with unit tests

You can use datafusion-cli to observe the uppercase bug in table alias.

DataFusion CLI v34.0.0
❯ CREATE TABLE IF NOT EXISTS valuetable(c1 INT, c2 VARCHAR) AS VALUES(1,'HELLO'),(12,'DATAFUSION');
0 rows in set. Query took 0.066 seconds.

❯ SELECT "TEST".c1 from valuetable as "TEST";
Schema error: No field named test.c1. Valid fields are test.c1, test.c2.

I believe this PR have a minimal impact. I have added unit tests to cover the changes. Please review. @comphead

Copy link
Contributor

@comphead comphead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm able to reproduce with

❯ select "T".* from (select 1 a) as "T";
Error during planning: Invalid qualifier T
❯ select t.* from (select 1 a) as "T";
+---+
| a |
+---+
| 1 |
+---+
1 row in set. Query took 0.005 seconds.

Please include this test into the test case, and would be nice to move tests into select.slt sqllogic test file

@github-actions github-actions bot added the sqllogictest SQL Logic Tests (.slt) label Jan 24, 2024
@xhwhis
Copy link
Contributor Author

xhwhis commented Jan 24, 2024

Please include this test into the test case, and would be nice to move tests into select.slt sqllogic test file

I have added sqllogictests as per your suggestion.

@@ -4278,6 +4294,23 @@ fn test_table_alias() {
\n TableScan: person";
quick_test(sql, expected);

let sql = "select * from (\
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we getting closer, please move those tests to select.rs you may want to use EXPLAIN keyword to get the plan

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@comphead comphead changed the title fix: table alias will process uppercase alias into lowercase bug: Datafusion doesn't respect case sensitive table references Jan 25, 2024
Copy link
Contributor

@comphead comphead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @xhwhis we are really close

@@ -19,3 +19,20 @@ query II
select * from (WITH source AS (select 1 as e) SELECT * FROM source) t1, (WITH source AS (select 1 as e) SELECT * FROM source) t2
----
1 1

query I
WITH "T" AS (SELECT 1 a) SELECT * FROM "T"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
WITH "T" AS (SELECT 1 a) SELECT * FROM "T"
WITH "T" AS (SELECT 1 a) SELECT "T".* FROM "T"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Contributor

@comphead comphead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm thanks @xhwhis
since it is first time contribution I'm leaving it for sometime to let other people review as well

@@ -19,3 +19,20 @@ query II
select * from (WITH source AS (select 1 as e) SELECT * FROM source) t1, (WITH source AS (select 1 as e) SELECT * FROM source) t2
----
1 1

query I
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a test description

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
query I
# Ensure table aliases can be case sensitive
query I

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

----
1

query TT
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

@@ -19,3 +19,20 @@ query II
select * from (WITH source AS (select 1 as e) SELECT * FROM source) t1, (WITH source AS (select 1 as e) SELECT * FROM source) t2
----
1 1

query I
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
query I
# Ensure table aliases can be case sensitive
query I

----
1

query TT
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
query TT
# Ensure table aliases can be case sensitive
query TT

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @xhwhis -- this looks great. I plan to merge it after CI passes

@alamb alamb merged commit 35c7b2c into apache:main Jan 26, 2024
22 checks passed
@alamb
Copy link
Contributor

alamb commented Jan 26, 2024

Thanks again @xhwhis

@xhwhis xhwhis deleted the table-alias branch January 26, 2024 11:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sql SQL Planner sqllogictest SQL Logic Tests (.slt)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants