-
Notifications
You must be signed in to change notification settings - Fork 583
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Should DEDUPLICATE
, FINAL
and ID
be keywords?
#1382
Comments
DEDUPLICATE
, FINAL
and ID
be keywordsDEDUPLICATE
, FINAL
and ID
be keywords?
apache/datafusion#12014 shows the difference and it is related to creating SQL (thank you @samuelcolvin ) Previously when converting a sql statement to string, it would result in SELECT c.id FROM (SELECT j1.j1_id FROM j1) AS c (id) However, now the id is quoted for some reason: SELECT c.\"id\" FROM (SELECT j1.j1_id FROM j1) AS c (id) Full failure is https://github.com/apache/datafusion/actions/runs/10402816913/job/28808419851?pr=12014 thread 'cases::plan_to_sql::roundtrip_statement_with_dialect' panicked at datafusion/sql/tests/cases/plan_to_sql.rs:435:9:
assertion `left == right` failed
left: "SELECT c.id FROM (SELECT j1.j1_id FROM j1) AS c (id)"
right: "SELECT c.\"id\" FROM (SELECT j1.j1_id FROM j1) AS c (id)"
stack backtrace:
0: rust_begin_unwind
at /rustc/3f5fd8dd41153bc5fdca9427e9e05be2c767ba23/library/std/src/panicking.rs:652:5
1: core::panicking::panic_fmt
at /rustc/3f5fd8dd41153bc5fdca9427e9e05be2c767ba23/library/core/src/panicking.rs:72:14
2: core::panicking::assert_failed_inner
at /rustc/3f5fd8dd41153bc5fdca9427e9e05be2c767ba23/library/core/src/panicking.rs:408:17
3: core::panicking::assert_failed
at /rustc/3f5fd8dd41153bc5fdca9427e9e05be2c767ba23/library/core/src/panicking.rs:363:5
4: roundtrip_statement_with_dialect
at ./tests/cases/plan_to_sql.rs:435:9
5: sql_integration::cases::plan_to_sql::roundtrip_statement_with_dialect::{{closure}}
at ./tests/cases/plan_to_sql.rs:220:42
6: core::ops::function::FnOnce::call_once
at /rustc/3f5fd8dd41153bc5fdca9427e9e05be2c767ba23/library/core/src/ops/function.rs:250:5
7: core::ops::function::FnOnce::call_once
at /rustc/3f5fd8dd41153bc5fdca9427e9e05be2c767ba23/library/core/src/ops/function.rs:250:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace. |
I have two questions:
I can try and debug this later today. Thank you @samuelcolvin for preparing the upgrade |
It seems that
Then again, the same works fine in clickhouse:
So I'm not really sure what value adding @git-hulk can you explain what those identifiers help with Clickhouse syntax? |
I think ID needs to be added as an keyword in clickhouse to parse certin syntactic constructs and adding new keywords is consistent with adding other new features Maybe we can just special case the conversion of ast --> SQL string in DataFusion to avoid adding the quote on |
👍 on it. |
IMO the tokenizer should consider the dialect when determining whether some identifier is a keyword (though that's a pretty major change so it's probably unrealistic to do it now). |
@jmhain I agree on both points. |
Ok, so I would say the conclusion to the question posed by this ticket "Should DEDUPLICATE, FINAL and ID be keywords?" is : YES However, maybe in the future we will contemplate being more sophisticated / targeted and make keywords dialect specific. |
Thank you everyone for the good discussion |
@samuelcolvin Those keywords were introduced in PR #1359. For example, the keyword
Agree with what @alamb mentioned, keywords may be different from dialect to dialect, and it's hard to avoid similar issues without making keywords dialect-specific. [1] https://clickhouse.com/docs/en/sql-reference/statements/select/from#final-modifier |
#1359 added
DEDUPLICATE
,FINAL
andID
as keywords, apparently with no discussion of that change.This is breaking tests on datafusion, see apache/datafusion#12014 (easy to fix), but more to the point seems like a change of behaviour that might be unwanted for many users.
In particular,
id
is a very column variable or column name, and starting to mark it as a keyword for all dialects seems mistaken.What should we do?
The text was updated successfully, but these errors were encountered: