-
Notifications
You must be signed in to change notification settings - Fork 583
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for DataFusion specific statements #1080
Comments
I plan to take this on, probably starting first with |
I think this would be the best outcome -- if possible. I am not quite sure if we should be adding DataFusion specific things into this parser 🤔 Making it easier to extend this parser with custom hooks however sounds like a great idea to me |
Would that suggest that apache/datafusion#4808 would instead be focused on reducing the code of |
I agree reducing the code is a better goal -- I also left some additional comments here: apache/datafusion#4808 (comment) |
Context
arrow-datafusion currently implements its own parser, DFParser which wraps the parser in this crate in order to parse some DataFusion specific statements.
DataFusion issue: apache/datafusion#4808
Aiming to upstream this functionality into this crate to remove the parsing code from DataFusion.
Statements
Currently two custom statements that DataFusion parses:
COPY TO ...
andCREATE EXTERNAL TABLE ...
EXPLAIN
too, but this is only there to support doingEXPLAIN
for those new extensionsCOPY TO
Syntax:
Examples:
This is extremely similar to the
COPY
statement from PostgreSQL that sqlparser-rs already supports, with a few key differences:WITH
keywordCOPY TO
and notCOPY FROM
Points 2, 3 & 4 are non-issues since can inspect the Statement AST to check if want to support this:
https://github.com/sqlparser-rs/sqlparser-rs/blob/a430d1a5a7bb04bbefd0f2fea07bf25c7fbce8b2/src/ast/mod.rs#L1463-L1479
Point 1 is a minor issue as the Statement AST above doesn't specify if there was a
WITH
keyword found when parsing, but at the same time DataFusion could just accept this new syntax since this optional keyword has minimal impact.Point 5 is the major issue, as would either need to modify the fields of the existing
Copy
enum or add a new one specific for DataFusion, since it is a requirement that the keys cannot be constrained (would be parsed asString
).CREATE EXTERNAL TABLE
Syntax:
Example:
This seems to vary heavily from the existing support for
CreateTable
:https://github.com/sqlparser-rs/sqlparser-rs/blob/a430d1a5a7bb04bbefd0f2fea07bf25c7fbce8b2/src/ast/mod.rs#L1561-L1604
WITH HEADER ROW
andWITH ORDER
So might need a new statement for this? Or could try to retrofit onto the existing
CreateTable
statement.Dialect
Also will need a new DataFusion dialect to support the above customization (e.g. to be able to toggle between previous/default behaviour for
COPY TO
to parse options as predefined keys, or as generic).Alternative
Instead of adding/modifying as described, could investigate ways to make it easier for downstream consumers to parse their own custom statements.
I see there is this dialect function:
https://github.com/sqlparser-rs/sqlparser-rs/blob/a430d1a5a7bb04bbefd0f2fea07bf25c7fbce8b2/src/dialect/mod.rs#L171-L175
But this doesn't allow for custom statements.
I'm unsure what this could look like, but worth a thought.
The text was updated successfully, but these errors were encountered: