-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement extensible configuration mechanism #2754
Conversation
Codecov Report
@@ Coverage Diff @@
## master #2754 +/- ##
==========================================
- Coverage 84.95% 84.93% -0.02%
==========================================
Files 271 272 +1
Lines 48053 48119 +66
==========================================
+ Hits 40825 40872 +47
- Misses 7228 7247 +19
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is a good start 👍
datafusion/core/src/config.rs
Outdated
//! DataFusion Configuration Options | ||
|
||
use datafusion_common::ScalarValue; | ||
use sqlparser::ast::DataType; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think using arrow::data_types::DataType
for the type would be easier to understand (as that is the DataType used by ScalarValue
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops, this was an auto import and I hadn't noticed
datafusion/core/src/config.rs
Outdated
name: &str, | ||
description: &str, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
name: &str, | |
description: &str, | |
name: impl Into<String>, | |
description: impl Into<String>, |
And then you can change name.to_string()
to name.into()
and description.to_string()
into description.into()
It is a minor thing, but that way the caller can pass in a String
if they have it and avoid the extra copy 🤷
use sqlparser::ast::DataType; | ||
use std::collections::HashMap; | ||
|
||
/// Configuration option "datafusion.optimizer.filterNullJoinKeys" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
using .
for namespacing the configuration makes sense to me
@@ -0,0 +1,187 @@ | |||
// Licensed to the Apache Software Foundation (ASF) under one |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this module is in datafusion-core
it will not be able to be used in other modules directly. That seems ok for now 🤔 but I haven't fully thought about it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Is the next step to have SessionConfig
refactored with ConfigDefinition
?
Yes. I filed #2756 for converting the existing configs to this new mechanism and after that I plan to do some refactoring so that |
Which issue does this PR close?
Closes #138
Rationale for this change
We are getting to the point where there are multiple settings we could add to operators to fine-tune performance. Custom operators provided by crates that extend DataFusion may also need this capability.
I propose that we add support for key-value configuration options so that we don't need to plumb through each new configuration setting that we add.
This also makes it easier for users to change configuration options without code changes by accepting command-line arguments or reading configuration files.
Another advantage of this approach is that we can now automatically generate markdown documentation to include in the user guide.
I filed #2756 as a follow-on to migrate the existing configuration settings to this new approach.
What changes are included in this PR?
Are there any user-facing changes?
No