-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support UDAF to align Builtin aggregate function #10493
Conversation
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks quite cool @jayzhan211 and the code is 👌 very nice. Thank you
One potential concern I have here is that I think this PR adds new feaures (e.g. support for filter/order by in user defined aggregates, I think) but not new tests.
However, given that the point of this exercise is to port built in functions to UDAF I think all the new features will be properly tested once that porting is complete
@@ -229,12 +229,15 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> { | |||
)?; | |||
let order_by = (!order_by.is_empty()).then_some(order_by); | |||
let args = self.function_args_to_expr(args, schema, planner_context)?; | |||
// TODO: Support filter and distinct for UDAFs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice
aggregate_function::AggregateFunction::Count, | ||
) if args.len() == 1 && is_wildcard(&args[0]) => true, | ||
WindowFunctionDefinition::AggregateUDF(ref udaf) | ||
if udaf.name() == "COUNT" && args.len() == 1 && is_wildcard(&args[0]) => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we should document the COUNT
somewhere. Also should it do a case insensitive comparison?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name should follow what is defined in Count
struct. case sensitive is not needed, it is strictly comparing with fn name()
.
&stats.num_rows, | ||
agg_expr.as_any().downcast_ref::<expressions::Count>(), | ||
) { | ||
// TODO implementing Eq on PhysicalExpr would help a lot here |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://docs.rs/datafusion/latest/datafusion/physical_expr/trait.PhysicalExpr.html# seems to imply it is possible to to compare PhysicalExprs (perhaps expr.eq(other_expr.as_any())
🤔 )
I don't know PhysicalExpr
doesn't implement PartialEq
directly
pub trait PhysicalExpr: ... PartialEq<dyn Any> {`
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had not take a look carefully to the comment here, since they are not needed after removing builtin function
Don't worry, the test should be covered after porting the related test, if not I will add it. |
Thanks @alamb |
* align udaf and builtin Signed-off-by: jayzhan211 <[email protected]> * add more Signed-off-by: jayzhan211 <[email protected]> --------- Signed-off-by: jayzhan211 <[email protected]>
Which issue does this PR close?
Closes #.
Rationale for this change
Most of the function expect the same args or the same behaviour as builtin aggregate function for UDAF.
This PR change most of them I need for #10484
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?