
refactor: reduce allocations in push down filter #10567

Merged

Conversation

erratic-pattern (Contributor)

Not sure which issue to link here, but here are some changes to reduce allocations in PushDownFilter, and also clean the code up a bit.

@github-actions github-actions bot added the optimizer Optimizer rules label May 17, 2024
@erratic-pattern erratic-pattern force-pushed the adam/reduce-alloc-push-down-filter branch from d9a4d7f to 17b99b9 Compare May 17, 2024 22:22
@comphead comphead (Contributor) left a comment

thanks @erratic-pattern
please help me understand how the approach in this PR differs from what we have now. It replaces .clone() on Arcs with Arc::clone, which imho is the same thing: https://doc.rust-lang.org/std/sync/struct.Arc.html#cloning-references ?

@alamb alamb (Contributor) left a comment

> please help me understand how the approach in this PR differs from what we have now. It replaces .clone() on Arcs with Arc::clone, which imho is the same thing: https://doc.rust-lang.org/std/sync/struct.Arc.html#cloning-references ?

I agree with @comphead that calling Arc::clone is the same thing, though it is more explicit that only an Arc is being copied, so I think it is an improvement to the code.

I think I found a few other Expr::clone calls that this PR avoids, so from that perspective I think it is an improvement.
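As an aside, the equivalence the reviewers describe is easy to check in isolation. The sketch below (a standalone illustration, not DataFusion code) shows that method-syntax `.clone()` on an Arc and `Arc::clone(&…)` both just bump the reference count and never copy the pointed-to data:

```rust
use std::sync::Arc;

fn main() {
    let a = Arc::new(vec![1, 2, 3]);

    // Method syntax and associated-function syntax do the same thing:
    // both increment the strong count; neither copies the Vec.
    let b = a.clone();
    let c = Arc::clone(&a);

    assert_eq!(Arc::strong_count(&a), 3);
    // All three handles point at the same allocation.
    assert!(Arc::ptr_eq(&a, &b) && Arc::ptr_eq(&b, &c));
}
```

The only difference is readability: `Arc::clone` makes it obvious at the call site that a cheap handle copy is happening, which is why the Rust documentation recommends it.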

Thank you @comphead and @erratic-pattern

```diff
@@ -802,11 +802,11 @@ impl OptimizerRule for PushDownFilter {
         replace_map.insert(expr.display_name()?, expr.clone());
     }
     let replaced_push_predicates = push_predicates
-        .iter()
-        .map(|expr| replace_cols_by_name(expr.clone(), &replace_map))
+        .into_iter()
```

this looks like one example of less cloning
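The pattern in that hunk is worth spelling out: iterating a `Vec` by reference forces a clone of each element before it can be passed to a by-value function, while consuming the `Vec` with `into_iter()` moves the elements instead. A minimal sketch, using a hypothetical `Expr` type and a `rewrite` function standing in for `replace_cols_by_name`:

```rust
#[derive(Clone, Debug, PartialEq)]
struct Expr(String);

// Stand-in for a function that takes its argument by value.
fn rewrite(expr: Expr) -> Expr {
    expr
}

fn main() {
    let push_predicates = vec![Expr("a = 1".into()), Expr("b = 2".into())];

    // Old shape: iterate by reference, clone every element so the
    // by-value function can take ownership.
    let cloned: Vec<Expr> = push_predicates
        .iter()
        .map(|e| rewrite(e.clone()))
        .collect();

    // New shape: consume the Vec; each element is moved, not cloned.
    let moved: Vec<Expr> = push_predicates
        .into_iter()
        .map(rewrite)
        .collect();

    assert_eq!(cloned, moved);
}
```

When the collection is no longer needed afterwards, the `into_iter()` form saves one deep clone per element.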

@@ -829,12 +829,12 @@ impl OptimizerRule for PushDownFilter {
}
LogicalPlan::Join(join) => push_down_join(join, Some(&filter.predicate)),
LogicalPlan::CrossJoin(cross_join) => {
let predicates = split_conjunction_owned(filter.predicate.clone());
let predicates = split_conjunction_owned(filter.predicate);

likewise here we avoid an additional clone
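Since `split_conjunction_owned` takes its argument by value, the predicate can simply be moved in once the plan node has been deconstructed; the old `.clone()` duplicated the whole expression tree first. A toy version of the idea (the `Expr` enum and recursion here are illustrative, not DataFusion's actual definitions):

```rust
#[derive(Debug, PartialEq)]
enum Expr {
    And(Box<Expr>, Box<Expr>),
    Col(&'static str),
}

// Toy split_conjunction_owned: consumes the expression, so callers
// can move a predicate in instead of cloning the tree first.
fn split_conjunction_owned(expr: Expr) -> Vec<Expr> {
    match expr {
        Expr::And(left, right) => {
            let mut out = split_conjunction_owned(*left);
            out.extend(split_conjunction_owned(*right));
            out
        }
        leaf => vec![leaf],
    }
}

fn main() {
    let predicate = Expr::And(
        Box::new(Expr::Col("a")),
        Box::new(Expr::Col("b")),
    );
    // Was: split_conjunction_owned(predicate.clone())
    // Now the predicate is moved, avoiding a deep clone of the tree.
    let parts = split_conjunction_owned(predicate);
    assert_eq!(parts, vec![Expr::Col("a"), Expr::Col("b")]);
}
```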

```diff
@@ -861,16 +861,12 @@ impl OptimizerRule for PushDownFilter {
         .collect();
     let new_predicate: Vec<Expr> = zip
         .filter(|(_, res)| res != &TableProviderFilterPushDown::Exact)
-        .map(|(pred, _)| (*pred).clone())
+        .map(|(pred, _)| pred.clone())
         .collect();

     let new_scan = LogicalPlan::TableScan(TableScan {
```

👍 this also looks like it might save some copying

@alamb alamb merged commit 4b2652f into apache:main May 21, 2024
23 checks passed

alamb commented May 21, 2024

Thanks again @erratic-pattern

erratic-pattern (Contributor, Author)

Yes, the Arc::clone change is not a performance improvement; it was just a way for me to keep track of which clones were "zero copy" while working on this. It is also a recommended convention, so I think it's worth keeping.

findepi pushed a commit to findepi/datafusion that referenced this pull request Jul 16, 2024