Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove some Expr clones in EliminateCrossJoin(3%-5% faster planning) #10430

Merged
merged 4 commits into from
May 11, 2024

Conversation

alamb
Copy link
Contributor

@alamb alamb commented May 8, 2024

Draft as it builds on #10427

Which issue does this PR close?

Part of #10287

Rationale for this change

I am trying to avoid copying as much in EliminateCrossJoin to speed up planning.

One thing this rule does is quite a few Expr copies while checking exprs for join keys

What changes are included in this PR?

  1. Move management of join key exprs to JoinExprSet struct
  2. Reduce copying of clones internally to JoinExprSet
  3. New tests for JoinExprSet

Are these changes tested?

Existing CI

Are there any user-facing changes?

No functional changes

Performance tests show 3-5% faster for TPCH / TPDS queries

Details

++ critcmp main less_clone_in_eliminate_cross_join
group                                         less_clone_in_eliminate_cross_join     main
-----                                         ----------------------------------     ----
logical_aggregate_with_join                   1.00  1184.8±12.12µs        ? ?/sec    1.02  1207.4±14.82µs        ? ?/sec
logical_plan_tpcds_all                        1.00    156.8±1.94ms        ? ?/sec    1.01    158.1±2.46ms        ? ?/sec
logical_plan_tpch_all                         1.00     16.8±0.24ms        ? ?/sec    1.01     16.9±0.15ms        ? ?/sec
logical_select_all_from_1000                  1.00     17.9±0.11ms        ? ?/sec    1.04     18.7±0.13ms        ? ?/sec
logical_select_one_from_700                   1.00   815.1±20.51µs        ? ?/sec    1.00   813.6±10.21µs        ? ?/sec
logical_trivial_join_high_numbered_columns    1.00   752.3±14.99µs        ? ?/sec    1.02   764.2±14.21µs        ? ?/sec
logical_trivial_join_low_numbered_columns     1.00   736.0±10.70µs        ? ?/sec    1.01    745.7±7.27µs        ? ?/sec
physical_plan_tpcds_all                       1.00   1308.6±9.71ms        ? ?/sec    1.03   1349.1±7.72ms        ? ?/sec
physical_plan_tpch_all                        1.00     88.6±1.59ms        ? ?/sec    1.05     93.0±1.50ms        ? ?/sec
physical_plan_tpch_q1                         1.00      4.7±0.07ms        ? ?/sec    1.08      5.1±0.07ms        ? ?/sec
physical_plan_tpch_q10                        1.00      4.3±0.08ms        ? ?/sec    1.03      4.4±0.09ms        ? ?/sec
physical_plan_tpch_q11                        1.00      3.8±0.08ms        ? ?/sec    1.05      3.9±0.07ms        ? ?/sec
physical_plan_tpch_q12                        1.00      2.9±0.05ms        ? ?/sec    1.11      3.2±0.05ms        ? ?/sec
physical_plan_tpch_q13                        1.00      2.1±0.04ms        ? ?/sec    1.02      2.1±0.03ms        ? ?/sec
physical_plan_tpch_q14                        1.00      2.5±0.06ms        ? ?/sec    1.10      2.8±0.05ms        ? ?/sec
physical_plan_tpch_q16                        1.00      3.6±0.06ms        ? ?/sec    1.07      3.8±0.07ms        ? ?/sec
physical_plan_tpch_q17                        1.00      3.5±0.12ms        ? ?/sec    1.05      3.7±0.08ms        ? ?/sec
physical_plan_tpch_q18                        1.00      3.9±0.07ms        ? ?/sec    1.04      4.1±0.05ms        ? ?/sec
physical_plan_tpch_q19                        1.00      5.8±0.07ms        ? ?/sec    1.08      6.3±0.08ms        ? ?/sec
physical_plan_tpch_q2                         1.00      7.6±0.08ms        ? ?/sec    1.03      7.8±0.11ms        ? ?/sec
physical_plan_tpch_q20                        1.00      4.5±0.10ms        ? ?/sec    1.03      4.6±0.07ms        ? ?/sec
physical_plan_tpch_q21                        1.00      6.1±0.07ms        ? ?/sec    1.02      6.2±0.08ms        ? ?/sec
physical_plan_tpch_q22                        1.00      3.3±0.06ms        ? ?/sec    1.05      3.4±0.07ms        ? ?/sec
physical_plan_tpch_q3                         1.00      3.1±0.06ms        ? ?/sec    1.02      3.1±0.06ms        ? ?/sec
physical_plan_tpch_q4                         1.00      2.2±0.04ms        ? ?/sec    1.04      2.3±0.05ms        ? ?/sec
physical_plan_tpch_q5                         1.00      4.4±0.07ms        ? ?/sec    1.03      4.5±0.09ms        ? ?/sec
physical_plan_tpch_q6                         1.00  1479.7±35.88µs        ? ?/sec    1.09  1616.7±64.90µs        ? ?/sec
physical_plan_tpch_q7                         1.00      5.5±0.11ms        ? ?/sec    1.03      5.7±0.08ms        ? ?/sec
physical_plan_tpch_q8                         1.00      7.2±0.10ms        ? ?/sec    1.02      7.3±0.09ms        ? ?/sec
physical_plan_tpch_q9                         1.00      5.5±0.09ms        ? ?/sec    1.01      5.5±0.07ms        ? ?/sec
physical_select_all_from_1000                 1.00     58.7±0.33ms        ? ?/sec    1.04     61.1±0.87ms        ? ?/sec
physical_select_one_from_700                  1.00      3.6±0.03ms        ? ?/sec    1.02      3.6±0.05ms        ? ?/sec

@github-actions github-actions bot added logical-expr Logical plan and expressions optimizer Optimizer rules labels May 8, 2024
@@ -250,90 +265,67 @@ fn find_inner_join(
}))
}

fn intersect(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was moved into JoinExprSet

accum.push((*left.clone(), *right.clone()));
}
// insert handles ensuring we don't add the same Join keys multiple times
join_keys.insert(left, right);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JoinExprSet does not require cloning to check if the exprs should be inserted

match op {
Operator::Eq => {
if join_keys.contains(&(*left.clone(), *right.clone()))
|| join_keys.contains(&(*right.clone(), *left.clone()))
Copy link
Contributor Author

@alamb alamb May 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there were several clones (deep copies) here too to simply check if the the join keys contained the expressions which have been removed.

///
/// 1. Retains insert order
/// 2. Can quickly look up if a pair of expressions are in the set.
#[derive(Debug)]
Copy link
Contributor Author

@alamb alamb May 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I extracted all the relevant behavior for managing join keys into this file both to document it as well as to allow checking if a join key is present without having to clone Exprs

/// # Returns
/// * `Some()` when there are few remaining predicates in filter_expr
/// * `None` otherwise
fn remove_join_expressions(expr: Expr, join_keys: &JoinKeySet) -> Option<Expr> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This API previously cloned on all paths (including to check join keys) -- the new API does not

Also it never returns Err so I changed the signature to Option from Result<Option<..>>

@alamb alamb changed the title Remove some Expr clones in EliminateCrossJoin Remove some Expr clones in EliminateCrossJoin(2%-5% faster planning) May 9, 2024
@alamb alamb changed the title Remove some Expr clones in EliminateCrossJoin(2%-5% faster planning) Remove some Expr clones in EliminateCrossJoin(3%-5% faster planning) May 9, 2024
@alamb alamb force-pushed the alamb/less_clone_in_eliminate_cross_join branch from 75ef06a to cdda848 Compare May 9, 2024 16:58
@@ -131,7 +131,7 @@ impl OptimizerRule for EliminateCrossJoin {
.map(|f| Some(LogicalPlan::Filter(f)))
} else {
// Remove join expressions from filter:
match remove_join_expressions(predicate, &all_join_keys)? {
match remove_join_expressions(predicate.clone(), &all_join_keys) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This clone used to happen inside remove_join_expressions so I moved it up so it is clearer (and sets the stage to avoid the clone entirely in the next PR)

@@ -150,7 +150,7 @@ impl OptimizerRule for EliminateCrossJoin {
/// Returns a boolean indicating whether the flattening was successful.
fn try_flatten_join_inputs(
plan: &LogicalPlan,
possible_join_keys: &mut Vec<(Expr, Expr)>,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the core change in this PR -- instead of passing around &HashSet or &Vec(), they are encapsulated into a struct now which is much more careful to clone only when needed

@alamb alamb marked this pull request as ready for review May 9, 2024 17:01
possible_join_keys: &[(Expr, Expr)],
all_join_keys: &mut HashSet<(Expr, Expr)>,
possible_join_keys: &JoinKeySet,
all_join_keys: &mut JoinKeySet,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that is nice unification


/// Return an iterator over the join keys in this set
pub fn iter(&self) -> impl Iterator<Item = (&Expr, &Expr)> {
self.inner.iter().map(|(l, r)| (l, r))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need this map here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes: It is (annoying) subtle -- this transforms a iterator of &(Expr, Expr) to an iterator of (&Expr, &Expr)

It is somewhat annoying that Rust does this conversion implicitly in the map closure but can't do it when in the return type

@github-actions github-actions bot removed the logical-expr Logical plan and expressions label May 10, 2024
Copy link
Contributor

@comphead comphead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that is 💪

Copy link
Contributor

@comphead comphead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm thanks @alamb

@comphead comphead merged commit 1eff714 into apache:main May 11, 2024
23 checks passed
@alamb
Copy link
Contributor Author

alamb commented May 11, 2024

Thanks for the review @comphead 🚀

@alamb alamb deleted the alamb/less_clone_in_eliminate_cross_join branch May 13, 2024 12:05
findepi pushed a commit to findepi/datafusion that referenced this pull request Jul 16, 2024
apache#10430)

* Remove some Expr clones in `EliminateCrossJoin`

* Apply suggestions from code review

Co-authored-by: comphead <[email protected]>

* fix

---------

Co-authored-by: comphead <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
optimizer Optimizer rules
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants