mutually_exclusive_ranges gives non-deterministic results in certain situations where zero-length ranges are allowed #659

sfc-gh-ancoleman · 2022-08-30T13:27:56Z

Describe the bug

The window functions used in the test only order by the lower_bound_column. If zero_length_ranges are allowed, it is possible for this to result in failure when it shouldn't.

Steps to reproduce

RECORD    STATUS    LOWER_BOUND            UPPER_BOUND  
1         good      2021-12-01 00:00:00    2022-01-01 00:00:00
2         ok        2022-01-01 00:00:00    2022-01-01 00:00:00
3         bad       2022-01-01 00:00:00    2022-02-01 00:00:00

Expected results

This data should pass a mutually_exclusive_ranges test with configuration

lower_bound_column: lower_bound
upper_bound_column: upper_bound
zero_length_range_allowed: true

and fail mutually_exclusive_ranges test with configuration

lower_bound_column: lower_bound
upper_bound_column: upper_bound
zero_length_range_allowed: false

Actual results

Both of the following are recordsets ordered by LOWER_BOUND. The database could produce either of the following orderings of the records given above.

Ordering A:

RECORD    STATUS    LOWER_BOUND            UPPER_BOUND  
1         good      2021-12-01 00:00:00    2022-01-01 00:00:00
2         ok        2022-01-01 00:00:00    2022-01-01 00:00:00
3         bad       2022-01-01 00:00:00    2022-02-01 00:00:00

Ordering B:

RECORD    STATUS    LOWER_BOUND            UPPER_BOUND  
1         good      2021-12-01 00:00:00    2022-01-01 00:00:00
3         bad       2022-01-01 00:00:00    2022-02-01 00:00:00
2         ok        2022-01-01 00:00:00    2022-01-01 00:00:00

If the first ordering is returned, the dataset will pass the test. If the 2nd ordering is returned, the dataset will fail the test. Which ordering is returned appears to be non-deterministic, at least in Snowflake.

Also, at least in Snowflake, there is no guarantee that the same ordering will be used in the different window functions. I.e., it's possible for next_lower_bound to use Ordering B, but is_last_record to use the reverse of Ordering A.

Screenshots and log output

System information

The contents of your packages.yml file:

packages:
  - package: dbt-labs/dbt_utils
    version: 0.8.4

Which database are you using dbt with?

The output of dbt --version:

installed version: 1.0.0
   latest version: 1.0.0

Up to date!

Plugins:
  - snowflake: 1.0.0

Additional context

Are you interested in contributing the fix?

Yes, I am preparing a PR to include the upper_bound_column as a second field in the order by, which would still not quite provide a deterministic ordering, but should provide orderings that are identical in every way that matters for the test.

The text was updated successfully, but these errors were encountered:

joellabes · 2022-09-05T11:25:08Z

This makes a ton of sense, thank you for the well-fleshed out backstory and test case! Off to have a look at the PR now.

sfc-gh-ancoleman added bug Something isn't working triage labels Aug 30, 2022

sfc-gh-ancoleman mentioned this issue Aug 30, 2022

Add upper_bound_column to order by #660

Merged

17 tasks

joellabes removed the triage label Sep 5, 2022

joellabes closed this as completed in #660 Sep 12, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

mutually_exclusive_ranges gives non-deterministic results in certain situations where zero-length ranges are allowed #659

mutually_exclusive_ranges gives non-deterministic results in certain situations where zero-length ranges are allowed #659

sfc-gh-ancoleman commented Aug 30, 2022

joellabes commented Sep 5, 2022

mutually_exclusive_ranges gives non-deterministic results in certain situations where zero-length ranges are allowed #659

mutually_exclusive_ranges gives non-deterministic results in certain situations where zero-length ranges are allowed #659

Comments

sfc-gh-ancoleman commented Aug 30, 2022

Describe the bug

Steps to reproduce

Expected results

Actual results

Screenshots and log output

System information

Additional context

Are you interested in contributing the fix?

joellabes commented Sep 5, 2022