-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix: array list values are leaked on nested unnest
operators
#10689
Conversation
unnest
operators
// If there is only one list column to unnest and it doesn't contain any NULL lists, | ||
// we can return the values array directly without any copying. | ||
if typed_arrays.len() == 1 && typed_arrays[0].null_count() == 0 { | ||
Ok(vec![typed_arrays[0].values().clone()]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this will return the underlying low level array, which include the row indices filtered out by limit/offset
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch ! Thanks @duongcongtoai
Another simple way to reproduce it is
select unnest(column1) from (select * from (values([1,2,3]), ([4,5,6])) limit 1);
I'm not sure if this is a bug in arrow-rs or by design. It only slices the offset data of the ListArray
, without slicing the child data.
I think current impl is correct, because the offset values after slicing is still offsetted based on original array. Else we have to do shifting for each offset elements like
|
I think the shifting operation should be done internally in arrow-rs if needed, as it depends on what the data in the values looks like. Moreover, we don't need to do shifting for FixedSizeListArray. The current solution looks good to me. |
…e#10689) * chore: fix issue + add test * simplify test case
Which issue does this PR close?
Closes #10672 .
Rationale for this change
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?