refactor: use native execs instead of custom execs #1262

universalmind303 · 2023-07-06T17:26:25Z

For us to be able to serialize the plans via protobuf, it makes more sense to use the native execs when possible.

This PR replaces our custom file execs with the existing DataFusion execs.

The tricky part is that we now have to manage an object store registry that maps to and from URLs.
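To make the registry idea concrete, here is a minimal std-only sketch of a registry keyed by the `scheme://authority` prefix of a URL. It is not DataFusion's registry API and not the code in this PR; `Store`, `Registry`, and `store_key` are hypothetical names used only for illustration.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Stand-in for a real object store client (hypothetical type).
#[derive(Debug, PartialEq)]
pub struct Store {
    pub kind: &'static str,
}

/// Reduce a URL to its `scheme://authority` prefix, which serves as the
/// registry key. Returns `None` if the input has no `://` separator.
pub fn store_key(url: &str) -> Option<String> {
    let (scheme, rest) = url.split_once("://")?;
    // The authority ends at the first '/', '?' or '#'.
    let authority = rest
        .split(|c: char| matches!(c, '/' | '?' | '#'))
        .next()
        .unwrap_or("");
    Some(format!("{}://{}", scheme.to_ascii_lowercase(), authority))
}

/// Maps URL prefixes to shared store handles.
#[derive(Default)]
pub struct Registry {
    stores: HashMap<String, Arc<Store>>,
}

impl Registry {
    /// Register `store` for every URL sharing the prefix of `url`.
    /// Returns false if `url` could not be reduced to a key.
    pub fn register(&mut self, url: &str, store: Arc<Store>) -> bool {
        match store_key(url) {
            Some(key) => {
                self.stores.insert(key, store);
                true
            }
            None => false,
        }
    }

    /// Look up the store responsible for `url`, if one is registered.
    pub fn get(&self, url: &str) -> Option<Arc<Store>> {
        self.stores.get(&store_key(url)?).cloned()
    }
}
```

With this shape, registering a store once for `gs://bucket/a.csv` lets any later path under `gs://bucket` resolve to the same handle, which is the "maps to/from URLs" property the plan relies on.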

crates/sqlexec/src/planner/session_planner.rs

universalmind303 · 2023-07-07T17:59:15Z

@scsmithr I think this is mostly ready for review. I still have some cleanup to do and need to add S3 support, but the pattern seems to work.

I tested via

select * from csv_scan('<LOCAL FILE>');
select * from csv_scan('<HTTP URL>');
select * from csv_scan('<gcs bucket>', '<CREDS>');
create external table test from gcs options ...;
select * from test;
select * from parquet_scan('<LOCAL FILE>');
select * from parquet_scan('<HTTP URL>');
select * from parquet_scan('<gcs bucket>', '<CREDS>');

They all use the native DataFusion execs.

universalmind303 · 2023-07-07T18:07:07Z

Since these now use the native execs, the serialized exec should carry all of the information a remote server needs to deserialize it and create the appropriate object store. Previously, it didn't contain any information about the URL or the object store.
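As a rough sketch of what the receiving side could do with that URL, the function below picks a store backend from the scheme alone. The `StoreKind` enum, the `store_kind_for` name, and the `gs` scheme for GCS are assumptions for illustration; only the three backends tested above are covered, since S3 is not supported yet.

```rust
/// Backends a remote node might need to construct (hypothetical enum).
#[derive(Debug, PartialEq)]
pub enum StoreKind {
    Local,
    Http,
    Gcs,
}

/// Choose a backend from the URL scheme carried in a deserialized plan.
/// Returns `None` for inputs without `://` or with an unknown scheme.
pub fn store_kind_for(url: &str) -> Option<StoreKind> {
    let (scheme, _rest) = url.split_once("://")?;
    match scheme.to_ascii_lowercase().as_str() {
        "file" => Some(StoreKind::Local),
        "http" | "https" => Some(StoreKind::Http),
        // Assumption: GCS locations are written with the `gs` scheme.
        "gs" => Some(StoreKind::Gcs),
        _ => None,
    }
}
```

Credentials are deliberately left out here; a real implementation would also need them to travel with the plan or be resolved on the remote side.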

scsmithr · 2023-07-07T18:43:31Z

We're likely going to need to repopulate the object stores here too: https://github.com/glaredb/glaredb/blob/2141e562f0cc80b6d4f01566a6c11efd55a41f86/crates/sqlexec/src/context.rs#L560-L568

That code path runs when another node updates the catalog and the node the session is currently running on picks up the newer catalog.
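One way to picture the repopulation step is a version check followed by a rebuild of the registered prefixes. This is only a sketch under assumed names: `CatalogSnapshot`, `SessionStores`, and `bucket_key` are hypothetical and do not mirror the types in `context.rs`.

```rust
use std::collections::HashSet;

/// Hypothetical catalog snapshot: a version plus external table URLs.
pub struct CatalogSnapshot {
    pub version: u64,
    pub table_urls: Vec<String>,
}

/// Hypothetical per-session record of which store prefixes are registered.
#[derive(Default)]
pub struct SessionStores {
    pub catalog_version: u64,
    pub registered: HashSet<String>,
}

/// Reduce a URL to `scheme://authority`; `None` if there is no `://`.
fn bucket_key(url: &str) -> Option<String> {
    let (scheme, rest) = url.split_once("://")?;
    let authority = rest.split('/').next().unwrap_or("");
    Some(format!("{}://{}", scheme, authority))
}

impl SessionStores {
    /// Rebuild the registered prefixes when `snapshot` is newer than the
    /// version this session last saw. Returns true if a rebuild happened.
    pub fn refresh(&mut self, snapshot: &CatalogSnapshot) -> bool {
        if snapshot.version <= self.catalog_version {
            return false;
        }
        // Replace rather than merge, so prefixes of dropped tables go away.
        self.registered = snapshot
            .table_urls
            .iter()
            .filter_map(|u| bucket_key(u))
            .collect();
        self.catalog_version = snapshot.version;
        true
    }
}
```

Replacing the whole set keeps the sketch simple; whether stale stores should actually be dropped or kept around for in-flight queries is a separate design question.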

universalmind303 mentioned this pull request Jul 6, 2023

Support object storage for *_scan functions #1249

Closed

universalmind303 added 5 commits July 7, 2023 12:46

refactor: use native execs instead of custom execs

3930006

wip

c1971a1

wip

1e78cec

run clippy & linting

5dea2d2

wip

70bc689

universalmind303 force-pushed the universalmind303/exec-refactor branch from 7e0a6c5 to 70bc689 Compare July 7, 2023 17:47

remove comment

bc988cb

universalmind303 added 2 commits July 7, 2023 13:44

chore: code cleanup

ecb9ac2

chore: code cleanup

d5fa994

universalmind303 marked this pull request as ready for review July 7, 2023 21:55

universalmind303 requested a review from scsmithr July 7, 2023 21:55

scsmithr approved these changes Jul 7, 2023

universalmind303 merged commit 17996a9 into main Jul 7, 2023

universalmind303 deleted the universalmind303/exec-refactor branch July 7, 2023 22:36
