cdc: Export should emit "done" indicator #110985

Open
miretskiy opened this issue Sep 20, 2023 · 6 comments · May be fixed by #141880
Labels
A-cdc Change Data Capture C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) E-starter Might be suitable for a starter project for new employees or team members. good first issue T-cdc

Comments

@miretskiy
Contributor

miretskiy commented Sep 20, 2023

It would be useful if a CDC export emitted a "done" indicator.
This may be as simple as allowing the "resolved" option to be used with export, so that
a final "resolved" marker file/message is emitted before the changefeed terminates.

Jira issue: CRDB-31703

@miretskiy miretskiy added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) good first issue E-starter Might be suitable for a starter project for new employees or team members. A-cdc Change Data Capture T-cdc labels Sep 20, 2023
@blathers-crl

blathers-crl bot commented Sep 20, 2023

cc @cockroachdb/cdc

@733amir

733amir commented Sep 21, 2023

I'm not sure what the issue is, but I can help solve it.

@miretskiy
Contributor Author

miretskiy commented Sep 21, 2023

When you start a CDC export (e.g. CREATE CHANGEFEED INTO s3... WITH initial_scan=only), the only way to know when it completes is to query the job status. But you may not want consumers to depend on that (maybe they don't even know the job ID). You may want a consumer to be able to determine whether the export finished just by looking at the S3 bucket/directory, and right now that is very hard to tell. So the idea would be to emit a marker file, "export.done" or some such, to indicate that the export completed, so that the consumer can simply watch the directory until the file shows up.

One way to accomplish this is to allow the "resolved" option to be used when the initial_scan=only option is specified.

@733amir

733amir commented Sep 24, 2023

According to the documentation, there are multiple sinks, and S3 is one of them. How should this "done" indicator work with each sink?

@miretskiy
Contributor Author

Every sink supports the same interface. For example, EmitResolvedMessage emits a resolved message into any sink.
For file-based sinks, it writes out a file; for message-based sinks (Kafka, etc.), it sends a message.
The "done" indicator would work similarly.

@xelathan xelathan linked a pull request Feb 22, 2025 that will close this issue
@xelathan

@miretskiy Submitted a PR for this issue, if it's still relevant.


3 participants