AsyncRead/AsyncWrite Poisoning Behaviour #5437

Open
tustvold opened this issue Feb 27, 2024 · 1 comment
Labels
enhancement Any new improvement worthy of a entry in the changelog object-store Object Store Interface

Comments

@tustvold
Contributor

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

Currently, wherever ObjectStore exposes APIs in terms of tokio's AsyncWrite and AsyncRead, any error poisons the entire operation. Subsequent attempts to read or write will likely result in a panic. This is not well documented, and may not be ideal.

Describe the solution you'd like

At the very least we should document the current behaviour, but it is unclear, at least to me, what the "correct" behaviour here even is:

AsyncWrite::poll_write returns once the bytes have been "written" to the writer, which may only mean they were copied into an in-flight buffer, see here. In the case of WriteMultiPart this means AsyncWrite::poll_write returns Ok before any network request has been made to actually write the data to object storage.

Any errors will therefore be surfaced in AsyncWrite::poll_flush or AsyncWrite::poll_shutdown, which presents a few problems:

  • The PutPart implementation already retries intermittent errors based on the RetryConfig, so any error that remains must be surfaced to the user
  • It is unclear how the caller can determine from the error what byte range needs to be retried, as part uploads are chunked and parallel
  • It is unclear how the caller could retry this byte range even if it could be ascertained
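The deferred-error behaviour described above can be reproduced with nothing but the standard library. The following is a minimal synchronous sketch, not object_store's code: `std::io::BufWriter` stands in for the in-flight buffer, and `AlwaysFails` and `write_then_flush` are hypothetical names introduced for illustration.

```rust
use std::io::{self, BufWriter, Write};

/// Hypothetical sink that rejects every write, standing in for a failed upload.
struct AlwaysFails;

impl Write for AlwaysFails {
    fn write(&mut self, _buf: &[u8]) -> io::Result<usize> {
        Err(io::Error::new(io::ErrorKind::Other, "simulated upload failure"))
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Returns (write succeeded, flush succeeded).
fn write_then_flush() -> (bool, bool) {
    let mut writer = BufWriter::with_capacity(1024, AlwaysFails);
    // The data fits in the buffer, so the sink is not touched and no error is seen.
    let write_ok = writer.write(b"hello").is_ok();
    // Only now are the bytes pushed to the sink, and the failure surfaces.
    let flush_ok = writer.flush().is_ok();
    (write_ok, flush_ok)
}
```

Calling `write_then_flush()` reports a successful write followed by a failed flush, which is the same shape of problem as `poll_write` returning Ok and the error only appearing in `poll_flush` or `poll_shutdown`.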

This all makes me think that the current behaviour is probably the best we can do, short of not using the tokio IO traits, but I wonder if others have any thoughts on this.

Describe alternatives you've considered

Additional context

@tustvold tustvold added enhancement Any new improvement worthy of a entry in the changelog object-store Object Store Interface labels Feb 27, 2024
@tustvold
Contributor Author

tustvold commented Feb 27, 2024

One option might be to return the error, but also re-enqueue the operation to run again. That way if polled again it will effectively just try the operation again 🤔

This would be similar to how std::io::BufWriter handles this particular scenario.
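To make the `std::io::BufWriter` comparison concrete, here is a small sketch of that behaviour: when a flush fails, the unwritten bytes stay in the buffer, so flushing again retries them. `FailsOnce` and `flush_twice` are hypothetical names used only for this illustration, and the example is synchronous rather than poll-based.

```rust
use std::io::{self, BufWriter, Write};

/// Hypothetical sink that rejects its first write and accepts later ones.
struct FailsOnce {
    failed: bool,
    received: Vec<u8>,
}

impl Write for FailsOnce {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        if !self.failed {
            self.failed = true;
            return Err(io::Error::new(io::ErrorKind::Other, "simulated transient failure"));
        }
        self.received.extend_from_slice(buf);
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Returns (first flush ok, second flush ok, bytes that reached the sink).
fn flush_twice() -> (bool, bool, Vec<u8>) {
    let sink = FailsOnce { failed: false, received: Vec::new() };
    let mut writer = BufWriter::new(sink);
    // Small writes are only copied into the buffer, so this cannot fail here.
    writer.write_all(b"hello").expect("buffered write does not touch the sink");
    // The first flush hits the simulated failure; the data stays buffered.
    let first = writer.flush().is_ok();
    // The second flush retries the same bytes and succeeds.
    let second = writer.flush().is_ok();
    let received = writer.get_ref().received.clone();
    (first, second, received)
}
```

The first flush fails and the second delivers the full payload, so no data is lost by the error itself. Whether the same approach is practical for chunked, parallel part uploads is the open question in this issue.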

It would then be conceivable for code to retry on write error, although it is rather convoluted:

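use std::pin::Pin;
use std::task::Poll;

use futures::Stream;
use tokio::io::AsyncWrite;

// Yields the result of every poll_write call. After an Err the caller may keep
// polling the stream, which retries the bytes that have not been accepted yet.
// This is a plain fn: the returned stream is already lazy, so no async is needed.
fn write_all_with_retry<'a, W: AsyncWrite + Unpin>(
    writer: &'a mut W,
    mut buf: &'a [u8],
) -> impl Stream<Item = std::io::Result<usize>> + 'a {
    futures::stream::poll_fn(move |cx| {
        if buf.is_empty() {
            return Poll::Ready(None);
        }
        Poll::Ready(Some(
            match futures::ready!(Pin::new(&mut *writer).poll_write(cx, buf)) {
                Ok(x) => {
                    // Advance past the bytes the writer accepted.
                    buf = &buf[x..];
                    Ok(x)
                }
                // buf is left untouched, so the next poll retries the same bytes.
                Err(e) => Err(e),
            },
        ))
    })
}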
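The stream above leaves the retry policy to whoever consumes it. As a rough illustration of what such a policy might look like, here is a synchronous, std-only sketch with a bounded retry count. It assumes the writer is not poisoned by an error, which is the proposed behaviour rather than the current one. `Flaky`, `write_all_with_retry_sync` and `max_retries` are hypothetical names introduced for this example.

```rust
use std::io::{self, Write};

/// Hypothetical writer that fails its first `failures` calls, then accepts data.
struct Flaky {
    failures: usize,
    written: Vec<u8>,
}

impl Write for Flaky {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        if self.failures > 0 {
            self.failures -= 1;
            return Err(io::Error::new(io::ErrorKind::Other, "simulated transient failure"));
        }
        self.written.extend_from_slice(buf);
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Writes all of `buf`, tolerating at most `max_retries` failed writes in total.
fn write_all_with_retry_sync<W: Write>(
    writer: &mut W,
    mut buf: &[u8],
    max_retries: usize,
) -> io::Result<()> {
    let mut retries = 0;
    while !buf.is_empty() {
        match writer.write(buf) {
            // A writer that accepts nothing would otherwise loop forever.
            Ok(0) => return Err(io::ErrorKind::WriteZero.into()),
            // Advance past the accepted bytes.
            Ok(n) => buf = &buf[n..],
            // Retry the same bytes while the budget lasts.
            Err(_) if retries < max_retries => retries += 1,
            // Budget exhausted: surface the error to the caller.
            Err(e) => return Err(e),
        }
    }
    Ok(())
}
```

With a writer that fails twice, a budget of three retries completes the write, while a budget of one returns the second error to the caller.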
