Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Idea: BytesMut::from_reader #770

Closed
carlsverre opened this issue Feb 18, 2025 · 4 comments
Closed

Idea: BytesMut::from_reader #770

carlsverre opened this issue Feb 18, 2025 · 4 comments

Comments

@carlsverre
Copy link

Is there any interest in a convenience method for initializing a BytesMut from a std::io::Reader with an explicit length? Something like this:

    #[cfg(feature = "std")]
    pub fn from_reader(mut reader: impl std::io::Read, len: usize) -> Result<Self, std::io::Error> {
        let mut bytes = Self::with_capacity(len);

        // SAFETY: we just allocated `len` bytes, and `read_exact` will fail if
        // it doesn't fill the buffer, subsequently dropping the uninitialized
        // BytesMut object
        unsafe {
            bytes.set_len(len);
        }

        reader.read_exact(&mut bytes)?;

        Ok(bytes)
    }

The rational is to allow forbid(unsafe) crates to safely initialize a BytesMut object from a reader given a known length. I'm using this technique in this PR on the fjall/value-log project to improve performance over using BytesMut::zeroed by roughly 50%: https://github.com/fjall-rs/value-log/pull/22/files

The main issue I see with this request/idea is that read_exact can be dangerous to use due to it internally blocking forever on reads. In fjall's case it makes sense, but I could see this not being the best choice in many cases. But I figured it would be worth bringing up and seeing what the community thinks!

In other news, huge thanks to the maintainers and contributers of bytes for such a great project!

@Darksonn
Copy link
Contributor

The proposed implementation is incorrect. The reader could read the uninitialized bytes, which is not allowed.

@carlsverre
Copy link
Author

That's a good point. It would be nice if there was a way to explicitly copy into a slice of uninit to ensure that the read impl doesn't do anything naughty. I suppose I could rewrite the read_exact logic in this function to avoid the read impl being able to read the buf. Def open to any advice! And also to be clear, it's only UB if the read impl actually reads from the buffer right? If the impl only copies to the buffer there is no UB.

@marvin-j97
Copy link

marvin-j97 commented Feb 18, 2025

There is a https://doc.rust-lang.org/std/io/trait.Read.html#method.read_buf_exact method:

This is similar to the read_exact method, except that it is passed a BorrowedCursor rather than [u8] to allow use with uninitialized buffers.

However, it's nightly: rust-lang/rust#78485

@Darksonn
Copy link
Contributor

Reading into uninitialized bytes requires the unstable API that marvin metioned. However, in async code, Tokio's AsyncRead trait comes with methods that can do it.

@Darksonn Darksonn closed this as not planned Won't fix, can't repro, duplicate, stale Feb 18, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants