Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

faster Slice::from_reader for bytes feature #22

Merged
merged 3 commits into from
Feb 19, 2025

Conversation

carlsverre
Copy link
Contributor

After reading the 2.6 launch blog post (which is amazing btw) I was inspired to figure out if I could switch over to ByteView for my project. Unfortunately due to some inflexibility in the design of prost/axum it still leaves too much on the table. So then I revisited my bytes implementation to see if I could improve it at all.

One point in the blog post stuck in my mind: zeroing out the buffer. After some investigation I came up with two approaches (with/without unsafe) that eliminate the need to zero out the buffer.

I benchmarked both approaches in the ByteView crate: fjall-rs/byteview#8

Based on the bench results, I'd like to include the unsafe version. However I realize this crate forbid's unsafe code. So I guess, I'd like to know what you think and if we can find a way to get this into value-log.

Much appreciated as always!

@marvin-j97 marvin-j97 added performance enhancement New feature or request labels Feb 16, 2025
@marvin-j97
Copy link
Contributor

marvin-j97 commented Feb 16, 2025

I think I'm fine with the unsafe implementation, especially because it's just 1 line.
It would be great if from_reader could be a function native to Bytes, but I understand it's probably quite a niche (though useful) use case.

@marvin-j97
Copy link
Contributor

marvin-j97 commented Feb 16, 2025

I'm wondering how unsafe Bytes is twice as fast in the benchmarks ("ctor_long from reader") than Byteview, considering Byteview is also not zeroed out. It does have to copy the prefix, but that shouldn't end up costing 10ns I think.
Byteview does not zero out the initial struct, not sure if that makes a noticable performance difference.

@carlsverre
Copy link
Contributor Author

I think I'm fine with the unsafe implementation, especially because it's just 1 line. It would be great if from_reader could be a function native to Bytes, but I understand it's probably quite a niche (though useful) use case.

I can certainly bring it up with the bytes folks to see what they think, but yea it does feel a bit niche. As for getting this into fjall, how would you like me to handle the unsafe aspect? It seems like a bit of a shame to disable #[forbid(unsafe)] when the bytes feature is enabled. I could create the worlds smallest rust crate that just provides this function if you'd like. Let me know how you'd like me to proceed!

I'm wondering how unsafe Bytes is twice as fast in the benchmarks ("ctor_long from reader") than Byteview, considering Byteview is also not zeroed out.

It's very curious! My suspicion is something to do with how the code ends up being optimized. It appears that the bytes code ends up reducing to a bunch of stack manipulation and basically one malloc and one memcpy in the happy path: https://godbolt.org/z/oKv51zKj9

Might be interesting to check ByteView with Godbolt to compare. Unfortunately will require copying in relevant portions of the ByteView code since Godbolt doesn't support importing arbitrary crates afaik.

@marvin-j97
Copy link
Contributor

Hm I would say a separate crate is probably a bit too overkill - my preference would be a safe abstraction in bytes, but not sure if that would get added in. The only alternative would be toggling the allow unsafe rule...

@marvin-j97
Copy link
Contributor

marvin-j97 commented Feb 17, 2025

My suspicion is something to do with how the code ends up being optimized

Got a ~20% improvement by rearranging the code a bit.

@carlsverre
Copy link
Contributor Author

Wow nice job! That's a sweet improvement for a small change.

I'll toggle it for now and then raise an issue to see if it can be upstreamed.

@carlsverre
Copy link
Contributor Author

Raised issue here: tokio-rs/bytes#770

@marvin-j97 marvin-j97 merged commit b140e97 into fjall-rs:main Feb 19, 2025
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request performance
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants