-
-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
faster Slice::from_reader for bytes feature #22
Conversation
I think I'm fine with the unsafe implementation, especially because it's just 1 line. |
I'm wondering how unsafe Bytes is twice as fast in the benchmarks ("ctor_long from reader") than Byteview, considering Byteview is also not zeroed out. It does have to copy the prefix, but that shouldn't end up costing 10ns I think. |
I can certainly bring it up with the bytes folks to see what they think, but yea it does feel a bit niche. As for getting this into fjall, how would you like me to handle the unsafe aspect? It seems like a bit of a shame to disable
It's very curious! My suspicion is something to do with how the code ends up being optimized. It appears that the bytes code ends up reducing to a bunch of stack manipulation and basically one malloc and one memcpy in the happy path: https://godbolt.org/z/oKv51zKj9 Might be interesting to check ByteView with Godbolt to compare. Unfortunately will require copying in relevant portions of the ByteView code since Godbolt doesn't support importing arbitrary crates afaik. |
Hm I would say a separate crate is probably a bit too overkill - my preference would be a safe abstraction in |
Got a ~20% improvement by rearranging the code a bit. |
Wow nice job! That's a sweet improvement for a small change. I'll toggle it for now and then raise an issue to see if it can be upstreamed. |
Raised issue here: tokio-rs/bytes#770 |
After reading the 2.6 launch blog post (which is amazing btw) I was inspired to figure out if I could switch over to ByteView for my project. Unfortunately due to some inflexibility in the design of prost/axum it still leaves too much on the table. So then I revisited my bytes implementation to see if I could improve it at all.
One point in the blog post stuck in my mind: zeroing out the buffer. After some investigation I came up with two approaches (with/without unsafe) that eliminate the need to zero out the buffer.
I benchmarked both approaches in the ByteView crate: fjall-rs/byteview#8
Based on the bench results, I'd like to include the unsafe version. However I realize this crate forbid's unsafe code. So I guess, I'd like to know what you think and if we can find a way to get this into value-log.
Much appreciated as always!