-
-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perf: try to avoid buffer allocations #1998
Conversation
const net = require('net')
const server = net.createServer((socket) => {
socket.resume()
socket.on('error', () => {})
}).listen(0)
const socket = net.connect(server.address().port, '0.0.0.0', () => {
let str = '1'.repeat(4096)
for (let n = 0; n < 1024*128; ++n) {
// socket.write(str)
socket.write(toBuffer(str))
}
socket.end()
socket.on('finish', () => {
socket.on('error', () => {})
server.close()
})
})
function toBuffer(data) {
toBuffer.readOnly = true;
if (Buffer.isBuffer(data)) return data;
let buf;
if (data instanceof ArrayBuffer) {
buf = Buffer.from(data);
} else if (ArrayBuffer.isView(data)) {
buf = Buffer.from(data.buffer, data.byteOffset, data.byteLength);
} else {
buf = Buffer.from(data);
toBuffer.readOnly = false;
}
return buf;
}
It's more than 2x faster without |
8875b42
to
b163d6c
Compare
lib/sender.js
Outdated
const buf = | ||
perMessageDeflate || typeof data !== 'string' ? toBuffer(data) : data; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The buf.length >= perMessageDeflate._threshold
check below might not work as expected when data
is a string because it does not use the string bytes length.
Also, when data
is a string and permessage-deflate is enabled, toBuffer()
should ideally be called only when the message will actually be compressed (this._compress === true
).
It would be better to make this permessage-deflate agnostic. Removing the perMessageDeflate
check here and passing the string to perMessageDeflate.compress()
should work. The problem is that enqueue()
and dequeue()
currently use the buffer length to update the _bufferedBytes
value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That branch is only used if perMessageDeflate
is true; in which case data
is converted to a buffer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While I agree that we could improve it, I don't think there is an issue here currently. Only that it's a little hard to read.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct. Anyway atm this improvement only works when permessage-deflate is disabled. What about websocket.ping()
and websocket.pong()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about websocket.ping() and websocket.pong()?
Don't see them as a hot path.
I think the benchmark is a bit fabricated. It writes 512 MiB synchronously by calling I don't think this is realistic. |
I mean the point here is all of those buffer allocations are not necessary. The strings are already allocated. This just demonstrates the overhead of creating all of those buffers that you would otherwise not create. Which is basically what happens in production when you are sending a large amount of strings to ws. i.e. what this PR is trying to do is to avoid some of the overhead of going string -> Buffer. Hence the benchmark should only show the difference between doing the conversion in js before writing to the socket instance or inside the native writeGeneric.
While it could be better maybe I don't see why it wouldn't be realistic. It's basically what's happening in my case, although not with quite such large values as 512 MiB. |
Even with smaller values e.g. 4 MiB there is improvement:
|
Added a cronometro benchmark
|
34cc4e0
to
8ad1e6d
Compare
Without pre-allocating the string:
|
with
The problem with this is that the difference is only the result of the huge amount of buffered data in the socket writable queue. The difference should not be noticeable if data is regularly flowing, because the number of allocations would basically be 1:1. |
Sorry. I don't understand? We're using toBuffer just as it would with ws.
This is with 4MiB of back pressure, which is not unrealistic. |
I don't quite see why you think the benchmark is not relevant. According to you, how would a "correct" benchmark look? |
Yes but we could optimize it for the string case instead of dealing with the string case only at the end. |
Ok updated benchmark:
|
0555b10
to
16a7034
Compare
It's 15% on my machine but only 3% with |
On an EPYC server + Node v17.3.0:
|
e9a4ca2
to
c26cb06
Compare
How to make cronometro wait for the socket to be connected? |
cronometro does a warmup run so waiting for the socket doesn't really make any difference. |
Ok, always sending the same string does not make sense and invalidates the test. |
Added another one. See above. |
That's a ~5% improvement. |
and reduced gc pressure |
btw, I kind of did manage to resolve that with the current version by setting |
Yes, it's a small improvement that has some readability and maintenance drawbacks. We can move this forward but I want to make it general and handle Feel free to reopen and do that. I might open a new PR for this later. |
Anyway, thank you for bearing with me. |
Something like this? |
a5cf313
to
c285e36
Compare
c285e36
to
3bd436c
Compare
Tried cleaning it up a little more with proper commit msgs. A little exhausted on this one ATM. Hope this can be included in some form. Feel free to take over. |
Do not convert strings to `Buffer`s if data does not need to be masked. Refs: #1998
Do not convert strings to `Buffer`s if data does not need to be masked. Refs: #1998
Do not convert strings to `Buffer`s if data does not need to be masked. Refs: #1998
Do not convert strings to `Buffer`s if data does not need to be masked. Refs: #1998
Do not convert strings to `Buffer`s if data does not need to be masked. Refs: #1998
Do not convert strings to `Buffer`s if data does not need to be masked. Refs: #1998
No description provided.