-
Notifications
You must be signed in to change notification settings - Fork 10.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allocate fewer objects #4447
Allocate fewer objects #4447
Conversation
Needs a rebase. |
@nnethercote Is #3769 related to this PR (seeing nnethercote@42bb602)? |
Tim: the first rule of microbenchmarks: you shouldn't trust microbenchmarks. It's hard to write good ones, and they rarely measure what they're meant to measure. Here's an uninteresting question: is a.length=0 slower than a=[] in a microbenchmark? Here's an interesting questions: does the use of a.length=0 rather than a=[] make pdf.js noticeably slower? I'm confident the answer is "no", though pdf.js lacks good performance benchmarks (AIUI) so it's hard to say for certain. But from my understanding of JS engines, zeroing the length should be no slower -- you're possibly deallocating a slots arrays, but that would happen later during GC anyway. And avoiding the allocation saves memory and reduces the amount of garbage, thus delaying the time until the next GC occurs. |
That sounds good. Thank you for the explanation; that adds context to #3769 which I found interesting. |
I've used the length=0 trick in some of my other patches. In principle I don't think there's anything wrong with #3769's patch, but I haven't seen that code show up as hot in any of my profiling. |
I rebased and repushed. |
@nnethercote Travis is complaining a bit: src/core/evaluator.js: line 1441, col 39, Expected an assignment or function call and instead saw an expression. src/core/evaluator.js: line 1441, col 41, Missing semicolon. |
As in #4368, I will not recommend touching stuff near calcRenderParams. We will use all arguments in the matrix. Temporary commenting will not solve the problem in the future. Not sure about preprocessor args optimization as well for the same reason, also readability of the evaluators loops makes it unmaintainable imho. I like getBytes2/getBytes4 stuff, but let's rename them into getUint16 and getInt32 though. @brendandahl ? |
The patches are ordered by effectiveness. So the calcRenderParams one isn't so important, but the Preprocessor.read() one is -- even after applying the patch, which reduces the object allocations from those two loops by 75%, the remaining 25% still represents the biggest single source of allocations in the entire codebase. In some documents its accounts for over 100,000 object allocations, typically 10--15% of all object allocations. Remove that optimization and the number quadruples. |
This is achieved by adding getBytes2() and getBytes4() to streams, and by changing int16() and int32() to take multiple scalar args instead of an array arg.
I renamed getBytes{2,4}() as getUint{16,32}, and removed the patches modifying W.r.t. the latter, I have a plan to restructure the way data is passed from the |
/botio test |
From: Bot.io (Linux)ReceivedCommand cmd_test from @brendandahl received. Current queue size: 0 Live output at: http://107.21.233.14:8877/320c74ce660ee00/output.txt |
From: Bot.io (Windows)ReceivedCommand cmd_test from @brendandahl received. Current queue size: 1 Live output at: http://107.22.172.223:8877/141c7086fb7f371/output.txt |
From: Bot.io (Linux)SuccessFull output at http://107.21.233.14:8877/320c74ce660ee00/output.txt Total script time: 25.76 mins
|
From: Bot.io (Windows)SuccessFull output at http://107.22.172.223:8877/141c7086fb7f371/output.txt Total script time: 37.02 mins
|
I have a hacked version of Firefox that identifies where object allocations in JS code occurs. With this, I've created a series of patches that reduces the number of allocations that occur when loading/viewing a PDF by up to 50%. All these eliminated allocations are of small, short-lived objects, which might not seem important, but actually are, because it reduces pressure on the garbage collector.
Of the eight documents I've tested, five of them had their peak RSS reduced by 15--40 MiB. And all the changes are pretty simple.