
Allocate fewer objects #4447

Merged · 4 commits · Mar 17, 2014
Conversation

nnethercote
Contributor

I have a hacked version of Firefox that identifies where object allocations in JS code occur. With it, I've created a series of patches that reduce the number of allocations that occur when loading/viewing a PDF by up to 50%. All of the eliminated allocations are of small, short-lived objects, which might not seem important, but it is: it reduces pressure on the garbage collector.

Of the eight documents I've tested, five of them had their peak RSS reduced by 15--40 MiB. And all the changes are pretty simple.

@brendandahl
Contributor

Needs a rebase.

@timvandermeij
Contributor

@nnethercote Is #3769 related to this PR (seeing nnethercote@42bb602)?

@nnethercote
Contributor Author

Tim: the first rule of microbenchmarks: you shouldn't trust microbenchmarks. It's hard to write good ones, and they rarely measure what they're meant to measure.

Here's an uninteresting question: is `a.length = 0` slower than `a = []` in a microbenchmark?

Here's an interesting question: does the use of `a.length = 0` rather than `a = []` make pdf.js noticeably slower?

I'm confident the answer is "no", though pdf.js lacks good performance benchmarks (AIUI), so it's hard to say for certain. But from my understanding of JS engines, zeroing the length should be no slower -- you're possibly deallocating a slots array, but that would happen later during GC anyway. And avoiding the allocation saves memory and reduces the amount of garbage, thus delaying the time until the next GC occurs.

@timvandermeij
Contributor

That sounds good. Thank you for the explanation; that adds context to #3769 which I found interesting.

@nnethercote
Contributor Author

I've used the length=0 trick in some of my other patches. In principle I don't think there's anything wrong with #3769's patch, but I haven't seen that code show up as hot in any of my profiling.

@nnethercote
Contributor Author

I rebased and repushed.

@timvandermeij
Contributor

@nnethercote Travis is complaining a bit:

src/core/evaluator.js: line 1441, col 39, Expected an assignment or function call and instead saw an expression.
src/core/evaluator.js: line 1441, col 41, Missing semicolon.

@yurydelendik
Contributor

As in #4368, I would not recommend touching the code near calcRenderParams: we will eventually use all of the arguments in the matrix, so temporarily commenting them out doesn't solve the problem going forward. I'm not sure about the preprocessor args optimization either, for the same reason; it also hurts the readability of the evaluator loops, which makes them unmaintainable imho. I do like the getBytes2/getBytes4 stuff, but let's rename them to getUint16 and getInt32. @brendandahl ?

@nnethercote
Contributor Author

The patches are ordered by effectiveness. The calcRenderParams one isn't so important, but the Preprocessor.read() one is -- even after applying the patch, which reduces the object allocations from those two loops by 75%, the remaining 25% still represents the biggest single source of allocations in the entire codebase. In some documents it accounts for over 100,000 object allocations, typically 10--15% of all object allocations. Remove that optimization and the number quadruples.

This is achieved by adding getBytes2() and getBytes4() to streams, and by
changing int16() and int32() to take multiple scalar args instead of an array
arg.
@nnethercote
Contributor Author

I renamed getBytes{2,4}() as getUint{16,32}, and removed the patches modifying
calcRenderParams() and EvaluatePreprocessor_read().

W.r.t. the latter, I have a plan to restructure the way data is passed from the
worker to the main thread via fnArray/argsArray. If it works out, it will avoid
all the allocations in and around EvaluatePreprocessor_read(), plus all those
for the small arrays within argsArray.

@brendandahl
Contributor

/botio test

@pdfjsbot

From: Bot.io (Linux)


Received

Command cmd_test from @brendandahl received. Current queue size: 0

Live output at: http://107.21.233.14:8877/320c74ce660ee00/output.txt

@pdfjsbot

From: Bot.io (Windows)


Received

Command cmd_test from @brendandahl received. Current queue size: 1

Live output at: http://107.22.172.223:8877/141c7086fb7f371/output.txt

@pdfjsbot

From: Bot.io (Linux)


Success

Full output at http://107.21.233.14:8877/320c74ce660ee00/output.txt

Total script time: 25.76 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Regression tests: Passed

@pdfjsbot

From: Bot.io (Windows)


Success

Full output at http://107.22.172.223:8877/141c7086fb7f371/output.txt

Total script time: 37.02 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Regression tests: Passed

brendandahl added a commit that referenced this pull request Mar 17, 2014
@brendandahl brendandahl merged commit 1802fff into mozilla:master Mar 17, 2014
@nnethercote nnethercote deleted the object-reduction branch March 21, 2014 03:59