[EnhanceTextSelection] Make `expandTextDivs` more efficient by updating all styles at once instead of piecewise #7632

Snuffleupagus · 2016-09-14T08:35:25Z

I intended to provide proper benchmarking results here, as outlined in https://github.com/mozilla/pdf.js/wiki/Benchmarking-your-changes, but after wasting a couple of hours over the weekend getting weird results I gave up.
It appears that there's far too much variance between subsequent runs of `text` tests for the results to be meaningful.
(Previously I've only benchmarked `eq` tests, so I don't know if the `text` tests have never worked well or if it's a newer problem. For reference, please see the results of back-to-back benchmark runs on the current `master` with a *very* simple manifest file: https://gist.github.com/Snuffleupagus/eec3acc22f519eddce6404be883b6960.)

Instead I used `console.time/timeEnd` in `appendText` and `expandTextDivs` to compare the performance with/without the patch. The entire viewer was (skip-cache) reloaded between measurements, and the results are available here: https://gist.github.com/Snuffleupagus/33fc971b653e6524bc889216ff95499c.
Given the troubles I've had with benchmarking, I've not yet computed any statistics on the results (e.g. mean, variance, confidence intervals, and so on).
However, just by looking at the data I think it's safe to say that this patch, first of all, doesn't seem to regress the current performance. Secondly, it looks *very* likely that this patch actually improves the performance, especially for the one-glyph-per-text-div case (cf. issue 7224).
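
For anyone wanting to put numbers on the raw timings, here is a minimal sketch of the statistics mentioned above (mean, sample variance, and an approximate 95% confidence interval). The helper names `summarize` and `timeRuns` are hypothetical and not part of pdf.js, and the interval uses a normal approximation, which is only a rough guide for small sample counts.

```javascript
// Hypothetical helpers, not part of pdf.js.

// Use a high-resolution clock when available, otherwise fall back to Date.now().
const now = (typeof performance !== 'undefined' && performance.now)
  ? () => performance.now()
  : () => Date.now();

// Run `fn` `runs` times and collect the elapsed time of each run in milliseconds.
function timeRuns(fn, runs) {
  const samples = [];
  for (let i = 0; i < runs; i++) {
    const start = now();
    fn();
    samples.push(now() - start);
  }
  return samples;
}

// Summarize timing samples: mean, unbiased sample variance, approximate 95% CI.
function summarize(samples) {
  const n = samples.length;
  if (n < 2) {
    throw new Error('need at least two samples');
  }
  const mean = samples.reduce((sum, x) => sum + x, 0) / n;
  const variance =
    samples.reduce((sum, x) => sum + (x - mean) * (x - mean), 0) / (n - 1);
  const stdErr = Math.sqrt(variance / n);
  // 1.96 is the normal-approximation quantile; an assumption, not a t-quantile.
  const ci95 = [mean - 1.96 * stdErr, mean + 1.96 * stdErr];
  return { n, mean, variance, ci95 };
}
```

Feeding the with-patch and without-patch timings through `summarize` separately and checking whether the two intervals overlap would give a first, rough indication of whether the difference is more than noise.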

Re: issue #7584.

/cc @timvandermeij

Snuffleupagus · 2016-09-14T08:36:29Z

/botio test

pdfjsbot · 2016-09-14T08:36:30Z

From: Bot.io (Linux)

Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://107.21.233.14:8877/ef403cd52fad547/output.txt

pdfjsbot · 2016-09-14T08:36:30Z

From: Bot.io (Windows)

Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://107.22.172.223:8877/e96954c98dd3cb6/output.txt

pdfjsbot · 2016-09-14T09:00:56Z

From: Bot.io (Windows)

Success

Full output at http://107.22.172.223:8877/e96954c98dd3cb6/output.txt

Total script time: 24.43 mins

Font tests: Passed
Unit tests: Passed
Regression tests: Passed

pdfjsbot · 2016-09-14T09:15:48Z

From: Bot.io (Linux)

Failed

Full output at http://107.21.233.14:8877/ef403cd52fad547/output.txt

Total script time: 39.30 mins

Font tests: Passed
Unit tests: Passed
Regression tests: FAILED

Image differences available at: http://107.21.233.14:8877/ef403cd52fad547/reftest-analyzer.html#web=eq.log

Snuffleupagus · 2016-09-14T11:46:24Z

src/display/text_layer.js

-    textDiv.style.fontFamily = style.fontFamily;
+    textDivProperties.style = 'left: ' + left + 'px; top: ' + top +
+                              'px; font-size: ' + fontHeight +
+                              'px; font-family: ' + style.fontFamily + ';';


So, we might actually want to do something similar to PR #5033 for this string!?

timvandermeij · 2016-09-14

Sure, I'm fine with that, as long as we keep a similar comment above it (since otherwise it may look like magic that may be removed later on).

Maybe you could check how often this method is called for the one-glyph-per-div case to determine whether we actually need this optimization or not? If it's called over a thousand times, it should be useful, while otherwise it may be unnecessary. I'm fine with both ways, so let's pick the one you think is most useful here.

Snuffleupagus · 2016-09-14

For a good case, such as the tracemonkey paper, there seem to be around 150 divs per page. So even for well-behaved documents, it might make sense to avoid these intermediate strings when considering an entire document.

For a really bad case, such as the PDF from issue 7224, there are around 2000 divs per page, and in this case it probably makes a lot of sense to avoid intermediate strings.

I'll update the code, and add a comment about why we're doing this.
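
As an illustration of what "avoiding intermediate strings" could look like for the style string in the diff above, here is a minimal sketch. It assumes a reusable buffer whose constant parts are filled in once and which is joined a single time per div. The names `styleBuf` and `buildStyle` are hypothetical, and this is not necessarily the exact approach taken in PR #5033 or in the final patch.

```javascript
// Hypothetical sketch, not the actual pdf.js code.
// The constant pieces are stored once; the odd indices are placeholders
// overwritten for every div, so building the string needs one join instead
// of a chain of `+` operations that may each create a temporary string.
const styleBuf = ['left: ', 0, 'px; top: ', 0, 'px; font-size: ', 0,
                  'px; font-family: ', '', ';'];

function buildStyle(left, top, fontHeight, fontFamily) {
  styleBuf[1] = left;
  styleBuf[3] = top;
  styleBuf[5] = fontHeight;
  styleBuf[7] = fontFamily;
  return styleBuf.join('');
}
```

In the viewer the result would presumably be applied with something like `textDiv.setAttribute('style', buildStyle(left, top, fontHeight, style.fontFamily))`. Whether the buffer beats plain concatenation depends on the JavaScript engine, so it is worth measuring on the 2000-divs-per-page case rather than assuming.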

Snuffleupagus · 2016-09-15T07:28:54Z

Comments have been addressed.

/botio test

pdfjsbot · 2016-09-15T07:28:54Z

From: Bot.io (Linux)

Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://107.21.233.14:8877/95a4df768e50969/output.txt

pdfjsbot · 2016-09-15T07:28:54Z

From: Bot.io (Windows)

Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://107.22.172.223:8877/c56b1885962fbb2/output.txt

pdfjsbot · 2016-09-15T07:53:30Z

From: Bot.io (Windows)

Success

Full output at http://107.22.172.223:8877/c56b1885962fbb2/output.txt

Total script time: 24.59 mins

Font tests: Passed
Unit tests: Passed
Regression tests: Passed

pdfjsbot · 2016-09-15T08:08:26Z

From: Bot.io (Linux)

Failed

Full output at http://107.21.233.14:8877/95a4df768e50969/output.txt

Total script time: 39.52 mins

Font tests: Passed
Unit tests: Passed
Regression tests: FAILED

Image differences available at: http://107.21.233.14:8877/95a4df768e50969/reftest-analyzer.html#web=eq.log

Snuffleupagus · 2016-09-15T10:05:23Z

/botio-linux preview

pdfjsbot · 2016-09-15T10:05:23Z

From: Bot.io (Linux)

Received

Command cmd_preview from @Snuffleupagus received. Current queue size: 0

Live output at: http://107.21.233.14:8877/2e4660abe7a0f40/output.txt

pdfjsbot · 2016-09-15T10:06:42Z

From: Bot.io (Linux)

Success

Full output at http://107.21.233.14:8877/2e4660abe7a0f40/output.txt

Total script time: 1.32 mins

Published

timvandermeij · 2016-09-15T13:22:41Z

/botio makeref

pdfjsbot · 2016-09-15T13:22:41Z

From: Bot.io (Windows)

Received

Command cmd_makeref from @timvandermeij received. Current queue size: 0

Live output at: http://107.22.172.223:8877/8ec5f6a456954e4/output.txt

pdfjsbot · 2016-09-15T13:22:41Z

From: Bot.io (Linux)

Received

Command cmd_makeref from @timvandermeij received. Current queue size: 0

Live output at: http://107.21.233.14:8877/533677ffa739ff2/output.txt

pdfjsbot · 2016-09-15T13:46:57Z

From: Bot.io (Windows)

Success

Full output at http://107.22.172.223:8877/8ec5f6a456954e4/output.txt

Total script time: 24.27 mins

Lint: Passed
Make references: Passed
Check references: Passed

pdfjsbot · 2016-09-15T14:00:50Z

From: Bot.io (Linux)

Success

Full output at http://107.21.233.14:8877/533677ffa739ff2/output.txt

Total script time: 38.15 mins

Lint: Passed
Make references: Passed
Check references: Passed

timvandermeij · 2016-09-15T14:01:34Z

Awesome work!

Snuffleupagus added performance text-selection labels Sep 14, 2016

Snuffleupagus reviewed Sep 14, 2016

timvandermeij merged commit 26da2d5 into mozilla:master Sep 15, 2016

Snuffleupagus deleted the more-efficient-expandTextDivs branch September 15, 2016 16:23

Snuffleupagus mentioned this pull request Feb 13, 2017

text layer style micro-optimization resulted in requiring style-src 'unsafe-inline' in Content Security Policy #8066

Closed

movsb pushed a commit to movsb/pdf.js that referenced this pull request Jul 14, 2018

Merge pull request mozilla#7632 from Snuffleupagus/more-efficient-exp…

5870ff0
