-
Notifications
You must be signed in to change notification settings - Fork 10.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[EnhanceTextSelection] Make expandTextDivs
more efficient by updating all styles at once instead of piecewise
#7632
[EnhanceTextSelection] Make expandTextDivs
more efficient by updating all styles at once instead of piecewise
#7632
Conversation
/botio test |
From: Bot.io (Linux)ReceivedCommand cmd_test from @Snuffleupagus received. Current queue size: 0 Live output at: http://107.21.233.14:8877/ef403cd52fad547/output.txt |
From: Bot.io (Windows)ReceivedCommand cmd_test from @Snuffleupagus received. Current queue size: 0 Live output at: http://107.22.172.223:8877/e96954c98dd3cb6/output.txt |
From: Bot.io (Windows)SuccessFull output at http://107.22.172.223:8877/e96954c98dd3cb6/output.txt Total script time: 24.43 mins
|
From: Bot.io (Linux)FailedFull output at http://107.21.233.14:8877/ef403cd52fad547/output.txt Total script time: 39.30 mins
Image differences available at: http://107.21.233.14:8877/ef403cd52fad547/reftest-analyzer.html#web=eq.log |
textDiv.style.fontFamily = style.fontFamily; | ||
textDivProperties.style = 'left: ' + left + 'px; top: ' + top + | ||
'px; font-size: ' + fontHeight + | ||
'px; font-family: ' + style.fontFamily + ';'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, we might actually want to do something similar to PR #5033 for this string!?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I'm fine with that, as long as we keep a similar comment above it (since otherwise it may look like magic that may be removed later on).
Maybe you could check how often this method is called for the one-glyph-per-div case to determine whether we actually need this optimization or not? If it's called over a thousand times, it should be useful, while otherwise it may be unnecessary. I'm fine with both ways, so let's pick the one you think is most useful here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For a good case, such as the tracemonkey paper, there seem to be around 150 divs per page. So even for well behaved documents, it might make sense to avoid these intermediate strings when considering an entire document.
For a really bad case, such as the PDF from issue 7224, there's around 2000 divs per page, and in this case it probably makes a lot of sense to avoid intermediate strings.
I'll update the code, and add a comment about why we're doing this.
…ng all styles at once instead of piecewise I intended to provide proper benchmarking results here, as outlined in https://github.com/mozilla/pdf.js/wiki/Benchmarking-your-changes, but after wasting a couple of hours over the weekend getting weird results I gave up. It appears that there's a lot of, i.e. way too much, variance between subsequent runs of `text` tests for the results to be meaningful. (Previously I've only benchmarked `eq` tests, so I don't know if the `text` tests has never worked well or if it's a newer problem. For reference, please see the results of back-to-back benchmark runs on the current `master` with a *very* simple manifest file: [link here].) Instead I used `console.time/timeEnd` in `appendText` and `expandTextDivs` to be able to compare the performance with/without the patch. The entire viewer was (skip-cache) reloaded between measurements, and the result are available here: [link here]. Given the troubles I've had with benchmarking, I've not yet computed any statistics on the results (e.g. mean, variance, confidence intervals, and so on). However, just by looking at the data I think it's safe to say that this patch first of all doesn't seem to regress the current performance. Secondly it certainly looks *very* likely that this patch actually improves the performance, especially for the one-glyph-per-text-div case (cf. issue 7224). Re: issue 7584.
Comments have been addressed. /botio test |
From: Bot.io (Linux)ReceivedCommand cmd_test from @Snuffleupagus received. Current queue size: 0 Live output at: http://107.21.233.14:8877/95a4df768e50969/output.txt |
From: Bot.io (Windows)ReceivedCommand cmd_test from @Snuffleupagus received. Current queue size: 0 Live output at: http://107.22.172.223:8877/c56b1885962fbb2/output.txt |
From: Bot.io (Windows)SuccessFull output at http://107.22.172.223:8877/c56b1885962fbb2/output.txt Total script time: 24.59 mins
|
From: Bot.io (Linux)FailedFull output at http://107.21.233.14:8877/95a4df768e50969/output.txt Total script time: 39.52 mins
Image differences available at: http://107.21.233.14:8877/95a4df768e50969/reftest-analyzer.html#web=eq.log |
/botio-linux preview |
From: Bot.io (Linux)ReceivedCommand cmd_preview from @Snuffleupagus received. Current queue size: 0 Live output at: http://107.21.233.14:8877/2e4660abe7a0f40/output.txt |
From: Bot.io (Linux)SuccessFull output at http://107.21.233.14:8877/2e4660abe7a0f40/output.txt Total script time: 1.32 mins Published |
/botio makeref |
From: Bot.io (Windows)ReceivedCommand cmd_makeref from @timvandermeij received. Current queue size: 0 Live output at: http://107.22.172.223:8877/8ec5f6a456954e4/output.txt |
From: Bot.io (Linux)ReceivedCommand cmd_makeref from @timvandermeij received. Current queue size: 0 Live output at: http://107.21.233.14:8877/533677ffa739ff2/output.txt |
From: Bot.io (Windows)SuccessFull output at http://107.22.172.223:8877/8ec5f6a456954e4/output.txt Total script time: 24.27 mins
|
From: Bot.io (Linux)SuccessFull output at http://107.21.233.14:8877/533677ffa739ff2/output.txt Total script time: 38.15 mins
|
Awesome work! |
…andTextDivs [EnhanceTextSelection] Make `expandTextDivs` more efficient by updating all styles at once instead of piecewise
I intended to provide proper benchmarking results here, as outlined in https://github.com/mozilla/pdf.js/wiki/Benchmarking-your-changes, but after wasting a couple of hours over the weekend getting weird results I gave up.
It appears that there's a lot of, i.e. way too much, variance between subsequent runs of
text
tests for the results to be meaningful.(Previously I've only benchmarked
eq
tests, so I don't know if thetext
tests has never worked well or if it's a newer problem. For reference, please see the results of back-to-back benchmark runs on the currentmaster
with a very simple manifest file: https://gist.github.com/Snuffleupagus/eec3acc22f519eddce6404be883b6960.)Instead I used
console.time/timeEnd
inappendText
andexpandTextDivs
to be able to compare the performance with/without the patch. The entire viewer was (skip-cache) reloaded between measurements, and the result are available here: https://gist.github.com/Snuffleupagus/33fc971b653e6524bc889216ff95499c.Given the troubles I've had with benchmarking, I've not yet computed any statistics on the results (e.g. mean, variance, confidence intervals, and so on).
However, just by looking at the data I think it's safe to say that this patch first of all doesn't seem to regress the current performance. Secondly it certainly looks very likely that this patch actually improves the performance, especially for the one-glyph-per-text-div case (cf. issue 7224).
Re: issue #7584.
/cc @timvandermeij
This change is