-
Notifications
You must be signed in to change notification settings - Fork 10.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove PDFPageView.updatePosition
since it's not actually necessary
#8531
Conversation
This method is currently called from `PDFViewer._scrollUpdate` on *every* scroll event in the viewer. However, I cannot see why this code is now necessary (assuming that it once was), since text-selection and searching still works *exactly* the same way with this patch as with the current `master`. When `PDFPageView.updatePosition` is called, the page can be in either of these states: 1. The page hasn't been rendered, in which case the `textLayer` property doesn't exist yet. 2. The page is currently rendering, meaning that the `textLayer` property exists. Given that the `textContent` won't be fetched until the page has been successfully rendered, `TextLayerBuilder.render` will return immediately and effectively be a no-op (since there's nothing to render yet). 3. The has been been rendered, and the `textLayer` is currently rendering. 4. The page, and its `textLayer`, has been completely rendered. In this case, `TextLayerBuilder.render` will return immediately and effectively be a no-op. Here, only the *third* case seem to require any further analysis: When scrolling occurs while the `textLayer` is rendering, `TextLayerBuilder.render` will via a helper method call `TextLayerRenderTask.cancel` (in src/display/text_layer.js) to stop processing. However, due to the run-to-completion nature of JavaScript, once `TextLayerRenderTask._render` has been invoked `appendText` will always run.[1] So even though we cancel rendering of pending `textLayer`s during scrolling, via the repeated `TextLayerBuilder.render` calls from within the `PDFPageView.updatePosition` method, that does *not* prevent us from running the code inside of `TextLayerRenderTask._render` over and over for the *same* page; which all seems *very* inefficient to me.[2] All this will thus have the effect of delaying the *actual* rendering of a `textLayer` ever so slightly while scrolling in the viewer. However, it does so at the expense of potentially hundreds of unnecessary `appendText` calls.[3] Hence it seems to me that it's less resource intensive overall to simply let rendering of the `textLayer` complete, once it has started. Obviously, we still abort all rendering of a page, and its `textLayer`, when it's being destroyed (e.g. by being evicted from the page cache). In case that there's any worry that the patch could affect e.g. highlighting of search results, please note that the existing code in `TextLayerBuilder.render` already calls `updateMatches` when the `TextLayerTask` resolves successfully. *I'm sorry that this became quite long, but to try and summarize:* `PDFPageView.updatePosition` doesn't actually do anything in *most* cases. In the one case where it matters, it seems that it's actually doing more harm than good; which is why I'm proposing that we just remove it. --- [1] Although we may be able to skip the `render` call, provided that it happens *after* a `timeout` (as is the case in the default viewer). [2] With current work being done to support streaming of `TextContent`, we'd also need to add just as many duplicate API calls to `PDFPageView.updatePosition`. [3] The number of duplicate `appendText` calls is directly proportional not only to the scroll speed, but also to the number of pages in the document.
/botio-windows preview |
From: Bot.io (Windows)ReceivedCommand cmd_preview from @yurydelendik received. Current queue size: 0 Live output at: http://54.215.176.217:8877/4ae586d55906d3b/output.txt |
From: Bot.io (Windows)SuccessFull output at http://54.215.176.217:8877/4ae586d55906d3b/output.txt Total script time: 19.48 mins Published |
I was looking to the old code (fcdf266) and I think this is the right move now. It was a problem back then due to non-optimized text layer building and lots of divs. If we will feel we regress div heavy text layers, we will add "reset timeout" functionality to match the behavior of the original 5-year old commit. |
Thank you for the patch. |
Remove `PDFPageView.updatePosition` since it's not actually necessary
This method is currently called from
PDFViewer._scrollUpdate
on every scroll event in the viewer.However, I cannot see why this code is now necessary (assuming that it once was), since text-selection and searching still works exactly the same way with this patch as with the current
master
.When
PDFPageView.updatePosition
is called, the page can be in either of these states:textLayer
property doesn't exist yet.textLayer
property exists. Given that thetextContent
won't be fetched until the page has been successfully rendered,TextLayerBuilder.render
will return immediately and effectively be a no-op (since there's nothing to render yet).textLayer
is currently rendering.textLayer
, has been completely rendered. In this case,TextLayerBuilder.render
will return immediately and effectively be a no-op.Here, only the third case seem to require any further analysis:
When scrolling occurs while the
textLayer
is rendering,TextLayerBuilder.render
will via a helper method callTextLayerRenderTask.cancel
(in src/display/text_layer.js) to stop processing.However, due to the run-to-completion nature of JavaScript, once
TextLayerRenderTask._render
has been invokedappendText
will always run.[1]So even though we cancel rendering of pending
textLayer
s during scrolling, via the repeatedTextLayerBuilder.render
calls from within thePDFPageView.updatePosition
method, that does not prevent us from running the code inside ofTextLayerRenderTask._render
over and over for the same page; which all seems very inefficient to me.[2]All this will thus have the effect of delaying the actual rendering of a
textLayer
ever so slightly while scrolling in the viewer. However, it does so at the expense of potentially hundreds of unnecessaryappendText
calls.[3]Hence it seems to me that it's less resource intensive overall to simply let rendering of the
textLayer
complete, once it has started. Obviously, we still abort all rendering of a page, and itstextLayer
, when it's being destroyed (e.g. by being evicted from the page cache).In case that there's any worry that the patch could affect e.g. highlighting of search results, please note that the existing code in
TextLayerBuilder.render
already callsupdateMatches
when theTextLayerTask
resolves successfully.I'm sorry that this became quite long, but to try and summarize:
PDFPageView.updatePosition
doesn't actually do anything in most cases. In the one case where it matters, it seems that it's actually doing more harm than good; which is why I'm proposing that we just remove it.[1] Although we may be able to skip the
render
call, provided that it happens after atimeout
(as is the case in the default viewer).[2] With current work being done to support streaming of
TextContent
, we'd also need to add just as many duplicate API calls toPDFPageView.updatePosition
.[3] The number of duplicate
appendText
calls is directly proportional not only to the scroll speed, but also to the number of pages in the document.