Allow text to be selectable/findable #10
Comments
This is definitely something we want to do. Chris Jones discusses some directions for it at the end of his blog post http://blog.mozilla.com/cjones/2011/06/15/overview-of-pdf-js-guts/ For the moment we're still learning about PDF and looking at what's missing on the browser side and what existing technologies (such as SVG) can do about it. Nothing has been decided about the right way to implement the selection feature, and we are open to suggestions, and even more open to patches! :) Also, words inside a PDF are stored as chunks or individual letters, so in order to implement a search/selection feature one needs to figure out an algorithm to rebuild the strings and determine which chunks belong together. On my side I'm busy working on extracting the document's fonts in order to render Type1 fonts via @font-face (not natively supported by the browser) and on rewriting on the fly the badly formed TrueType fonts embedded inside PDF documents (in order to pass the browser's font sanitizer...), but I would be more than happy to provide directions to implement something or to discuss a solution. |
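To make the "rebuild the strings" problem concrete, here is a minimal sketch of one possible approach, not anything pdf.js actually does. It assumes a hypothetical chunk shape `{ str, x, y, width }` with `y` growing downward, and the `yTolerance` and `gapThreshold` parameters are made-up tuning knobs. Chunks whose vertical positions are close are grouped into one line, and a space is inserted wherever the horizontal gap between neighbouring chunks is large enough.

```javascript
// Hypothetical chunk shape: { str, x, y, width }. This is an illustration,
// not the data model used by pdf.js.
function rebuildLines(chunks, yTolerance = 2, gapThreshold = 1.5) {
  // Sort top-to-bottom first, then left-to-right (assumes y grows downward).
  const sorted = [...chunks].sort((a, b) => a.y - b.y || a.x - b.x);

  // Group chunks whose y is within yTolerance of the line's first chunk.
  const lines = [];
  for (const chunk of sorted) {
    const last = lines[lines.length - 1];
    if (last && Math.abs(last.y - chunk.y) <= yTolerance) {
      last.chunks.push(chunk);
    } else {
      lines.push({ y: chunk.y, chunks: [chunk] });
    }
  }

  // Within each line, order by x and join, adding a space on wide gaps.
  return lines.map((line) => {
    line.chunks.sort((a, b) => a.x - b.x);
    let text = "";
    let prevEnd = null;
    for (const c of line.chunks) {
      if (prevEnd !== null && c.x - prevEnd > gapThreshold) {
        text += " ";
      }
      text += c.str;
      prevEnd = c.x + c.width;
    }
    return text;
  });
}

const exampleLines = rebuildLines([
  { str: "lo", x: 20, y: 10, width: 10 },
  { str: "Hel", x: 0, y: 10, width: 20 },
  { str: "world", x: 35, y: 10.5, width: 30 },
  { str: "Next", x: 0, y: 30, width: 25 },
]);
console.log(exampleLines);
```

A real document would also need to handle rotated text, multiple columns and right-to-left scripts, none of which this sketch attempts.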
Basically, we have two options. Since (1) is less work for us, we're targeting that first. We'll have to see whether that works well enough for us to drop (2). There are probably many other ways to approach this problem. |
Selectable text prototype: https://github.com/notmasteryet/pdf.js/tree/text-1 via div and no-color text. It uses mozCurrentTransform, so it will work only with Beta, Aurora and Nightly. Something to play with... |
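For readers wondering why the current transform matters here: to place a div over text drawn on the canvas, the text origin has to be mapped from user space to canvas pixels. The following is only a sketch under the assumption that the transform is available as a six-element array `[a, b, c, d, e, f]` in the usual canvas 2D convention; the function name `applyTransform` is made up for illustration and is not taken from the prototype.

```javascript
// Map a point through a 2D affine transform given as [a, b, c, d, e, f],
// i.e. x' = a*x + c*y + e and y' = b*x + d*y + f (canvas 2D convention).
function applyTransform(m, point) {
  const [a, b, c, d, e, f] = m;
  const [x, y] = point;
  return [a * x + c * y + e, b * x + d * y + f];
}

// Example: scale by 2, then translate by (10, 20).
const mapped = applyTransform([2, 0, 0, 2, 10, 20], [5, 5]);
console.log(mapped);
```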
(SVG prototype at https://github.com/andreasgal/pdf.js/issues/229#issuecomment-1651322) |
Added to Milestone. Who wants to get self-assigned to this issue? |
Text selection has been implemented. There's another open issue for text search (see #819). Closing, please reopen if we missed something. |
First round of instructions generated from our artificial canvas context
…pageNumbers PR 7341 added special handling for `nameddest`s that look like pageNumbers, to prevent issues since we previously *incorrectly* supported specifying a pageNumber directly in the hash; i.e. `#10` versus the correct `#page=10` format. Since this behaviour wasn't correct, PR 7757 fixed and deprecated the old format, which means that we no longer need to maintain the `nameddest` hack in multiple files.
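As a rough illustration of the distinction this commit message draws, the sketch below parses a URL hash and only treats an explicit `page=` parameter as a page number. The function name `parseHash`, the returned object shape, and the choice to treat a bare value as a named destination are all assumptions made for this example, not the viewer's actual code.

```javascript
// Hypothetical helper: classify a URL hash as a page reference or a
// named destination. Only the explicit "page=N" form yields a page number.
function parseHash(hash) {
  const raw = hash.startsWith("#") ? hash.slice(1) : hash;
  if (raw === "") {
    return { type: "none" };
  }
  if (raw.includes("=")) {
    const params = new URLSearchParams(raw);
    if (params.has("page")) {
      const page = Number.parseInt(params.get("page"), 10);
      if (Number.isInteger(page) && page > 0) {
        return { type: "page", page };
      }
    }
    if (params.has("nameddest")) {
      return { type: "nameddest", name: params.get("nameddest") };
    }
    return { type: "unknown" };
  }
  // Assumption: a bare value is a named destination, even if it looks numeric.
  return { type: "nameddest", name: raw };
}

console.log(parseHash("#page=10"));
console.log(parseHash("#10"));
```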
I'm sure this feature has been considered, but this library would be an order of magnitude cooler if the text in the PDF were interactive, that is, if it could be selected or traversed by the browser's find functionality.
I'm sure there are many reasons why the text should be embedded directly into the canvas (e.g. for layering), but could transparent text be layered on top of the canvas to allow it to still be selected? This text can be absolutely positioned and have a color of `rgba(0,0,0,0.0)`. See demo: http://jsfiddle.net/westonruter/UGZWE/
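A minimal sketch of the overlay idea described above, assuming each piece of text is known as a hypothetical `{ str, x, y, fontSize }` object in page coordinates. The helper names `textOverlayStyle` and `toCssText` are invented for this example. The code only computes the inline style for an absolutely positioned, fully transparent element; attaching it to a real element sitting above the canvas is left to the caller.

```javascript
// Compute the inline style for a transparent, absolutely positioned text
// element. The chunk shape { str, x, y, fontSize } is hypothetical.
function textOverlayStyle(chunk, scale = 1) {
  return {
    position: "absolute",
    left: `${chunk.x * scale}px`,
    top: `${chunk.y * scale}px`,
    fontSize: `${chunk.fontSize * scale}px`,
    color: "rgba(0,0,0,0.0)", // invisible, but still selectable
    whiteSpace: "pre",
  };
}

// Serialize a camelCase style object into a CSS declaration string.
function toCssText(style) {
  return Object.entries(style)
    .map(([key, value]) => {
      const prop = key.replace(/[A-Z]/g, (m) => "-" + m.toLowerCase());
      return `${prop}: ${value}`;
    })
    .join("; ");
}

const overlayCss = toCssText(
  textOverlayStyle({ str: "Hello", x: 10, y: 20, fontSize: 12 }, 2)
);
console.log(overlayCss);
```

In a browser, the resulting string could be assigned to `element.style.cssText` on a div whose parent is positioned over the canvas.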