Expose fragment IDs in the URL #5258

Closed
BigBlueHat opened this issue Sep 3, 2014 · 10 comments

Comments

@BigBlueHat

PDF.js uses fragment IDs for navigating within a PDF, but those fragment IDs are not exposed to the URL / browsing history. However, they are usable directly from the location bar and even work with the PDF viewer included in Chrome 36 (with the exclusion of page=123).

Having these in the browsing history would mean that one could easily create links into specific sections of a PDF.

Here are some example URLs pulled from the source code--which I'd much rather have cribbed from the location bar. 😸
PDF section identifier-based fragment ID (afaik):
http://wwwimages.adobe.com/content/dam/Adobe/en/devnet/pdf/pdfs/pdf_reference_1-7.pdf#G13.1696671

PDF.js "proprietary" #page=<number> style URL:
http://wwwimages.adobe.com/content/dam/Adobe/en/devnet/pdf/pdfs/pdf_reference_1-7.pdf#page=607

Thanks for considering this idea! 💡

@Snuffleupagus
Collaborator

As far as internal "links" in PDF files go, you can just right-click them and easily copy the entire URL (the same applies to items in the "Document Outline").

If you want to get the current position in the document, for easy linking, there is a "Current view (...)"-button on the right-hand side of the toolbar.

@BigBlueHat
Author

@Snuffleupagus thanks for the tips! Is there any reason, though, not to expose them in the location bar? It's equivalent to what's happening in the JS, just not exposed in the browser UI.

I'm also only thinking post-link-click exposure--not "as the page scrolls" exposure.

Essentially anything that currently adds to the browsing history should look like it did in the location bar.

Thanks!
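
A minimal sketch of the post-click behaviour requested here, not PDF.js code. The `HistoryLike` interface and the `onInternalLinkClick` handler are hypothetical names introduced for illustration; `HistoryLike` only mirrors the signature of the browser's `history.pushState`. The sketch adds one history entry per link click and does nothing on scroll.

```typescript
// Minimal subset of the browser History API, so the sketch also runs outside a browser.
interface HistoryLike {
  pushState(state: unknown, unused: string, url?: string | null): void;
}

// Hypothetical handler: call it once when an internal PDF "link" is clicked.
// `fragment` is the destination exactly as it should appear after "#".
function onInternalLinkClick(
  history: HistoryLike,
  baseUrl: string,
  fragment: string,
): string {
  // Drop any fragment already present, then append the new one.
  const url = baseUrl.split("#")[0] + "#" + fragment;
  history.pushState({ fragment }, "", url);
  return url;
}
```

In a browser, `window.history` could be passed as the first argument, so the location bar would show the same fragment that the history entry records.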

@Snuffleupagus
Collaborator

> Is there any reason, though, not to expose them in the location bar?

I think that it's connected with the fact that "links" in PDF files don't work the same way as links on e.g. the web. "Links" in PDF files are really Destinations, which in this case can be of two different kinds:

  • Named Destinations, which is the kind you gave as an example above. For those it is probably not too difficult to update the URL hash when such a "link" is clicked. For more on these, see: http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G11.1947713.
  • Explicit Destinations, which follow another format. Those cannot be used in the hash as is, and would thus need to be translated into a suitable format.
    To be able to at least show something in the hash when those "links" are hovered, we currently try to "convert" them to our #page=...&zoom=... format. But the current code for doing the conversion from Explicit Destination to URL hash is quite often not correct.
    That doesn't (currently) really matter too much though, since clicking these types of "links" always uses the Explicit Destination. But if we want to update the hash for those "links", we would need to ensure that there is a one-to-one correspondence between the Destination and what you see in the hash. (This might be difficult in general, I haven't really looked into it yet.)
    For more information about the format, see: http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G11.1696125.

> I'm also only thinking post-link-click exposure--not "as the page scrolls" exposure.

That's good! Because updating the history on every scroll event would probably be really bad for performance.
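
To illustrate why the Explicit Destination to hash conversion mentioned above is hard to make one-to-one, here is a rough sketch. It is not the PDF.js implementation. The `ExplicitDest` type and the `destToHash` function are hypothetical, the page is modelled as a plain 1-based number (real destinations reference a page object), and coordinates are copied unchanged, which assumes both formats share a coordinate system.

```typescript
// Simplified model of the explicit destination kinds defined by the PDF specification.
// Assumption: `page` is a 1-based page number rather than a page object reference.
type ExplicitDest =
  | { page: number; kind: "XYZ"; left: number | null; top: number | null; zoom: number | null }
  | { page: number; kind: "Fit" | "FitB" }
  | { page: number; kind: "FitH" | "FitBH"; top: number | null }
  | { page: number; kind: "FitV" | "FitBV"; left: number | null }
  | { page: number; kind: "FitR"; left: number; bottom: number; right: number; top: number };

// Hypothetical conversion to a "#page=...&zoom=..." style hash.
function destToHash(dest: ExplicitDest): string {
  let hash = `#page=${dest.page}`;
  if (dest.kind === "XYZ" && dest.zoom !== null && dest.zoom > 0) {
    // The destination stores zoom as a factor (1 = 100%); the hash uses a percentage.
    const parts = [String(Math.round(dest.zoom * 100))];
    if (dest.left !== null && dest.top !== null) {
      parts.push(String(dest.left), String(dest.top));
    }
    hash += `&zoom=${parts.join(",")}`;
  }
  // All other kinds fall back to the page alone, so information is lost here.
  return hash;
}
```

In this sketch a `Fit` destination and a `FitH` destination on the same page both collapse to the same `#page=N` hash, which is the kind of mismatch that would have to be resolved before the hash could be updated on click.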

@BigBlueHat
Author

@Snuffleupagus thanks for the detailed reply!

Named Destinations was the one I'd been expecting to update the URL, but it didn't.

However, I did some more digging and found RFC3778 Section 3 which defines Fragment Identifiers for the application/pdf Media Type.

It links to an Adobe technical note, Parameters for Opening PDF Files, which contains a section called Specifying parameters in a URL and a list of available parameters.

It does, however, drive off the map a bit and say that either & or # can be used for delimiters--which would make them invalid (according to RFC3986 anyhow) URLs. But disregarding that, it's a pretty fabulous map for adding these to URLs. 😄

Would be great to see them all added, but for now Named Destinations support alone (which apparently--according to the URL Examples--can go directly after # or after #nameddest=..., which is far uglier) would be fabulous!

Thanks for digging into this issue @Snuffleupagus!
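
As a rough illustration of the fragment forms discussed above (a bare name directly after `#`, `#nameddest=...`, and `#page=...`), here is a small parser sketch. The `PdfFragment` type and `parsePdfFragment` function are hypothetical names, only `&` is treated as a delimiter, and only three parameters are recognised; it is not a full implementation of the Adobe note.

```typescript
// Hypothetical result type covering only the parameters used in this thread.
interface PdfFragment {
  namedDest?: string;
  page?: number;
  zoom?: string; // kept as the raw "scale,left,top" string
}

function parsePdfFragment(hash: string): PdfFragment {
  const raw = hash.charAt(0) === "#" ? hash.slice(1) : hash;
  const result: PdfFragment = {};
  if (raw === "") {
    return result;
  }
  // A fragment without any "=" is treated as a bare named destination.
  if (raw.indexOf("=") === -1) {
    result.namedDest = decodeURIComponent(raw);
    return result;
  }
  for (const pair of raw.split("&")) {
    const eq = pair.indexOf("=");
    if (eq < 0) {
      continue; // skip tokens that are not key=value
    }
    const key = pair.slice(0, eq).toLowerCase();
    const value = decodeURIComponent(pair.slice(eq + 1));
    if (key === "nameddest") {
      result.namedDest = value;
    } else if (key === "page") {
      const n = parseInt(value, 10);
      if (!isNaN(n) && n > 0) {
        result.page = n;
      }
    } else if (key === "zoom") {
      result.zoom = value;
    }
  }
  return result;
}
```

With this sketch, `#G13.1696671` and `#nameddest=G13.1696671` yield the same named destination, and `#page=607` yields page 607. Note that `decodeURIComponent` throws on malformed percent-encoding, which a real viewer would need to handle.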

@Snuffleupagus
Collaborator

Given that an average user most likely doesn't care if a "link" is of the Named Destination or Explicit Destination kind, in my opinion we really need to be consistent here.
If only some "links" updated the hash, we are sure to get issues/bugs filed about that inconsistency.

@BigBlueHat
Author

Agreed. If it updates the navigation history, the URL should change. The fragment identifier should match what's been specified in the PDF docs. Guess I was just after some of this sooner rather than later. 😉 Good call, though.

@andrenarchy

I proposed to place a div for each named destination at the corresponding position in #5892 where the id attribute is set based on the name. Links could then simply point to these divs via the hash part of the URL. What do you think?

@Snuffleupagus
Collaborator

@andrenarchy As explained in #5258 (comment), HTML links are not identical to the destinations used in PDF files.
First of all, the PDF specification supports both "named" and "explicit" destinations (please refer to the links in #5258 (comment)), and both kinds are used in practice.
Secondly, both kinds of destinations can also, apart from specifying a page/position, set the zoom level of the document (both directly and indirectly, again see the links above).
I hope that the above is enough to explain why implementing what you are suggesting is unfortunately not going to be possible.

Finally, in PDF.js we only render the currently visible pages, for performance and memory reasons. If your suggestion was implemented, that would mean that links wouldn't work unless the page had already been rendered.

@andrenarchy

Thx @Snuffleupagus for your explanation! Now I understand why my proposal isn't a good idea.

I'll try to see why links don't work in my viewer-based application. Probably only the scrolling part doesn't work...

@Snuffleupagus
Collaborator

@timvandermeij Let's just close this as a duplicate of #5753, since these issues are similar enough and the (optional) functionality from PR #10423 is the only thing we could/should do here.
