Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor the previous history rewriting logic #6200

Merged
merged 1 commit into from
Sep 27, 2015

Conversation

Rob--W
Copy link
Member

@Rob--W Rob--W commented Jul 12, 2015

When the user edits the URL and changes the reference fragment (hash), PDF.js intercepts this action, and saves the then-current history state in the previous history entry. This is implemented by navigating back, editing the history and navigating forward again.

The current logic has a flaw: It assumes that calling history.back() and history.forward() immediately updates the history state. This is however not guaranteed by the web standards, which states that calling e.g. history.back "must traverse the history by a delta -1", which means that the browser must QUEUE a task to traverse the session history, per spec (http://w3.org/TR/2011/WD-html5-20110113/history.html#dom-history-back and https://html.spec.whatwg.org/multipage/browsers.html#dom-history-back).

Firefox and Internet Explorer deviate from the standards by immediately changing the history state instead of queuing the navigation. WebKit derived browsers (Chrome, Opera, Safari) and Opera presto do not.

The user-visible consequence of strictly adhering to the standards in PDF.js can be shown as follows:

  1. Edit the URL
  2. Append #page=2 for example.
  3. Press Enter.
    -> Presto and WebKit: PDF.js reverts to the previous URL.
    -> Gecko and Trident: PDF.js keeps the new URL, as expected.

To fix the issue, modification of the previous history item happens in a few asynchronous steps, guided by the popstate event to detect when the history navigation request has been committed.

(I have also put the above message in the commit message together with some additional notes about the implementation).

Assigned to @Snuffleupagus for review because you introduced the logic in 766b92c

@Snuffleupagus
Copy link
Collaborator

Unfortunately I'm not able to verify this patch.
I've tested it in Google Chrome 44.0.2403.81 beta-m on Windows, both locally (node make server), and by building/installing the extension (node make chromium), and I get the same (broken) behaviour with and without this patch.

@Rob--W
Copy link
Member Author

Rob--W commented Jul 14, 2015

@Snuffleupagus Oops. It seems that oldURL contains the hash and newURL does not. I'll dig further, thanks for the review.

@Snuffleupagus
Copy link
Collaborator

Actually, if you're willing to live with this issue for just a little while longer, it might actually be fixed "for free" by the PDFHistory rewrite that I'm working on.

Note that the current PDFHistory is an unmaintainable mess that has caused a lot of subtle bugs, which is why I'm rewriting it from scratch. The new implementation is basically complete, I just need time to test it more thoroughly before submitting a PR.

@Rob--W
Copy link
Member Author

Rob--W commented Jul 14, 2015

I think that this bug can be attributed to a browser bug. Here's a reduced steps-to-reproduce:

  1. Visit https://example.com
  2. Edit the URL and append #foo
  3. Open the console and run history.back();history.forward();
    -> Expected result: Chrome should be at #foo.
    -> Actual result: The history.forward call is ignored, Chrome is at / (without hash, and the back/forward button show that there's clearly an entry in the history).

(sanity check:)

  1. Navigate forwards in the history.
  2. Type history.back in the console.
    -> Result: Chrome navigates back to the URL without #.
  3. Type history.forward()
    -> Result: Chrome navigates forward to #foo

(exchanging forward and back produces similar results. It looks like a bug in Chrome's history.back/forward implementation).

EDIT: Queing the .forward call in the event loop appears to work around the issues (i.e. wrapping the second history call in setTimeout).

@Rob--W
Copy link
Member Author

Rob--W commented Jul 14, 2015

@timvandermeij
Copy link
Contributor

Where does this leave the PR? If this is in fact a browser bug, should we attempt to work around it here or should we just file a tracking issue for the WebKit bug?

@Rob--W
Copy link
Member Author

Rob--W commented Jul 14, 2015

@timvandermeij I've read the spec, and it doesn't explicitly state that the history has to immediately be updated after calling history. Actually, in the way that it's worded, it seems that it's perfectly fine that history.back and history.forward do not immediately update the history.

Because of this, and because this issue affects all WebKit-derived browsers, I'll edit this PR with a work-around to this issue. I'll ping you both again when the PR has been updated.

Here are the relevant portions of the spec:

https://html.spec.whatwg.org/multipage/browsers.html#dom-history-back
When the back() method is invoked, the user agent must traverse the history by a delta −1.

and

https://html.spec.whatwg.org/multipage/browsers.html#traverse-the-history-by-a-delta
To traverse the history by a delta delta, the user agent must append a task to this top-level browsing context's session history traversal queue, the task consisting of running the following steps: (...)

(queuing a task in the event loop means that the task does not have to be executed immediately. Think of queuing as wrapping setTimeout around the call)

@timvandermeij
Copy link
Contributor

Sounds good. Thank you for looking into this. I'm marking this as work in progress for now until it is updated.

@Rob--W Rob--W force-pushed the hashchange-use-newURL branch from fcb7f61 to 6da37b8 Compare July 14, 2015 22:38
@Rob--W Rob--W changed the title Use hashchange event's newURL instead of location Refactor the previous history rewriting logic Jul 14, 2015
@Rob--W
Copy link
Member Author

Rob--W commented Jul 14, 2015

@timvandermeij @Snuffleupagus Please take a look at the updated PR.

@Snuffleupagus
Copy link
Collaborator

@Rob--W I'm sorry for the delay in getting back to you, but I've been quite busy recently.

Re: #6272 (comment)
After quickly testing this PR again, I'm not able to find the issues that I thought I'd seen before. So most likely I'm simply remembering wrong, since it's well over a month since I tested the last version of this PR.


However, I do feel that this PR makes the code more complicated. And since it'd like to get rid of this special case anyway (in PR #6272), I'm wondering if we couldn't just remove these lines instead: https://github.com/mozilla/pdf.js/blob/master/web/pdf_history.js#L91-L104.
That would be a lot simpler to test, and we'd basically be guaranteed that this PR won't cause any issues.

@Rob--W
Copy link
Member Author

Rob--W commented Sep 23, 2015

@Snuffleupagus

@Rob--W I'm sorry for the delay in getting back to you, but I've been quite busy recently.

No problem!

However, I do feel that this PR makes the code more complicated.

In my opinion, the existing code was complicated (it took some efforts to follow the existing logic), and the refactoring into smaller, focused and documented functions enhance the readability / maintainability of this logic.

And since it'd like to get rid of this special case anyway (in PR #6272), I'm wondering if we couldn't just remove these lines instead: https://github.com/mozilla/pdf.js/blob/master/web/pdf_history.js#L91-L104.

Possibly yes, but that still needs some review and testing, which may take a while.
But for this PR, I'm certain that it fixes an annoying issue (without introducing regressions).

I think that a possible concern with merging this patch is that your work at #6272 cannot immediately be merged, but that can simply be solved by adding a git revert at the start of your PR.

All together, I think that merging this patch has more advantages than disadvantages (to be precise: no advantages for Firefox, known advantages for WebKit/Blink-based browsers, no known disadvantages). If this patch doesn't get merged, I'll include the patch on top of the chrome extension branch (to fix the bug for 500K users). That act would not resolve the bug for users who put PDF.js on their website though.

if (!self.historyUnlocked) {
return;
}
if (evt.state) {
if (self._isStateObjectDefined(evt.state)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please revert this change, since there's an edge-case where it can can have undesired effects.
In the GENERIC viewer (which support opening multiple PDF files), evt.state may refer to a previously opened PDF file. In that case, nothing should be done; note how _goTo returns early for this case (given self._isStateObjectDefined(evt.state) == false).

If this is changed, we could incorrectly reach the code below, which should only happen when the user modifies the hash.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, done!

@Snuffleupagus
Copy link
Collaborator

In my opinion, the existing code was complicated (it took some efforts to follow the existing logic), and the refactoring into smaller, focused and documented functions enhance the readability / maintainability of this logic.

Considering that this code is only relevant for in an edge case, I won't fight you on this; let's just agree to disagree :-)
If you address my review comment above, which prevents issues in an edge case, I'll land this PR!

I think that a possible concern with merging this patch is that your work at #6272 cannot immediately be merged, but that can simply be solved by adding a git revert at the start of your PR.

I'm quite used to having to rebase around various merge conflicts, so I did not in any way consider that to be a blocker!

When the user edits the URL and changes the reference fragment (hash),
PDF.js intercepts this action, and saves the then-current history state
in the previous history entry. This is implemented by navigating back,
editing the history and navigating forward again.

The current logic has a flaw: It assumes that calling history.back() and
history.forward() immediately updates the history state. This is however
not guaranteed by the web standards, which states that calling e.g.
history.back "must traverse the history by a delta -1", which means that
the browser must QUEUE a task to traverse the session history, per spec:
http://w3.org/TR/2011/WD-html5-20110113/history.html#dom-history-back
https://html.spec.whatwg.org/multipage/browsers.html#dom-history-back

Firefox and Internet Explorer deviate from the standards by immediately
changing the history state instead of queuing the navigation.
WebKit derived browsers (Chrome, Opera, Safari) and Opera presto do not.

The user-visible consequence of strictly adhering to the standards in
PDF.js can be shown as follows:

1. Edit the URL.
2. Append #page=2 for example.
3. Press Enter.
   -> Presto and WebKit: PDF.js reverts to the previous URL.
   -> Gecko and Trident: PDF.js keeps the new URL, as expected.

To fix the issue, modification of the previous history item happens in
a few asynchronous steps, guided by the popstate event to detect when
the history navigation request has been committed.

--
Some more implementation notes:

I have removed the preventDefault and stopPropagation calls, because
popstate is not cancelable, and window is already the last target of the
event propagation.

The previous allowHashChange logic was hard to follow, because it did
not explain that hashchange will be called twice; once during the
popstate handler for history.back() (which will reset allowHashChange),
and again for history.forward() (where allowHashChange will be false).
The purpose of allowHashChange is now more explicit, by incorporating
the logic in the replacePreviousHistoryState helper function.
@Rob--W Rob--W force-pushed the hashchange-use-newURL branch from 6da37b8 to cdea75d Compare September 26, 2015 21:17
Snuffleupagus added a commit that referenced this pull request Sep 27, 2015
Refactor the previous history rewriting logic
@Snuffleupagus Snuffleupagus merged commit 7a8b0fb into mozilla:master Sep 27, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants