Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Re-factor PDFDocumentProperties such that it's properly reset when a new PDF file is opened (issue 8371) #8381

Merged
merged 1 commit into from
May 7, 2017
Merged

Conversation

Snuffleupagus
Copy link
Collaborator

@Snuffleupagus Snuffleupagus commented May 5, 2017

This patch contains the following improvements:

  • Only fetch the various document properties once per PDF file opened, and cache the result (in a frozen object).
  • Always update the entire dialog at once, to prevent inconsistent UI state (issue 8371).
  • Ensure that the dialog, and all its internal properties, are reset when PDFViewerApplication.close is called.
  • Inline, and re-factor, the getProperties method in open, since that's the only call-site.
  • Always overwrite the fileSize with the value obtained from pdfDocument.getDownloadInfo, to ensure that it's correct.
  • ES6-ify the code that's touched in this patch.

Fixes #8371.

@Snuffleupagus Snuffleupagus changed the title Re-factor PDFDocumentProperties such it's properly reset when a new PDF file is opened (issue 8371) Re-factor PDFDocumentProperties such that it's properly reset when a new PDF file is opened (issue 8371) May 5, 2017
@timvandermeij
Copy link
Contributor

/botio-linux preview

Copy link
Contributor

@timvandermeij timvandermeij left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general this patch looks good, but I've got some minor questions.

OverlayManager.register(this.overlayName, this.container,
this.close.bind(this));
}

/**
* Open the document properties overlay.
*
* Note that since the document properties are fetched asynchronously,
* the dialog will *only* be updated if it's still open when the data arrives.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit confused here, so maybe this needs to be rephrased a bit. If the dialog is closed (not visible), then it will still be updated in the background for the next time you open it. Open https://mozilla.github.io/pdf.js/web/viewer.html, wait two seconds and open the document properties and you'll immediately see the data. Or is that not what you meant here?

Copy link
Collaborator Author

@Snuffleupagus Snuffleupagus May 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the dialog is closed (not visible), then it will still be updated in the background for the next time you open it.

That should be the opposite of what the code does, both currently and with this patch.
Since we always re-fetch the data, and update the dialog every time it's opened, there's no point to updating anything if the dialog has been closed by the time e.g. this.pdfDocument.getMetadata() is resolved. (Keep in mind that all the operations involved here are asynchronous.)

So basically, I was trying to provide a general comment for these checks:

if (OverlayManager.active !== this.overlayName) {
  return;
}

Open https://mozilla.github.io/pdf.js/web/viewer.html, wait two seconds and open the document properties and you'll immediately see the data.

Noting gets loaded until the dialog is successfully opened, so what you describe as "immediately" just seem instantaneous since two seconds after loading the tracemonkey.pdf file, there's probably not many heavy operations still occurring. So the asynchronous fetching of the various document properties thus becomes very fast.

Or is that not what you meant here?

No, that's not really what I meant :-)
So any suggestions on making this easier to understand would be helpful!
(I could also move, or even just remove, the comment if it doesn't help make sense of the code.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! I understand what you mean now. The comment itself looks clear after reading it again with this in mind, but I would indeed suggest to move it above the first OverlayManager check as the comment describes why we need such a check. Other than moving it, the wording seems fine to me.

if (OverlayManager.active !== this.overlayName) {
return;
}
let content = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a more general question and not completely related to this patch per se, but what is the advantage of using let here over regular var? As I understood it, let is useful for block scoping, but is that necessary here?

Copy link
Collaborator Author

@Snuffleupagus Snuffleupagus May 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I understood it, let is useful for block scoping, but is that necessary here?

It's not technically necessary here, but you could also flip the question on its head and ask: "Why would you (generally) use var when let is now is a thing?" :-)
Basically, using let can actually help avoid some nasty bugs[1] related to e.g. hoisting of var.[2]

There's a lot of useful information in this MDN article: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/let.


[1] See e.g. #8355 (comment), which shows how var can easily do something other than what you'd expect/intent.
[2] There's obviously cases where you want var specifically, but in most cases it doesn't matter, and then I think I'd favour let in new code. (The general Firefox frontend code-base also seem to be trending in this general direction, unless I'm not mistaken.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. That article is indeed really helpful and I see why let can be preferred over var now, even though it's not strictly necessary in this case. All is fine here then, thanks!

@@ -87,71 +113,71 @@ class PDFDocumentProperties {
* @param {Object} pdfDocument - A reference to the PDF document.
* @param {string} url - The URL of the document.
*/
setDocumentAndUrl(pdfDocument, url) {
setDocument(pdfDocument, url) {
if (this.pdfDocument) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is no real need to check for this here since this._reset() does the check too.

Moreover, shouldn't this.pdfDocument be private as well, i.e., this._pdfDocument, as it may only be set using this setter (like this._url)?

Copy link
Collaborator Author

@Snuffleupagus Snuffleupagus May 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is no real need to check for this here since this._reset() does the check too.

There's a very good reason, and it's that usually setFileSize is called before setDocument. If we'd just unconditionally call reset here, then that would complete remove the point of setFileSize.
Maybe some sort of comment might be in order here!?

Edit: Also, note how in other files, e.g. PDFViewer.setDocument, we check this.pdfDocument. So I suppose that if we should get rid of anything here, that ought to be the check in _reset() instead!?
The reason for the additional check in _reset(), was to avoid updating the UI when initializing the class (don't know if this is just a needless micro-optimization).

Moreover, shouldn't this.pdfDocument be private as well, i.e., this._pdfDocument, as it may only be set using this setter (like this._url)?

Hmm, I'm guessing that I probably didn't change that since this.pdfDocument is used in a lot of web/ files. Actually, I'm wondering if I should simply revert the naming in this patch to its prior state, since I'm not sure if marking these as "private" properties really adds anything?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The checks are fine then, but a small comment could indeed be helpful here to avoid any confusion.

I'm fine with reverting the naming changes. It will make the diff smaller and for now it does not add much. We can change all of them at a later time if we want, but then we should indeed do all of them instead of just a few.

if (this.pdfDocument) {
// Reset the dialog when a new PDF file is opened (issue 8371).
for (let id in this.fields) {
this._updateField(this.fields[id]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect this to be this._clearField(this.fields[id]) or this._updateField(this.fields[id], DEFAULT_FIELD_CONTENT) instead because using the same method for both updating and clearing seems a bit counterintuitive given the name and the signature of the method. In this case, it's not entirely clear to me from reading this line that it's performing a reset.

Moreover, shouldn't this.fields be private as well?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this._updateField(this.fields[id], DEFAULT_FIELD_CONTENT)

That'd be completely redundant, since _updateField will always fallback to DEFAULT_FIELD_CONTENT when no (valid) content is provided (there's a comment to that effect in the method).
Having two different methods for this just seem unnecessary to me; would a better comment help here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, but to me it just looked strange to update a field without providing a value. The fallback is implicit if you just read that line without going to the actual updateField method. Adding a comment above this line saying that the fallback value is used will indeed help to avoid confusion when reading this file.

@Snuffleupagus
Copy link
Collaborator Author

Snuffleupagus commented May 6, 2017

@timvandermeij I've just pushed a new version of the patch, which re-works quite a bit more of this code.

Please take a look at the updated #8381 (comment), where the first two points are a significant change from the first version of the patch (and the current master as well).

I've tried to take your questions/comments into account when writing, and commenting, the new code.
Unfortunately this largely invalidates your previous review, for which I apologise, but I hope that this new approach makes more sense!

@timvandermeij
Copy link
Contributor

This new version looks really good and appears to resolve my comments. I'll do a full review later today.

@timvandermeij
Copy link
Contributor

/botio-linux preview

@Snuffleupagus Snuffleupagus changed the title Re-factor PDFDocumentProperties such that it's properly reset when a new PDF file is opened (issue 8371) [WIP] Re-factor PDFDocumentProperties such that it's properly reset when a new PDF file is opened (issue 8371) May 6, 2017
@Snuffleupagus Snuffleupagus changed the title [WIP] Re-factor PDFDocumentProperties such that it's properly reset when a new PDF file is opened (issue 8371) Re-factor PDFDocumentProperties such that it's properly reset when a new PDF file is opened (issue 8371) May 7, 2017
…a new PDF file is opened (issue 8371)

This patch contains the following improvements:
 - Only fetch the various document properties *once* per PDF file opened, and cache the result (in a frozen object).
 - Always update the *entire* dialog at once, to prevent inconsistent UI state (issue 8371).
 - Ensure that the dialog, and all its internal properties, are reset when `PDFViewerApplication.close` is called.
 - Inline, and re-factor, the `getProperties` method in `open`, since that's the only call-site.
 - Always overwrite the fileSize with the value obtained from `pdfDocument.getDownloadInfo`, to ensure that it's correct.
 - ES6-ify the code that's touched in this patch.

Fixes 8371.
@Snuffleupagus
Copy link
Collaborator Author

@timvandermeij Sorry about the back and forth in this patch, but I realized that there were a number of opportunities to clean-up the code I've written.
I think that I'm now finally happy with the patch, since I've been able to simplify the code quite a bit compared to prior versions. Notably, the diff size for pdf_document_properties.js is almost cut in half.

@pdfjsbot
Copy link

pdfjsbot commented May 7, 2017

From: Bot.io (Linux m4)


Received

Command cmd_preview from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.67.70.0:8877/8c131461e2f3e5c/output.txt

@pdfjsbot
Copy link

pdfjsbot commented May 7, 2017

From: Bot.io (Linux m4)


Success

Full output at http://54.67.70.0:8877/8c131461e2f3e5c/output.txt

Total script time: 2.01 mins

Published

@timvandermeij timvandermeij merged commit b3e4361 into mozilla:master May 7, 2017
@timvandermeij
Copy link
Contributor

Nice improvement; thank you for fixing and simplifying this!

@timvandermeij timvandermeij removed the request for review from yurydelendik May 7, 2017 15:44
@Snuffleupagus Snuffleupagus deleted the document-properties-reset branch May 7, 2017 19:02
movsb pushed a commit to movsb/pdf.js that referenced this pull request Jul 14, 2018
…s-reset

Re-factor `PDFDocumentProperties` such that it's properly reset when a new PDF file is opened (issue 8371)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Wrong Title in document properties
3 participants