Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change getPDFFileNameFromURL to ignore data: URLs for performance reasons (issue 8263) #8321

Merged
merged 3 commits into from
Apr 20, 2017
Merged

Conversation

Snuffleupagus
Copy link
Collaborator

Please refer to the individual commit messages.

Tentatively fixes #8263.

…nce the `URL` polyfill have made them redundant

Also, this changes `createBlob` to throw when `Blob` isn't supported.
@Snuffleupagus Snuffleupagus changed the title Change getPDFFileNameFromURL to ignore data: URLs for performance… … reasons (issue 8263) Change getPDFFileNameFromURL to ignore data: URLs for performance reasons (issue 8263) Apr 20, 2017
@Snuffleupagus
Copy link
Collaborator Author

/botio unittest

@pdfjsbot
Copy link

From: Bot.io (Linux)


Received

Command cmd_unittest from @Snuffleupagus received. Current queue size: 0

Live output at: http://107.21.233.14:8877/796b50f18362b4d/output.txt

@pdfjsbot
Copy link

From: Bot.io (Windows)


Received

Command cmd_unittest from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.215.176.217:8877/b4df5fd0700655b/output.txt

@pdfjsbot
Copy link

From: Bot.io (Linux)


Success

Full output at http://107.21.233.14:8877/796b50f18362b4d/output.txt

Total script time: 3.04 mins

  • Unit Tests: Passed

@pdfjsbot
Copy link

From: Bot.io (Windows)


Success

Full output at http://54.215.176.217:8877/b4df5fd0700655b/output.txt

Total script time: 6.55 mins

  • Unit Tests: Passed

Copy link
Contributor

@yurydelendik yurydelendik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good with the change below

web/ui_utils.js Outdated
if (typeof defaultFilename === 'undefined') {
defaultFilename = 'document.pdf';
function getPDFFileNameFromURL(url, defaultFilename = 'document.pdf') {
if (url.indexOf('data:') === 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per https://tools.ietf.org/html/rfc3986#section-3.1 , schemes are case-insensitive. I recommend to create the isDataSchema(url) function above and also skip allowed whitespaces, something like:

var i = 0; while (i < url.length && url[i].trim() == '') i++;
return url.substr(i, 5).toLowerCase() === 'data:';

Copy link
Collaborator Author

@Snuffleupagus Snuffleupagus Apr 20, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the review!
I've made the requested changes, and also added a few unit-tests for the whitespace case.

… reasons (issue 8263)

The patch also changes the `defaultFilename` to use the ES6 default parameter notation, and fixes the formatting of the JSDoc comment.

Finally, since `getPDFFileNameFromURL` currently has no unit-tests, a few basic ones are added to avoid regressions.
…er, instead of downloading them, when `PDFJS.disableCreateObjectURL = false`

This prevents issues with the filename detection being skipped, when trying to download the opened PDF attachment, since `getPDFFileNameFromURL` ignores `data:` URLs for performance reasons.
@Snuffleupagus
Copy link
Collaborator Author

/botio unittest

@pdfjsbot
Copy link

From: Bot.io (Linux)


Received

Command cmd_unittest from @Snuffleupagus received. Current queue size: 0

Live output at: http://107.21.233.14:8877/458d3baa21701a5/output.txt

@pdfjsbot
Copy link

From: Bot.io (Windows)


Received

Command cmd_unittest from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.215.176.217:8877/f9ccb988b4e6a21/output.txt

@pdfjsbot
Copy link

From: Bot.io (Linux)


Success

Full output at http://107.21.233.14:8877/458d3baa21701a5/output.txt

Total script time: 2.98 mins

  • Unit Tests: Passed

@pdfjsbot
Copy link

From: Bot.io (Windows)


Success

Full output at http://54.215.176.217:8877/f9ccb988b4e6a21/output.txt

Total script time: 6.62 mins

  • Unit Tests: Passed

@Snuffleupagus Snuffleupagus merged commit c44fd3d into mozilla:master Apr 20, 2017
@Snuffleupagus Snuffleupagus deleted the issue-8263 branch April 20, 2017 17:46
movsb pushed a commit to movsb/pdf.js that referenced this pull request Jul 14, 2018
Change `getPDFFileNameFromURL` to ignore `data:` URLs for performance reasons (issue 8263)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Don't guess file name from non-http/file URLs
3 participants