Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parse Type1 font files to determine the various Length{n} properties, instead of trusting the PDF file (issue 5686, issue 3928) #7066

Merged
merged 2 commits into from
Mar 31, 2016

Conversation

Snuffleupagus
Copy link
Collaborator

  • Make Type1Font more class-like, by adding closure

    Note: Ignoring whitespace should simplify reviewing a great deal.
    Snuffleupagus@a7978cc?w=1

  • Parse Type1 font files to determine the various Length{n} properties, instead of trusting the PDF file (issue 5686, issue 3928)

    Please note: In order to save time, I've not yet included any reduced test-cases in this patch. I'm happy to include tests, but I'd like the patch to be reviewed first, so that I don't waste time creating reduced test-cases for a patch before knowing if it's acceptable.

    Fixes Squares instead of spaces #5686.
    Fixes Stops displaying the document, loops with 100% cpu power #3928.

@Snuffleupagus
Copy link
Collaborator Author

@brendandahl Can you please provide feedback on the general direction of this patch (so that I know if it's worthwhile to continue working on it)?

/botio-windows preview

@brendandahl
Copy link
Contributor

@brendandahl Can you please provide feedback on the general direction of this patch (so that I know if it's worthwhile to continue working on it)?

General direction looks good. It may be worth looking at freetype within pdfium to see how they handle this case as well.

@Snuffleupagus
Copy link
Collaborator Author

It may be worth looking at freetype within pdfium to see how they handle this case as well.

Disclamer: My knowledge of C isn't what it should be, so I might be misunderstanding the code, but from a cursory look through the t1parse.c source code it appears that FreeType:

  1. Parses the font file to locate the eexec string, to determine the length of the header block.
  2. Parses the rest of the font file, i.e. the eexec block, not caring about a potential fixed-content block.

Again, I might not understand the FreeType code correctly! But if the above is approximately correct, it looks to me that this PR is doing something that is at least similar.

@timvandermeij
Copy link
Contributor

/botio-linux preview

@pdfjsbot
Copy link

From: Bot.io (Linux)


Received

Command cmd_preview from @timvandermeij received. Current queue size: 0

Live output at: http://107.21.233.14:8877/697ac99d901da82/output.txt

@timvandermeij
Copy link
Contributor

/botio test

@pdfjsbot
Copy link

From: Bot.io (Windows)


Received

Command cmd_test from @timvandermeij received. Current queue size: 0

Live output at: http://107.22.172.223:8877/b0ecdc4a095d159/output.txt

@pdfjsbot
Copy link

From: Bot.io (Linux)


Received

Command cmd_test from @timvandermeij received. Current queue size: 0

Live output at: http://107.21.233.14:8877/58e31eff3d81724/output.txt

@pdfjsbot
Copy link

From: Bot.io (Windows)


Success

Full output at http://107.22.172.223:8877/b0ecdc4a095d159/output.txt

Total script time: 19.88 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Regression tests: Passed

@pdfjsbot
Copy link

From: Bot.io (Linux)


Failed

Full output at http://107.21.233.14:8877/58e31eff3d81724/output.txt

Total script time: 23.76 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Regression tests: FAILED

Image differences available at: http://107.21.233.14:8877/58e31eff3d81724/reftest-analyzer.html#web=eq.log

@timvandermeij
Copy link
Contributor

/botio-linux test

@pdfjsbot
Copy link

From: Bot.io (Linux)


Received

Command cmd_test from @timvandermeij received. Current queue size: 0

Live output at: http://107.21.233.14:8877/d12f941a2151932/output.txt

@pdfjsbot
Copy link

From: Bot.io (Linux)


Success

Full output at http://107.21.233.14:8877/d12f941a2151932/output.txt

Total script time: 21.89 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Regression tests: Passed


var actualLength = 0;
while (true) {
var scanBytes = stream.peekBytes(headerBytesLength);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

headerBytesLength is potentially undefined here, is that okay?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's fine, since undefined just means that we'll get the entire stream. See the comment at https://github.com/mozilla/pdf.js/pull/7066/files#diff-d9f16d4d656cd2e962e0ad032dfba903R3681.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, it seems kind of dumb to get the entire stream at once like that, since the header usually isn't that long compared to the total length of a Type1 font file.
So I've pushed a new version of the patch, that searches the stream in smaller chunks instead.

*Note:* Ignoring whitespace should simplify reviewing a great deal.
…s, instead of trusting the PDF file (issue 5686, issue 3928)

Fixes 5686.
Fixes 3928.
@Snuffleupagus
Copy link
Collaborator Author

/botio test

@pdfjsbot
Copy link

From: Bot.io (Linux)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://107.21.233.14:8877/ca23ca71aa97fc8/output.txt

@pdfjsbot
Copy link

From: Bot.io (Windows)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://107.22.172.223:8877/167e8522b655cb7/output.txt

@pdfjsbot
Copy link

From: Bot.io (Windows)


Success

Full output at http://107.22.172.223:8877/167e8522b655cb7/output.txt

Total script time: 19.91 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Regression tests: Passed

@pdfjsbot
Copy link

From: Bot.io (Linux)


Success

Full output at http://107.21.233.14:8877/ca23ca71aa97fc8/output.txt

Total script time: 21.87 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Regression tests: Passed

@brendandahl brendandahl merged commit 8910cea into mozilla:master Mar 31, 2016
@Snuffleupagus Snuffleupagus deleted the Type1-headerBlockLength branch March 31, 2016 18:27
@Snuffleupagus Snuffleupagus restored the Type1-headerBlockLength branch April 1, 2016 08:40
@Snuffleupagus
Copy link
Collaborator Author

/botio makeref

@pdfjsbot
Copy link

pdfjsbot commented Apr 1, 2016

From: Bot.io (Linux)


Received

Command cmd_makeref from @Snuffleupagus received. Current queue size: 0

Live output at: http://107.21.233.14:8877/5ca7ff768090e79/output.txt

@pdfjsbot
Copy link

pdfjsbot commented Apr 1, 2016

From: Bot.io (Windows)


Received

Command cmd_makeref from @Snuffleupagus received. Current queue size: 0

Live output at: http://107.22.172.223:8877/2979a5d01841be9/output.txt

@pdfjsbot
Copy link

pdfjsbot commented Apr 1, 2016

From: Bot.io (Windows)


Success

Full output at http://107.22.172.223:8877/2979a5d01841be9/output.txt

Total script time: 20.13 mins

  • Lint: Passed
  • Make references: Passed
  • Check references: Passed

@pdfjsbot
Copy link

pdfjsbot commented Apr 1, 2016

From: Bot.io (Linux)


Success

Full output at http://107.21.233.14:8877/5ca7ff768090e79/output.txt

Total script time: 21.55 mins

  • Lint: Passed
  • Make references: Passed
  • Check references: Passed

@Snuffleupagus Snuffleupagus deleted the Type1-headerBlockLength branch April 1, 2016 10:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants