Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Less color conversion #4528

Merged
merged 2 commits into from
Apr 28, 2014
Merged

Conversation

bthorben
Copy link
Contributor

This moves the colour conversion to jpg.js

  • The good news: Benchmarking shows that this improves performance for the invitation document
    from This PDF takes almost a full minute to load completely #3809 by 25%
  • There could be problems with just outputting when you find a JPEG. I would appreciate if you could have a look and tell me what you think
  • It would probably make sense to move most colorspace conversions into the decoders. This raises questions of the structure of thing ...

@yurydelendik
Copy link
Contributor

Can we ask about raw data from the jpgjs in interleaved form and then apply (additional) colorspaces in the pdf.js, so we don't duplicate the code?

@yurydelendik
Copy link
Contributor

... this optimizations will apply not only for jpeg, but for any colorspace users. We can add Ycc and Ycck if needed.

@bthorben
Copy link
Contributor Author

Depending on what you mean by "raw data", this could be slightly less efficient, As the code looks right now, we convert from YCCK to RGB with intermediate CMYK, but you could actually do it all in one. So far this just eliminates creating another big buffer and touching the data again. I just hadn't have enough time to do it. As I understood it, the versions of YCC and YCCK that JPEG uses are special, so there is also not so much use in having it outside of jpg.js anyway I guess.

But how to make the code efficient and nice is my biggest concern. Long term, I think it's best if all image decoders output RGB(a) directly.

@yurydelendik
Copy link
Contributor

the "raw data" in this case the array returned by _getLinearizedBlockData()

@bthorben
Copy link
Contributor Author

I agree that it's better to not copy the code and optimise it for the benefit of all its users, but as it currently stands, it's either partly copying or making it nice and efficient absolutely everywhere, which requires a lot of work and testing. I think refactoring the code and moving all the color conversion in an efficient way to one place should be another task.

In conclusion: I would leave this efficiency improvement like this for now and mark that we should work on refactoring color conversion

@bthorben
Copy link
Contributor Author

This changed the formulas for converting YCCK to CMYK again slightly. There should be no difference to the naked eye, but the following tests will show minor differences:
issue3879, issue1096, issue1133, issue2006

@bthorben
Copy link
Contributor Author

bthorben commented Apr 4, 2014

The last commit showed me another thing we should discuss:

Before this PR, the image was decoded in the original size and colorspace, and then later transferred to destination colorspace and size. That would make only sense if we cache the decoding result and use it again for transfer to different destinations. I have seen no relevant caching anywhere, so I assume there is none, is that correct?

In conclusion: Unless there will be caching, we should always transfer to destination.

@yurydelendik
Copy link
Contributor

Okay, let's do this add a comment after copyright that we (Mozilla Foundation) modified original jpgjs. Also, do we need copyToImageData?

/botio-linux test

@pdfjsbot
Copy link

pdfjsbot commented Apr 9, 2014

From: Bot.io (Linux)


Received

Command cmd_test from @yurydelendik received. Current queue size: 0

Live output at: http://107.21.233.14:8877/5b01d74231c3aad/output.txt

@pdfjsbot
Copy link

pdfjsbot commented Apr 9, 2014

From: Bot.io (Linux)


Failed

Full output at http://107.21.233.14:8877/5b01d74231c3aad/output.txt

Total script time: 25.59 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Regression tests: FAILED

Image differences available at: http://107.21.233.14:8877/5b01d74231c3aad/reftest-analyzer.html#web=eq.log

@timvandermeij
Copy link
Contributor

@bthorben Could you rebase this PR (and apply Yury's comment) so we can test it again?

@bthorben
Copy link
Contributor Author

Ok, let's fork, that is easier. Longterm it would still be nice if the decoders could be moved into their own project as they are interesting also without pdf.js

This refactors getData to be more readable and extracts all the color
conversion algorithms to their own functions. The resulting code was then
cleaned up.
This also introduces a flag `forceRGBoutput` to getData, that allows to always
get the data as a `width * height * 3` bytes long RGB buffer
@bthorben
Copy link
Contributor Author

@yurydelendik please tell me if you are happy with the way you are mentioned or suppose a different wording

Benchmarking shows that this improves performance for the invitation document
from mozilla#3809 by 35%
@yurydelendik
Copy link
Contributor

I'm happy :) Let's test

/botio test

@pdfjsbot
Copy link

From: Bot.io (Linux)


Received

Command cmd_test from @yurydelendik received. Current queue size: 0

Live output at: http://107.21.233.14:8877/5565f24cf2939e7/output.txt

@pdfjsbot
Copy link

From: Bot.io (Windows)


Received

Command cmd_test from @yurydelendik received. Current queue size: 0

Live output at: http://107.22.172.223:8877/d3f5c711e76130e/output.txt

@pdfjsbot
Copy link

From: Bot.io (Windows)


Failed

Full output at http://107.22.172.223:8877/d3f5c711e76130e/output.txt

Total script time: 21.96 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Regression tests: FAILED

Image differences available at: http://107.22.172.223:8877/d3f5c711e76130e/reftest-analyzer.html#web=eq.log

@pdfjsbot
Copy link

From: Bot.io (Linux)


Failed

Full output at http://107.21.233.14:8877/5565f24cf2939e7/output.txt

Total script time: 25.76 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Regression tests: FAILED

Image differences available at: http://107.21.233.14:8877/5565f24cf2939e7/reftest-analyzer.html#web=eq.log

@p01
Copy link
Contributor

p01 commented Apr 28, 2014

The image differences are 1 R,G,B shade here and there: IOW rounding errors imperceptible to the naked eye.

@yurydelendik
Copy link
Contributor

/botio makeref

@pdfjsbot
Copy link

From: Bot.io (Windows)


Received

Command cmd_makeref from @yurydelendik received. Current queue size: 0

Live output at: http://107.22.172.223:8877/3374d540b7120f3/output.txt

@pdfjsbot
Copy link

From: Bot.io (Linux)


Received

Command cmd_makeref from @yurydelendik received. Current queue size: 0

Live output at: http://107.21.233.14:8877/abe9993bbbc8405/output.txt

@pdfjsbot
Copy link

From: Bot.io (Windows)


Success

Full output at http://107.22.172.223:8877/3374d540b7120f3/output.txt

Total script time: 22.36 mins

  • Lint: Passed
  • Make references: Passed
  • Check references: Passed

@pdfjsbot
Copy link

From: Bot.io (Linux)


Success

Full output at http://107.21.233.14:8877/abe9993bbbc8405/output.txt

Total script time: 25.93 mins

  • Lint: Passed
  • Make references: Passed
  • Check references: Passed

yurydelendik added a commit that referenced this pull request Apr 28, 2014
@yurydelendik yurydelendik merged commit 58f697f into mozilla:master Apr 28, 2014
@yurydelendik
Copy link
Contributor

Thank you for the patch

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants