-
Notifications
You must be signed in to change notification settings - Fork 10.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Less color conversion #4528
Less color conversion #4528
Conversation
Can we ask about raw data from the jpgjs in interleaved form and then apply (additional) colorspaces in the pdf.js, so we don't duplicate the code? |
... this optimizations will apply not only for jpeg, but for any colorspace users. We can add Ycc and Ycck if needed. |
Depending on what you mean by "raw data", this could be slightly less efficient, As the code looks right now, we convert from YCCK to RGB with intermediate CMYK, but you could actually do it all in one. So far this just eliminates creating another big buffer and touching the data again. I just hadn't have enough time to do it. As I understood it, the versions of YCC and YCCK that JPEG uses are special, so there is also not so much use in having it outside of jpg.js anyway I guess. But how to make the code efficient and nice is my biggest concern. Long term, I think it's best if all image decoders output RGB(a) directly. |
the "raw data" in this case the array returned by _getLinearizedBlockData() |
I agree that it's better to not copy the code and optimise it for the benefit of all its users, but as it currently stands, it's either partly copying or making it nice and efficient absolutely everywhere, which requires a lot of work and testing. I think refactoring the code and moving all the color conversion in an efficient way to one place should be another task. In conclusion: I would leave this efficiency improvement like this for now and mark that we should work on refactoring color conversion |
This changed the formulas for converting YCCK to CMYK again slightly. There should be no difference to the naked eye, but the following tests will show minor differences: |
The last commit showed me another thing we should discuss: Before this PR, the image was decoded in the original size and colorspace, and then later transferred to destination colorspace and size. That would make only sense if we cache the decoding result and use it again for transfer to different destinations. I have seen no relevant caching anywhere, so I assume there is none, is that correct? In conclusion: Unless there will be caching, we should always transfer to destination. |
Okay, let's do this add a comment after copyright that we (Mozilla Foundation) modified original jpgjs. Also, do we need copyToImageData? /botio-linux test |
From: Bot.io (Linux)ReceivedCommand cmd_test from @yurydelendik received. Current queue size: 0 Live output at: http://107.21.233.14:8877/5b01d74231c3aad/output.txt |
From: Bot.io (Linux)FailedFull output at http://107.21.233.14:8877/5b01d74231c3aad/output.txt Total script time: 25.59 mins
Image differences available at: http://107.21.233.14:8877/5b01d74231c3aad/reftest-analyzer.html#web=eq.log |
@bthorben Could you rebase this PR (and apply Yury's comment) so we can test it again? |
Ok, let's fork, that is easier. Longterm it would still be nice if the decoders could be moved into their own project as they are interesting also without pdf.js |
This refactors getData to be more readable and extracts all the color conversion algorithms to their own functions. The resulting code was then cleaned up. This also introduces a flag `forceRGBoutput` to getData, that allows to always get the data as a `width * height * 3` bytes long RGB buffer
@yurydelendik please tell me if you are happy with the way you are mentioned or suppose a different wording |
Benchmarking shows that this improves performance for the invitation document from mozilla#3809 by 35%
I'm happy :) Let's test /botio test |
From: Bot.io (Linux)ReceivedCommand cmd_test from @yurydelendik received. Current queue size: 0 Live output at: http://107.21.233.14:8877/5565f24cf2939e7/output.txt |
From: Bot.io (Windows)ReceivedCommand cmd_test from @yurydelendik received. Current queue size: 0 Live output at: http://107.22.172.223:8877/d3f5c711e76130e/output.txt |
From: Bot.io (Windows)FailedFull output at http://107.22.172.223:8877/d3f5c711e76130e/output.txt Total script time: 21.96 mins
Image differences available at: http://107.22.172.223:8877/d3f5c711e76130e/reftest-analyzer.html#web=eq.log |
From: Bot.io (Linux)FailedFull output at http://107.21.233.14:8877/5565f24cf2939e7/output.txt Total script time: 25.76 mins
Image differences available at: http://107.21.233.14:8877/5565f24cf2939e7/reftest-analyzer.html#web=eq.log |
The image differences are 1 R,G,B shade here and there: IOW rounding errors imperceptible to the naked eye. |
/botio makeref |
From: Bot.io (Windows)ReceivedCommand cmd_makeref from @yurydelendik received. Current queue size: 0 Live output at: http://107.22.172.223:8877/3374d540b7120f3/output.txt |
From: Bot.io (Linux)ReceivedCommand cmd_makeref from @yurydelendik received. Current queue size: 0 Live output at: http://107.21.233.14:8877/abe9993bbbc8405/output.txt |
From: Bot.io (Windows)SuccessFull output at http://107.22.172.223:8877/3374d540b7120f3/output.txt Total script time: 22.36 mins
|
From: Bot.io (Linux)SuccessFull output at http://107.21.233.14:8877/abe9993bbbc8405/output.txt Total script time: 25.93 mins
|
Thank you for the patch |
This moves the colour conversion to jpg.js
from This PDF takes almost a full minute to load completely #3809 by 25%