-
Notifications
You must be signed in to change notification settings - Fork 10.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cmaps: Use cmap.forEach instead of Array.forEach #6329
Conversation
What is our worst case scenario from the test files in terms of cmapKeys length? |
Which test files? I was crafting a PDF+cmap for a bug report (not PDF.js) If I'm not mistaken, UTF16 can consist of 4 bytes, so the upper bound of
|
var cmapKeys = Object.keys(cmap); | ||
for (var i = 0, ii = cmapKeys.length; i < ii; ++i) { | ||
var charCode = cmapKeys[i]; | ||
var token = cmapKeys[charCode]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be cmap[charCode]
? Currently https://cdn.gonitro.com/img/primo/PrimoPDF_V5_User_Guide.pdf breaks with this patch and I think it is due to this line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed. Fixed.
I'm wondering if it wouldn't make more sense to re-factor the code to instead use the diff --git a/src/core/evaluator.js b/src/core/evaluator.js
index ed137f8..9685a91 100644
--- a/src/core/evaluator.js
+++ b/src/core/evaluator.js
@@ -1370,11 +1370,11 @@ var PartialEvaluator = (function PartialEvaluatorClosure() {
if (cmap instanceof IdentityCMap) {
return new IdentityToUnicodeMap(0, 0xFFFF);
}
- cmap = cmap.getMap();
+ var map = [];
// Convert UTF-16BE
// NOTE: cmap can be a sparse array, so use forEach instead of for(;;)
// to iterate over all keys.
- cmap.forEach(function(token, i) {
+ cmap.forEach(function(i, token) {
var str = [];
for (var k = 0; k < token.length; k += 2) {
var w1 = (token.charCodeAt(k) << 8) | token.charCodeAt(k + 1);
@@ -1386,9 +1386,9 @@ var PartialEvaluator = (function PartialEvaluatorClosure() {
var w2 = (token.charCodeAt(k) << 8) | token.charCodeAt(k + 1);
str.push(((w1 & 0x3ff) << 10) + (w2 & 0x3ff) + 0x10000);
}
- cmap[i] = String.fromCharCode.apply(String, str);
+ map[i] = String.fromCharCode.apply(String, str);
});
- return new ToUnicodeMap(cmap);
+ return new ToUnicodeMap(map);
}
return null;
}, |
CMaps may be sparse. Array.prototype.forEach is terribly slow in Chrome (and also in Firefox) when the sparse array contains a key with a high value. E.g. console.time('forEach sparse') var a = []; a[0xFFFFFF] = 1; a.forEach(function(){}); console.timeEnd('forEach sparse'); // Chrome: 2890ms // Firefox: 1345ms Switching to CMap.prototype.forEach, which is optimized for such scenarios fixes the problem.
@Snuffleupagus Nifty function. I've updated the PR again. |
/botio-linux preview |
From: Bot.io (Linux)ReceivedCommand cmd_preview from @timvandermeij received. Current queue size: 0 Live output at: http://107.21.233.14:8877/9afea67b9ed2dcb/output.txt |
From: Bot.io (Linux)SuccessFull output at http://107.21.233.14:8877/9afea67b9ed2dcb/output.txt Total script time: 0.65 mins Published |
/botio test |
From: Bot.io (Linux)ReceivedCommand cmd_test from @timvandermeij received. Current queue size: 0 Live output at: http://107.21.233.14:8877/49c8654244fb67b/output.txt |
From: Bot.io (Windows)ReceivedCommand cmd_test from @timvandermeij received. Current queue size: 0 Live output at: http://107.22.172.223:8877/d65bf0f6f6de518/output.txt |
From: Bot.io (Windows)SuccessFull output at http://107.22.172.223:8877/d65bf0f6f6de518/output.txt Total script time: 18.34 mins
|
From: Bot.io (Linux)SuccessFull output at http://107.21.233.14:8877/49c8654244fb67b/output.txt Total script time: 19.15 mins
|
looks like a neat patch, but afaik the commit message does not hold anymore :) |
@CodingFabian The commit message was already updated. I've just edited the title of the PR to match the latest changes. |
ah cool, thank you that's what I meant :) |
cmaps: Use cmap.forEach instead of Array.forEach
Looks good, thank you for the patch! |
CMaps may be sparse. forEach is terribly slow in Chrome (and also in
Firefox) when the sparse array contains a key with a high value. E.g.