Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Worker implementation #3 #635

Closed
wants to merge 112 commits into from
Closed

Conversation

jviereck
Copy link
Contributor

@jviereck jviereck commented Oct 8, 2011

This is a merged version of the worker branch that hopefully works. After the merge, the fonts weren't working for some pages so I've redone all the font changes, started from a scratch version of fonts.js from master again an build from there. The font's are currently processed (new Font(...)) on the main thread and not in a worker as well, but let's keep it simple for now and get something landed.

Compared to the wish list from Chris on the other pull request, I've addressed some of the points, but some still stay opened:

  • The worker functionality is disabled right now as the feature detection needs to get implemented
  • The "can we use UInt8Array as image data" isn't implemented due to missing feature detection
  • The PDFDoc and WorkerPDFDoc as well as the Page and WorkerPage object haven't merge yet, as this requires a complete refactoring of these objects and I don't want to break more stuff/refactore more code then I've already done with these patches. However, note that the current PDFDoc got renamed to PDFDocModel and WorkerPDFDoc is now PDFDoc. That way, we don't have to change any of the APIs as you just can call the new PDFDoc as you did before.

Let's see what the bot says to this patches. Can someone please take a look at the last changes if they make sense?

Added by @arturadib - Links to previous pull requests and discussion: #459 #130 #95 #45

…oesnt have a document to attach the canvas to
@jviereck
Copy link
Contributor Author

@pdfjsbot test

@pdfjsbot
Copy link

Processing command test by user jviereck. Queue size: 0

Live script output is available at: http://184.73.87.52:8989/2370679.txt

[bot:processed:2370679]

@pdfjsbot
Copy link

ERROR(s) found

Output:

========== Killing any stray processes

========== Cloning pull request repo
Cloning into ....

========== Running 'make lint'
gjslint ./charsets.js ./cidmaps.js ./crypto.js ./fonts.js ./glyphlist.js ./metrics.js ./pdf.js ./worker.js utils/cffStandardStrings.js utils/fonts_utils.js worker/console.js worker/message_handler.js worker/pdf_worker_loader.js worker/processor_handler.js web/compatibility.js web/viewer.js test/driver.js examples/helloworld/hello.js extensions/firefox/bootstrap.js extensions/firefox/components/pdfContentHandler.js
----- FILE  :  /home/ubuntu/pdf.js-bot/tmp/tests/3142c2a7dae2d4b95f6980a6d9612ff34c8f234b/pdf.js -----
Line 4787, E:0001: Extra space at end of line

Found 1 errors, including 0 new errors, in 1 files (19 files OK).
\u0007
Some of the errors reported by GJsLint may be auto-fixable using the script
fixjsstyle. Please double check any changes it makes and report any bugs. The
script can be run by executing:

fixjsstyle ./charsets.js ./cidmaps.js ./crypto.js ./fonts.js ./glyphlist.js ./metrics.js ./pdf.js ./worker.js utils/cffStandardStrings.js utils/fonts_utils.js worker/console.js worker/message_handler.js worker/pdf_worker_loader.js worker/processor_handler.js web/compatibility.js web/viewer.js test/driver.js examples/helloworld/hello.js extensions/firefox/bootstrap.js extensions/firefox/components/pdfContentHandler.js 
make: *** [lint] Error 1

========== Fetching test/ from upstream

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
WARNING: This pull request contains changes to test-related files in test/.
         The tests below will thus be performed as per these *new test files*.
         If upstream contains important tests not present in this PR, someone
         will have to merge it here and run the tests again.
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

========== Cloning reference images repo into test/ref/
Initialized empty Git repository in /home/ubuntu/pdf.js-bot/tmp/tests/3142c2a7dae2d4b95f6980a6d9612ff34c8f234b/test/ref/.git/

========== Running 'make bot_test'
cd test; \
    python -u test.py \
    --browserManifestFile=resources/browser_manifests/browser_manifest.json \
    --manifestFile=test_manifest.json
Launching firefox
Xlib:  extension "RANDR" missing on display ":1".
TEST-PASS | eq test tracemonkey-eq | in firefox
TEST-PASS | forward-back-forward test tracemonkey-fbf | in firefox
TEST-PASS | load test html5-canvas-cheat-sheet-load | in firefox
TEST-PASS | load test intelisa-load | in firefox
TEST-PASS | load test pdfspec-load | in firefox
TEST-PASS | load test shavian-load | in firefox
TEST-PASS | eq test sizes | in firefox
TEST-PASS | eq test openweb-cover | in firefox
TEST-PASS | eq test plusminus | in firefox
TEST-PASS | load test openoffice-pdf | in firefox
TEST-PASS | load test openofficecidtruetype-pdf | in firefox
TEST-PASS | load test openofficearabiccidtruetype-pdf | in firefox
TEST-PASS | load test arabiccidtruetype-pdf | in firefox
TEST-PASS | load test complexttffont-pdf | in firefox
TEST-PASS | eq test thuluthfont-pdf | in firefox
TEST-PASS | eq test wnv_chinese-pdf | in firefox
TEST-PASS | eq test i9-pdf | in firefox
TEST-PASS | load test hmm-pdf | in firefox
TEST-PASS | load test rotation | in firefox
TEST-PASS | load test ecma262-pdf | in firefox
TEST-PASS | load test jai-pdf | in firefox
TEST-PASS | eq test cable | in firefox
TEST-PASS | eq test pdkids | in firefox
TEST-PASS | eq test artofwar | in firefox
TEST-PASS | eq test wdsg_fitc | in firefox
TEST-PASS | eq test unix01 | in firefox
TEST-PASS | load test fips197 | in firefox
TEST-PASS | load test txt2pdf | in firefox
TEST-PASS | load test f1040 | in firefox
TEST-PASS | load test hudsonsurvey | in firefox
TEST-PASS | load test extgstate | in firefox
TEST-PASS | eq test usmanm-bad | in firefox
TEST-PASS | load test vesta-bad | in firefox
TEST-PASS | load test ibwa-bad | in firefox
TEST-PASS | eq test tcpdf_033 | in firefox
TEST-PASS | eq test simpletype3font | in firefox

All tests passed.
Process firefox is still running. Killing.
Runtime was 1739 seconds

========== Cleaning up
./run-test: line 202:   833 Killed                  Xvfb :1 > /dev/null 2> /dev/null

All done.


_____________________________ stderr:

Bot response time: 30.00 mins

@jviereck
Copy link
Contributor Author

@cgjones: Once you make the review of this, you might want to take a look at the resulting diffs instead of the commits. I've done some back and forth on the font loading to come to the point where I've reverted the entire changes done to font.js and use the current version upstream basically.

Starting from commit be73cb4, you can review the commits one by one as there is no reverting of code from that point on.

Sorry once again for how the code is the way it is :(

// exc = e.toString();
// continuation(exc);
// throw e;
// }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this code commented out? Is this temporary?

@arturadib
Copy link
Contributor

Julian,

I've gone over some of the patch. First my opinion on how to handle the patch, then the nitty gritty.

A lot of this seems to be work-in-progress so it was pretty hard to follow the logic. My 'business' mind would prefer to see a major patch like this to at least come with an equally large payoff, but my understanding is that we can't yet enable workers to enjoy the performance gain, as you're still sorting out some issues. So it's a bit of a risk to land a large, partially complete patch in our code base without the actual benefit in sight. (If it only takes a small additional patch to get the whole thing working, then I'd personally prefer to wait until we can actually see the performance gain before landing the patch).

So my suggestion would be to either (a) wait for a patch with workers working and less partially-done code, or (b) use your current pull request as reference for a more incremental contribution, such that at every stage the purpose of the change is clear and potentially useful in its own right. (That could become a goal for next quarter, in which case I'd be happy to take on the challenge if you're tired of working on it).

Again that's just my opinion, but @cgjones is obviously much better equipped to make a decision here.

As to the details, here are some of my questions/points. Again, I've found it pretty hard to follow the patch, and the nomenclature was a bit confusing, but maybe some of this is because things are not quite ready yet?

  • I don't quite understand the file organization. We have a worker.js file and then a separate set of worker/* files, which would make me think these are worker-specific files. However both worker.js and worker/message_handler.js seem to contain functions that are called by the main thread. Is it possible to have worker-specific files be clearly separate from shared ones? Also why so many files under worker/ when altogether they are only a couple hundred lines long?
  • What's the difference between an IR and an IRQueue? Sometimes these terms seem to be used interchangeably in your patch, but the term IRQueue suggests that there is a difference. For example, in CanvasGraphics.executeIRQueue(), the first argument is a codeIR, but when you actually call the function you seem to pass IRQueues. If they're the same thing, is there a need for introducing the term IRQueue? Why not just IR, or even just code as in the previous version? I found it pretty confusing.
  • Both PDFDoc and PDFDocModel seem to store the entire document stream into this.stream. I thought the point of having a separate model at this point was to avoid replicating the document across workers so as to save memory?
  • Why does Catalog.traverseKids() use new Page() while PDFDoc.getPage() uses new WorkerPage()? I guess I don't understand the difference between Page() and WorkerPage() or why we need two, but that might be because I don't get the 'big picture' yet.

Anyway, I look forward to hearing from everyone else here. Again, Brendan and I will be doing a major file restructuring off the master branch next week in Mountain View; it'd be kinda painful to bring your changes into the new files if the PR doesn't land before then.

It'd be nice to know what's the final word on this.

@jviereck
Copy link
Contributor Author

  • What's the difference between an IR and an IRQueue? Sometimes these terms seem to be used interchangeably in your patch, but the term IRQueue suggests that there is a difference. For example, in CanvasGraphics.executeIRQueue(), the first argument is a codeIR, but when you actually call the function you seem to pass IRQueues. If they're the same thing, is there a need for introducing the term IRQueue? Why not just IR, or even just code as in the previous version? I found it pretty confusing.

IR means only one command whereas IRQueue means a queue/array of IR commands.

  • Both PDFDoc and PDFDocModel seem to store the entire document stream into this.stream. I thought the point of having a separate model at this point was to avoid replicating the document across workers so as to save memory?

No, that wasn't the point. Right now we create the PDF document a few times and there is some refactoring needed such that we need to store the PDF only once.

  • Why does Catalog.traverseKids() use new Page() while PDFDoc.getPage() uses new WorkerPage()? I guess I don't understand the difference between Page() and WorkerPage() or why we need two, but that might be because I don't get the 'big picture' yet.

WorkerPage is the object represented to the user to interact with, whereas the Page object is the internal object, that does most of the lifting. The WorkerPage talks to PDF to get the page rendered, whereas Page calls into PDFModel. A little picture

 WorkerPage < (talks to) > PDFDoc
 Page < (talks to) > PDFDocModel

});
}
setGState: function canvasGraphicsSetGState(states) {
for (var i = 0; i < states.lenght; i++) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'lenght' that worries me that this passed the extgstate pdf

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see now extgstate is a load test, where it should be an eq test. Let me go fix that.

@arturadib
Copy link
Contributor

I just wanted to put into writing what we discussed over the phone. In addition to my points above:

  • We don't seem to know where our major bottlenecks are, so it's hard to know what's the simplest solution that solves the problem - the current solution is pretty complex;
  • If the main problem now is that the UI sometimes freezes, a single all-in-one worker might do the trick - that way we don't have to break up the objects and deal with dependencies, etc. Have we tried this simpler solution yet to see if it solves our problem?
  • Going for more than one worker today will only benefit a very small fraction of the market as most computers have 1-2 cores anyway. In fact, if we launch more workers than the number of cores things might actually slow down due to the message-passing and context-switching overhead. (I guess this will require some browser hacks to detect the # of cores).

@joneschrisg
Copy link
Contributor

@jvierek I apologize for the lag again. I've barely been keeping my head above water lately.

Architecturally, this approach still looks fine; looks like it hasn't changed from the previous patch. The fixes you've made look good. It looks like the comments about file organization weren't addressed, and I think Artur was confused similarly to me. We need to figure that part out.

As for landing, I mostly agree with @arturadib's comments above. We should land this with a functional worker backend. When this architecture runs only on the main thread, I think we'll mostly regress throughput without much responsiveness gain, though there are some pdfs it could improve.

@arturadib @brendandahl @notmasteryet how well do you guys understand the architecture these patches are moving towards? One thing we neglected in the previous pull request was getting the core devs on the same page; it's important because this change will permeate a lot of pdf.js, and affect the way patches need to be written. I think the first "action item" is to make sure the new architecture is documented and understood. @jvierek, want to start with a write-up and then post the link to the mailing list?

Next, we should figure out how the worker backend will interact with the code reorganization being planned (I haven't been following the details). Maybe you guys could conf call next week to discuss?

With both of those settled, I think it'll be clearer how to get the current patch here ready to land, and it may make sense to distribute some work so that everyone gets a feel for the new backend.

Sound good?

If the main problem now is that the UI sometimes freezes, a single all-in-one worker might do the trick - that way we don't have to break up the objects and deal with dependencies, etc. Have we tried this simpler solution yet to see if it solves our problem?

Yes, that entirely solves the responsiveness problem. It doesn't mean we don't have to deal with dependencies: we still have to wait on fonts and images to load. I'm not sure that a lot of lines of code could be removed from this patch, but it's likely that it could be made conceptually simpler. Next week may be a good time to figure that out.

Going for more than one worker today will only benefit a very small fraction of the market as most computers have 1-2 cores anyway. In fact, if we launch more workers than the number of cores things might actually slow down due to the message-passing and context-switching overhead. (I guess this will require some browser hacks to detect the # of cores).

Everyone (to a first approximation) has at least 2 HW threads, phones being the exception for the moment. I wouldn't be surprised if 4 (dual-core+HT) is more common than 2 nowadays. You're right that scheduling work is hard. That's a bridge we'll have to cross, and we may need new platform support. But we need to be able to scale up to more HW threads: that's where perf wins for this kind of code will come from from now on. That doesn't mean we should neglect other wins like parallel rendering of individual pages, or same-page optimizations like SIMD.

@jviereck
Copy link
Contributor Author

okay, I'm going to start from scratch again and create a as-simple-as-possible version that for now runs only in one thread. Closing this pull request as there gonna be a new one.

@jviereck jviereck closed this Oct 14, 2011
@arturadib
Copy link
Contributor

Hey Chris, thanks so much for making time for this!

I think the first "action item" is to make sure the new architecture is documented and understood.

That would be extremely helpful. I know that this will be hard no matter what, but since it'll affect the overall architecture of the code it'd be nice to be able to understand the big picture first.

Big thanks to Julian for plowing along and making so much progress already. Because of his work we will all be in a better position to help and make a more educated decision.

Next, we should figure out how the worker backend will interact with the code reorganization being planned

Right, since we'll be working on this already next week, maybe @jviereck can postpone writing code until after we're done?

@jviereck Meanwhile, as per Chris's suggestion, perhaps you can open an issue like #646 explaining the overall architecture and what files will be affected, to gather feedback?

Everyone (to a first approximation) has at least 2 HW threads, phones being the exception for the moment. I wouldn't be surprised if 4 (dual-core+HT) is more common than 2 nowadays.

Makes sense. I guess it'd be helpful to quantify this a bit better, so I've filed a bug with the metrics team:

https://bugzilla.mozilla.org/show_bug.cgi?id=694565

As to HTs, beyond freeing up the UI with one additional thread, the benefit from using them remains to be seen, as although they mitigate context-switching, there's still execution-switching and message-passing overhead. In other words, launching a total of 4 threads because we have 2 cores + 2 HTs might be slower than launching only 2. I suppose we won't know until we try!

Speaking of which...

@cgjones Do you know if we can access the # of cores available via XPCOM or some other browser-specific method? I've found some threads regarding the spec of Web Workers but nothing concrete yet:

http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2009-November/024058.html
http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-November/023993.html

@arturadib
Copy link
Contributor

@jviereck - in response to your IRC comments (you logged out right when I wrote a response! :)):

jviereck- artur: hi. I've started already hacking on the new worker implementation
jviereck- hope to get it mostly down today/over the weekend
jviereck- I don't expect too much breakage during the file merge in case this gonna land after that one
jviereck- and I defenitly want to have this one off my plate ASAP
jviereck- the way I write the code now should also be more selfexplained with a few more comments + notes in the commits
jviereck- hoping that's gone help out :/

Man, that's fantastic you're hungry to finish this! And I do like Chris's idea to first have some sort of documentation explaining the overall architecture to everyone. This is very much in line with what Tom Preston-Werner (Github's founder) asks his own team to do before trying to code anything big that affects other folks:

It's a great practice that I've adopted for a while too (hence #646) - I highly recommend reading this post.

Do you think you can do that before you get too far along in the code? I'd hate to see you work so hard again before we have a chance to chime in on your plans.

What do you think?

@jviereck
Copy link
Contributor Author

What do you think?

Sounds good. Here we go:

https://github.com/andreasgal/pdf.js/issues/663

If you are short on time, just start reading form the "Birds eye overview"

@joneschrisg
Copy link
Contributor

@cgjones Do you know if we can access the # of cores available via XPCOM or some other browser-specific method? I've found some threads regarding the spec of Web Workers but nothing concrete yet:

We have internal (C++) APIs for this, but definitely nothing exposed to The Web, and nothing through XPCOM AFAIK. We have to be very careful about exposing that kind of information; ideally we want the browser to choose the #workers allocated to pages. I had a WIP "parallelMap()" kind of API for workers, but ran out of time before anything came of it. For pdf.js, that may or may not be the primitive we want. We'll let the impl drive new APIs/specs.

@arturadib arturadib mentioned this pull request Nov 7, 2011
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants