-
Notifications
You must be signed in to change notification settings - Fork 13.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix travis builds #38340
Fix travis builds #38340
Conversation
After reading some articles [1] [2] yesterday about Docker and the "init" process I got to thinking about the problems that we've been seeing on Travis. The basic problem is that a Linux system may need an "init" process to work properly when processes become zombies. Docker by default doesn't handle this and the root process typically isn't an init process, so this can occasionally cause quite a few problems. We've been seeing spurious errors on Travis inside containers which look like OOM and such, but my guess is that zombie processes were being reparented to the top-level shell. The shell didn't expect the zombies and then behaved very strangely. This commit fixes these problems by using Yelp's "dumb-init" program [2] as the init process in all of our containers. This ensures that there's a valid init ready to reap children when they're reparented, which our test suite apparently generates a bunch of throughout the tests and such. [1]: https://blog.phusion.nl/2015/01/20/docker-and-the-pid-1-zombie-reaping-problem/ [2]: https://engineeringblog.yelp.com/2016/01/dumb-init-an-init-for-docker.html
(rust_highfive has picked a reviewer for you, use r? to override) |
@bors: r+ |
📌 Commit 5e991e0 has been approved by |
Fix travis builds After reading some articles [1] [2] yesterday about Docker and the "init" process I got to thinking about the problems that we've been seeing on Travis. The basic problem is that a Linux system may need an "init" process to work properly when processes become zombies. Docker by default doesn't handle this and the root process typically isn't an init process, so this can occasionally cause quite a few problems. We've been seeing spurious errors on Travis inside containers which look like OOM and such, but my guess is that zombie processes were being reparented to the top-level shell. The shell didn't expect the zombies and then behaved very strangely. This commit fixes these problems by using Yelp's "dumb-init" program [2] as the init process in all of our containers. This ensures that there's a valid init ready to reap children when they're reparented, which our test suite apparently generates a bunch of throughout the tests and such. [1]: https://blog.phusion.nl/2015/01/20/docker-and-the-pid-1-zombie-reaping-problem/ [2]: https://engineeringblog.yelp.com/2016/01/dumb-init-an-init-for-docker.html
Well I'm glad to see all that digging we did yesterday had some useful outcome. :) We really should just get your Linux CI moved to Taskcluster. Not that it would have necessarily made a difference for this specific issue, but we have a pretty nice working setup with in-tree Dockerfiles for Firefox builds, and I suspect you'd benefit from some of the other flexibility. |
After reading some articles 1 2 yesterday about Docker and the "init"
process I got to thinking about the problems that we've been seeing on Travis.
The basic problem is that a Linux system may need an "init" process to work
properly when processes become zombies. Docker by default doesn't handle this
and the root process typically isn't an init process, so this can occasionally
cause quite a few problems.
We've been seeing spurious errors on Travis inside containers which look like
OOM and such, but my guess is that zombie processes were being reparented to the
top-level shell. The shell didn't expect the zombies and then behaved very
strangely.
This commit fixes these problems by using Yelp's "dumb-init" program 2 as the
init process in all of our containers. This ensures that there's a valid init
ready to reap children when they're reparented, which our test suite apparently
generates a bunch of throughout the tests and such.