Remote cache import network error is fatal but should be considered a cache miss #2836
Comments
Correct, the issue here is the lazy semantics. The cache load was allowed to fail, but once we added lazy semantics, many errors no longer happen during the actual cache load phase but later, when the ref gets unlazied. I tried to fix this for the errors caused by GitHub removing a blob with this hack (#2387, 066a011), but transfer errors would need a different approach.
docker/build-push-action#577 also has the same root cause, I believe.
GHA cache is currently broken (refs actions/cache#820); I'm seeing a build (using It seems to ignore the cache failure when reading, but fails when trying to write to the cache:
I got something similar, I think I was flooding GHA when exporting cache and getting rate limited:
This led to failed builds even though the images were built correctly.
It's a shame, but when the GHA cache rate-limits, that comes across as a build failure; see moby/buildkit#2836
Is this not solved by #3430?
Builds regularly fail when multiple Dependabot PRs are being built simultaneously, as they seem to overload the service. This change should allow them to continue, even if they cannot use the cache. Refs #905, moby/buildkit#2836
Seems to be, I've just seen
appear in a GitHub action and the job continued.
That's different from the problem in the issue description. Network errors are now non-fatal when you are exporting cache, and when the cache metadata is resolved during an import. The remaining issue is that errors that happen during the actual pull of remote cache layers are still fatal. This is because pulling of layers is lazy and happens after the remote cache metadata has been resolved.
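To make the two phases concrete, here is a minimal Go sketch of the distinction described above. It is only an illustration: `lazyRef`, `importCache`, and `unlazy` are hypothetical names invented for this example and are not BuildKit types or functions. It assumes that a failure while resolving metadata can be downgraded to a miss, while a failure in the deferred pull surfaces later as an ordinary error.

```go
package main

import (
	"errors"
	"fmt"
)

// lazyRef is a hypothetical stand-in for a cache result whose layer data has
// not been downloaded yet. It is not a BuildKit type.
type lazyRef struct {
	pull   func() error // deferred download of the remote blob
	pulled bool
}

// importCache models the metadata phase: only the cache manifest is resolved,
// so a network failure here can be downgraded to a cache miss (nil).
func importCache(resolve func() (*lazyRef, error)) *lazyRef {
	ref, err := resolve()
	if err != nil {
		return nil // treated as a cache miss; the build continues
	}
	return ref
}

// unlazy models the later phase: the blob is fetched only when something
// needs the data, after a cache hit has already been recorded.
func (r *lazyRef) unlazy() error {
	if r.pulled {
		return nil
	}
	if err := r.pull(); err != nil {
		return fmt.Errorf("failed to pull cache layer: %w", err)
	}
	r.pulled = true
	return nil
}

// errNetwork simulates a transfer failure from the remote cache.
var errNetwork = errors.New("simulated network error")

func main() {
	// Metadata resolution fails: tolerated as a miss.
	miss := importCache(func() (*lazyRef, error) { return nil, errNetwork })
	fmt.Println("metadata error became a cache miss:", miss == nil)

	// Metadata resolves, but the deferred pull fails: the error appears late.
	hit := importCache(func() (*lazyRef, error) {
		return &lazyRef{pull: func() error { return errNetwork }}, nil
	})
	fmt.Println("deferred pull error:", hit.unlazy())
}
```

In this sketch the second failure has nowhere to go but up the call stack, which mirrors why a late transfer error ends up failing the build.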
In Dagger CI we got an error when some GHA cache was being imported for an ExecOp:
The error is just a problem with GHA, but BuildKit failed the solve as a result of it. Ideally, BuildKit should treat this as a cache miss rather than a fatal error.
AFAIK that's supposed to be the behavior, but here I suspect the problem is that the cache hit itself succeeded and the error only happened once the remote blob was being unlazied. If so, this is a pretty tricky problem, as it would mean that the solver needs to "go back", treat the vertex that was previously a cache hit as a miss, and re-execute the vertex.