If an init container in a TaskRun fails, entire TaskRun should fail #682

Closed
bobcatfish opened this issue Mar 26, 2019 · 4 comments · Fixed by #687
Labels: help wanted, kind/bug

@bobcatfish
Collaborator

Expected Behavior

If an init container in the Pod created for a TaskRun fails, then the subsequent steps/containers will not be able to run. Therefore in this case, the entire TaskRun should be marked as failed.

Actual Behavior

If an init container fails, the TaskRun status makes it look like the TaskRun is still executing (i.e. the status of the init containers is not taken into account when determining whether the TaskRun has failed). For example, this is the status of a TaskRun I created where the init container that tried to check out a GCS storage resource failed:

  status:
    conditions:
    - lastTransitionTime: "2019-03-26T22:49:55Z"
      reason: Building
      status: Unknown
      type: Succeeded
    podName: publish-run-pod-55cc64
    startTime: "2019-03-26T22:49:21Z"
    steps:
    - running:
        startedAt: "2019-03-26T22:49:39Z"
    - running:
        startedAt: "2019-03-26T22:49:40Z"
    - terminated:
        containerID: docker://0b40cffc7651cf781b8e4c0242377e3ca2499f9a0a0c572ced025b3f562ec643
        exitCode: 0
        finishedAt: "2019-03-26T22:49:27Z"
        reason: Completed
        startedAt: "2019-03-26T22:49:27Z"
    - running:
        startedAt: "2019-03-26T22:49:39Z"
    - running:
        startedAt: "2019-03-26T22:49:39Z"
    - terminated:
        containerID: docker://33fb0da5e4dadf73c2f9c8e4d62518d960ac71ce159f26286b1558766191302d
        exitCode: 1
        finishedAt: "2019-03-26T22:49:40Z"
        reason: Error
        startedAt: "2019-03-26T22:49:37Z"
    - running:
        startedAt: "2019-03-26T22:49:39Z"
    - running:
        startedAt: "2019-03-26T22:49:40Z"
    - running:
        startedAt: "2019-03-26T22:49:40Z"
    - running:
        startedAt: "2019-03-26T22:49:40Z"
    - terminated:
        containerID: docker://8deba837738b7303625824c64aaed620025ac58aeba44337a64be535ffe4d74a
        exitCode: 0
        finishedAt: "2019-03-26T22:49:41Z"
        reason: Completed
        startedAt: "2019-03-26T22:49:41Z"

The condition is:

    conditions:
    - lastTransitionTime: "2019-03-26T22:49:55Z"
      reason: Building
      status: Unknown
      type: Succeeded

Yet the init container failed:

    - terminated:
        containerID: docker://33fb0da5e4dadf73c2f9c8e4d62518d960ac71ce159f26286b1558766191302d
        exitCode: 1
        finishedAt: "2019-03-26T22:49:40Z"
        reason: Error
        startedAt: "2019-03-26T22:49:37Z"
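
The failure is visible in the status itself. As a minimal sketch (Python, for illustration only; this is not Tekton's Go code), the hypothetical helper `failed_steps` scans a `steps` list shaped like the one above and returns the entries that terminated with a non-zero exit code:

```python
def failed_steps(steps):
    """Return the terminated states whose exit code is non-zero."""
    failed = []
    for state in steps:
        terminated = state.get("terminated")
        if terminated is not None and terminated.get("exitCode", 0) != 0:
            failed.append(terminated)
    return failed


# Abbreviated copy of the status above: two running steps, one completed
# container, and the container that exited with code 1.
steps = [
    {"running": {"startedAt": "2019-03-26T22:49:39Z"}},
    {"terminated": {"exitCode": 0, "reason": "Completed"}},
    {"terminated": {"exitCode": 1, "reason": "Error"}},
    {"running": {"startedAt": "2019-03-26T22:49:40Z"}},
]

for t in failed_steps(steps):
    print(t["reason"], t["exitCode"])
```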

(Note I ran into this because I tried to run our release Task and ran into #646)

Steps to Reproduce the Problem

  1. Create a Task that takes a PipelineResource, e.g. a GCS storage resource
  2. Create a TaskRun that points at a PipelineResource which will produce an error, e.g. reproduce #646 ("Fetching empty GCS storage bucket of type dir fails") by providing an empty bucket but using the dir type

Additional Info

When we switched away from using init containers for steps, the status was briefly completely broken. In #634 we updated this logic to look at container statuses instead of init containers - but it looks like we need to look at both (and our integration tests need to cover this).
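
A rough sketch of what "looking at both" could mean, written in Python rather than Tekton's Go and operating on a plain dict shaped like a Kubernetes Pod status. `initContainerStatuses`, `containerStatuses`, and `state.terminated.exitCode` are Pod status fields; the function name `succeeded_condition`, the message wording, and the container names in the example are made up for illustration:

```python
def succeeded_condition(pod_status):
    """Derive a Succeeded condition from a Pod status dict.

    Hypothetical helper: init containers are checked first, then the
    regular (step) containers, so a failure in either marks the run failed.
    """
    all_statuses = (pod_status.get("initContainerStatuses") or []) + (
        pod_status.get("containerStatuses") or []
    )
    for cs in all_statuses:
        terminated = cs.get("state", {}).get("terminated")
        if terminated and terminated.get("exitCode", 0) != 0:
            return {
                "type": "Succeeded",
                "status": "False",
                "message": 'container "%s" exited with code %d'
                % (cs["name"], terminated["exitCode"]),
            }
    # No failure seen. A fuller version would also report success once
    # every container has terminated with exit code 0.
    return {"type": "Succeeded", "status": "Unknown", "reason": "Building"}


# Example: the init container failed while a step container is still running.
pod_status = {
    "initContainerStatuses": [
        {"name": "fetch-gcs-resource", "state": {"terminated": {"exitCode": 1}}},
    ],
    "containerStatuses": [
        {"name": "build-step-publish", "state": {"running": {}}},
    ],
}
print(succeeded_condition(pod_status))
```

Checking only `containerStatuses` on this example would miss the failure, which matches the behaviour reported above.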

@bobcatfish added the kind/bug and help wanted labels, and added then removed the good first issue label, on Mar 26, 2019
@bobcatfish
Collaborator Author

Looking at this a bit more, the problem might be deeper than I thought: I'm seeing the same behaviour even when it's a step that fails, i.e. the status stays at:

    conditions:
    - lastTransitionTime: "2019-03-26T22:49:55Z"
      reason: Building
      status: Unknown
      type: Succeeded

🤔

@vdemeester
Member

/assign

@vdemeester
Member

Reproduced by the following TaskRun. I think the entrypoint waits forever when one step fails; looking into fixing that.

apiVersion: tekton.dev/v1alpha1
kind: TaskRun
metadata:
  name: build-simple
spec:
  taskSpec:
    steps:
    - name: echo
      image: docker.io/library/busybox
      command:
      - /bin/sh
      args:
      - -c
      - "echo echo"
    - name: exit
      image: docker.io/library/busybox
      command:
      - /bin/sh
      args:
      - -c
      - "exit 1"
    - name: echo-again
      image: docker.io/library/busybox
      command:
      - /bin/sh
      args:
      - -c
      - "echo again"
  trigger:
    type: manual
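
One way such an indefinite wait could be avoided, sketched in Python under stated assumptions: each step waits for a marker file from the previous step, and a failing step writes a sibling `.err` marker so later steps stop waiting. The file names, the `.err` suffix, and the helpers `wait_for_previous_step` and `finish_step` are hypothetical and are not taken from the actual fix in #687.

```python
import os
import tempfile
import time


class PreviousStepFailed(Exception):
    """Raised when the step we are waiting on reported a failure."""


def wait_for_previous_step(wait_file, poll_interval=0.01, timeout=5.0):
    """Block until wait_file exists; fail fast if wait_file + '.err' appears."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if os.path.exists(wait_file + ".err"):
            raise PreviousStepFailed(wait_file)
        if os.path.exists(wait_file):
            return
        time.sleep(poll_interval)
    raise TimeoutError("gave up waiting for %s" % wait_file)


def finish_step(post_file, exit_code):
    """Signal the next step: plain marker on success, '.err' marker on failure."""
    path = post_file if exit_code == 0 else post_file + ".err"
    with open(path, "w"):
        pass


# Simulate the second step ("exit 1") failing while the third step waits on it.
workdir = tempfile.mkdtemp()
marker = os.path.join(workdir, "step-1")
finish_step(marker, exit_code=1)
try:
    wait_for_previous_step(marker)
    outcome = "ran"
except PreviousStepFailed:
    outcome = "skipped"
print(outcome)
```

Without the `.err` check, the waiting step would only ever see a missing marker file and keep polling, which is consistent with the status staying at `Unknown`.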

@bobcatfish
Collaborator Author

Awwwwwesome @vdemeester, can confirm that the problem I was having yesterday isn't happening anymore today!!

    - lastTransitionTime: "2019-03-27T18:22:02Z"
      message: 'build step "build-step-build-push-base-images" exited with code 1
        (image: "docker-pullable://gcr.io/kaniko-project/executor@sha256:d9fe474f80b73808dc12b54f45f5fc90f7856d9fc699d4a5e79d968a1aef1a72");
        for logs run: kubectl -n default logs publish-run-pod-200eff -c build-step-build-push-base-images'
      status: "False"
      type: Succeeded

Thanks for the fix - so fast!!!! 🏎️

[GIF: a cat in front of a laptop, typing very quickly]
