
Add support for storage resource #321

Merged: 5 commits merged into master from gcs-support on Dec 19, 2018

Conversation

@shashwathi (Contributor) commented Dec 9, 2018

  • Add support for a generic blob storage resource.
  • Add an implementation for GCS for now; the resource declaration includes the bucket location (URL) and type params, plus secret params to configure credentials for private buckets (see the sketch below).
  • If the storage resource is added as an input resource, build steps are updated to pull the blob from storage. For private buckets, resource params are set as environment variables.
  • If storage is added as an output resource, a build step is added to upload the blob to the bucket after the build steps run.
  • Add documentation on how to declare and use the resource.
  • Refactor the design of output resources.
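
For readers skimming the PR, a minimal sketch of what such a declaration could look like, assuming the `pipeline.knative.dev/v1alpha1` API group in use at the time; the resource name and bucket URL are illustrative:

```yaml
apiVersion: pipeline.knative.dev/v1alpha1
kind: PipelineResource
metadata:
  name: my-gcs-blob                  # illustrative name
spec:
  type: storage
  params:
    - name: type
      value: gcs                     # only gcs is implemented in this PR
    - name: location
      value: gs://some-bucket        # bucket URL
  secrets:                           # only needed for private buckets
    - fieldName: GOOGLE_APPLICATION_CREDENTIALS
      secretKey: bucket-sa
      secretName: service_account.json
```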

Co-authored-by: Maria Ntalla [email protected]

Fixes #124

cc @bobcatfish @pivotal-nader-ziada

@googlebot

So there's good news and bad news.

👍 The good news is that everyone that needs to sign a CLA (the pull request submitter and all commit authors) have done so. Everything is all good there.

😕 The bad news is that it appears that one or more commits were authored or co-authored by someone other than the pull request submitter. We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that here in the pull request.

Note to project maintainer: This is a terminal state, meaning the cla/google commit status will not change from this state. It's up to you to confirm consent of all the commit author(s), set the cla label to yes (if enabled on your project), and then merge this pull request when appropriate.

@knative-prow-robot knative-prow-robot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Dec 9, 2018
@knative-prow-robot knative-prow-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 9, 2018
@googlebot

CLAs look good, thanks!

docs/using.md Outdated
```yaml
secretName: service_account.json
```

Note: `GOOGLE_APPLICATION_CREDENTIALS` is the environment variable to be set for the container to download/upload GCS resource.
Member

is this field in the resource? or a different env variable the user has to set?

Contributor Author

is this field in the resource?

No. If the storage resource is a private bucket then the secret params expect this env variable to be set. I will update the docs to make this clearer. I did not want to hard-code this env variable in the resource definition because in the future we could switch to passing a token instead of a service account with minimal code change.
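
For context, a rough sketch (not verbatim from this PR) of how such a secret param could surface on the generated step container; the `/var/secret/...` mount layout and volume name are assumptions:

```yaml
env:
  - name: GOOGLE_APPLICATION_CREDENTIALS     # FieldName from the secret param
    value: /var/secret/secretName/secretKey  # assumed mount layout
volumeMounts:
  - name: volume-secretName                  # assumed volume name
    mountPath: /var/secret/secretName
```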

Secrets: []SecretParam{{
	SecretName: "secretName",
	SecretKey:  "secretKey",
	FieldName:  "GOOGLE_APPLICATION_CREDENTIALS",
}},
Member

I think this should be a field in the resource rather than relying on the env variable

Contributor Author (Dec 10, 2018)

If the bucket is public then the user doesn't have to set it. We would then need some boolean flag to decide whether we want this variable or not, and the interface would become complex for no reason.

Member

so now we need the user to set an env variable if the bucket is private?

@@ -65,12 +73,13 @@ func AddInputResource(
return nil, fmt.Errorf("task %q failed to Get Pipeline Resource: %q", taskName, boundResource)
}

// git resource sets source in build so execute this always.
Member

we don't support the git repo moving from one task to the next?

Contributor Author

We do. The implementation gets unnecessarily complex with git being inherently supported by build as a source and the rest not. Let me try to simplify this a bit more.


for _, param := range r.Spec.Params {
	switch {
	case strings.EqualFold(param.Name, "type"):


in the two scenarios I know about (gs and s3) the type can be derived from the protocol segment of the "location". Do we need type in those cases?

Contributor Author

`type` here refers to "gcs" and not the type of the object being downloaded. Clients are configured to deduce the format of the object from the URL.


the prefix of the location, s3://my_bucket/ or gs://my_bucket, isn't actually the object type but the protocol specifier, which is usually specific to the underlying API. I suppose there may be some storage services that don't have a protocol as part of the location (azure?) so it may be relevant in such cases.
In any case, I don't think it's a big deal.

@shashwathi (Contributor Author)

@pivotal-nader-ziada : I have addressed all your comments. Ready for another round of review.

@bobcatfish (Collaborator) left a comment

Generally I am very excited about supporting a new resource for generic binary data!! I have a few questions before I get too far into the review :D


If resources are only declared in output then a copy of resource to be uploaded
or shared for next task is expected to be present under `/output/resource_name/`. If resource is declared in both input and output then resource destination path is used instead of `/output/resource_name`.

Collaborator

Woooot thanks for adding these docs! 😍

I'm feeling like the "Container Contract" section is getting big enough that we can maybe start breaking it up into subsections, maybe a section on outputs?

docs/using.md Outdated
### Storage Resource

Storage resource represents a blob storage, that contains either an object or directory.
Adding the storage resource as an input to a task will download the blob and allow the task to perform the required actions on the contents of the blob. Blob storage type "Google Cloud Storage" is supported as of now.
Collaborator

Maybe something like:

" Blob storage type "Google Cloud Storage" (gcs)"

Since gcs is the type they need to actually use

docs/using.md Outdated
Params that can be added are the following:

1. `location`: represents the location of the blobstorage.
1. `type`: represents the type of blob storage. Currently there is implementation
Collaborator

should probably be gcs

docs/using.md Outdated
1. `location`: represents the location of the blobstorage.
1. `type`: represents the type of blob storage. Currently there is implementation
for only GCS.
1. `dir`: represents whether the blob storage is a director or not.
Collaborator

typo, directory?

Collaborator

what does it mean if it isn't a directory?

Collaborator

or is this a path to a directory?

Contributor Author

I will add more description to explain why we are using this flag

docs/using.md Outdated
1. `dir`: represents whether the blob storage is a director or not.

To use GCS private buckets, [service accounts](https://cloud.google.com/compute/docs/access/service-accounts
) could be configured to access GCS bucket for `read` and `write` authorities. Download the service account keys to create a secret.
Collaborator

i think there's an extra newline here that's interfering with service accounts being a link?

Collaborator

I'm not sure what "read and write authorities" means - is this a GCS term? Or is this "or read and write permissions"?

Contributor Author

Permissions. I will correct it

docs/using.md Outdated
secrets:
- fieldName: GOOGLE_APPLICATION_CREDENTIALS
secretKey: bucket-sa
secretName: service_account.json
Collaborator

nice!!!

var kubeconfigWriterImage = flag.String("kubeconfig-writer-image", "override-with-kubeconfig-writer:latest", "The container image containing our kubeconfig writer binary.")
var (
kubeconfigWriterImage = flag.String("kubeconfig-writer-image", "override-with-kubeconfig-writer:latest", "The container image containing our kubeconfig writer binary.")
bashNoopImage = flag.String("bash-noop-image", "override-with-bash-noop:latest", "The container image containing bash shell")
Collaborator

before I get too far into this PR - could you explain a bit more (maybe even in some comments in the main.gos?) about what these binaries are for? it's a bit hard to tell at a glance

do we need a generic bash image? alternatively we could run any other container that has bash in it, maybe set the entrypoint to bash?

Contributor Author (Dec 11, 2018)

do we need a generic bash image?

I do not think so, but I remember that in build there was an image build-base which was custom to what the build base image required (I think). I am not entirely sure why we need a new custom image in this case, because .ko.yaml lets you override it perfectly if the user wants a different base image. I added a comment in the .ko.yaml file about the expectation of the image.
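
For reference, a hedged sketch of the kind of override `.ko.yaml` supports; the base images match the ones mentioned later in this thread, while the exact import paths are assumptions based on the repo layout:

```yaml
# .ko.yaml (sketch): ko builds these binaries and layers them onto the
# listed base images; users can swap the bases, e.g. for license compliance.
baseImageOverrides:
  github.com/knative/build-pipeline/cmd/bash: busybox
  github.com/knative/build-pipeline/cmd/gsutil: google/cloud-sdk:alpine
```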

@bkuschel

It should be noted that the implementation for S3 is almost exactly the same except for the image and a very small difference in the command line arguments. Basically the initialize and the two Container Spec functions will be different.
I may follow this PR with an S3 implementation that may move some of the common functions to storage_resource.go.

@shashwathi (Contributor Author)

/test pull-knative-build-pipeline-integration-tests
/test pull-knative-build-pipeline-go-coverage

cmd/bash/main.go Outdated

func main() {
	flag.Parse()
	logger, _ := logging.NewLogger("", "bash_command")
Member

can we rename the bash_command to shell_command or something similar since you are using sh?

docs/using.md Outdated
1. `location`: represents the location of the blob storage.
1. `type`: represents the type of blob storage. Currently there is implementation for only `gcs`.
1. `dir`: represents whether the blob storage is a directory or not.
1. If artifact is a directory then `-r`(recursive) flag is used to copy all files under source directory to gcs bucket. Eg: `gsutil cp -r source_dir gs://some-bucket`
Member

is there a default value for the dir field? since it's not in the example here, it's not clear what the value would be
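
For illustration, a hedged sketch of a storage resource with `dir` set; whether the flag is value-based or presence-based is an assumption here, not something this thread settles:

```yaml
params:
  - name: location
    value: gs://some-bucket/source_dir
  - name: type
    value: gcs
  - name: dir
    value: "true"   # assumed: marks the blob as a directory, triggering gsutil cp -r
```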

- fieldName: GOOGLE_APPLICATION_CREDENTIALS
secretKey: bucket-sa
secretName: service_account.json
```
Member

The use of fieldName here is confusing because this is not actually a field. I would prefer it to be a field on the gcs resource; it doesn't have to be part of the generic interface.

// to the desired destination directory
var copyStepsFromPrevTasks []corev1.Container

for i, path := range boundResource.Paths {
Member

The name Paths does not make it clear at a glance that it means you have to copy from a previous task, maybe something like

  • passedPath
  • fromOtherTaskPath
  • onVolumePath
  • cachedPath
  • fetchedPath

Contributor Author

Yeah, I agree the field name is not intuitive and it should be updated, but refactoring this would add more changes to this PR, which is already growing too much in scope. I want to follow up with another PR for that change.

Member

yeah, makes sense since this was already there before this PR


// AddOutputResources reads the output resources and adds the corresponding container steps
// This function also reads the inputs to check if resources are redeclared in inputs and has any custom
// target directory.
Member

why can't we require that the output path will always be in output/resource_name even if it's different in the input? This way it will at least be consistent and easier for users to know what to expect. It might require copying it to the output folder if it's an output.

Contributor Author

You mean if the user declares output resources then expect the resource to be present in the output/resource_name folder?

Then the user has to add a step to copy the resource to the folder path /output/resource_name. I think it might be too much work. I agree it's more predictable, but I was trying to infer the path from already-given information, which means less yaml. I do not have a strong opinion about this as both methods have pros and cons.

Member

You mean if the user declares output resources then expect the resource to be present in the output/resource_name folder?

yes exactly

Then the user has to add a step to copy the resource to the folder path /output/resource_name. I think it might be too much work. I agree it's more predictable, but I was trying to infer the path from already-given information, which means less yaml. I do not have a strong opinion about this as both methods have pros and cons.

the user doesn't have to add a step; the code automatically does that

Contributor Author

If the resource is defined in input, then in code we can figure out where the resource is placed. I do not see value in copying from input resource destination -> /output/resource_name -> external bucket. The current implementation goes input resource destination -> external bucket.

  1. This would introduce an additional copy step.
  2. The code complexity is not solved, just moved to the input resource. Now the input resource has to check whether the resource is also defined in output and add an additional step to ensure the copy has happened before resolving outputs. I am not sure about this change. (A rough sketch of the appended upload step follows below.)
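
For orientation, a hedged sketch of the upload step the controller could append for a storage output; the step name and override image name follow the naming pattern seen elsewhere in this PR but are assumptions, not the PR's exact spec:

```yaml
steps:
  # ...user-defined steps run first...
  - name: upload-my-gcs-blob                   # appended by the controller
    image: override-with-gsutil-image:latest   # assumed override image name
    # conceptually runs: gsutil cp -r <input destination path> gs://some-bucket
```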

@vdemeester (Member) left a comment

Nice work 👍 , few comments 👼

cmd/bash/main.go Outdated
*/

/*
Base image expects shell in $PATH
Member

expects sh in …

@@ -88,6 +88,14 @@ func main() {
logger.Fatalf("Error building pipeline clientset: %v", err)
}

<<<<<<< HEAD
Member

conflicts 😓

Contributor Author

I just resolved them. My bad, I forgot to push local changes after resolving conflicts.

docs/using.md Outdated
steps:
- image: objectuser/run-java-jar #https://hub.docker.com/r/objectuser/run-java-jar/
command: [jar]
args: ["-cvf", "-o", "/workspace/outputs/storage-gcs/", "projectname.war", "*"]
Member

the path should be /workspace/output/storage-gcs

Contributor Author

ah. Good catch

docs/using.md Outdated
Params that can be added are the following:

1. `location`: represents the location of the blob storage.
1. `type`: represents the type of blob storage. Currently there is implementation for only `gcs`.
Member

2. ?

docs/using.md Outdated

1. `location`: represents the location of the blob storage.
1. `type`: represents the type of blob storage. Currently there is implementation for only `gcs`.
1. `dir`: represents whether the blob storage is a directory or not. By default storage artifact is considered not a directory.
Member

3. ?

docs/using.md Outdated
1. `type`: represents the type of blob storage. Currently there is implementation for only `gcs`.
1. `dir`: represents whether the blob storage is a directory or not. By default storage artifact is considered not a directory.
1. If artifact is a directory then `-r`(recursive) flag is used to copy all files under source directory to gcs bucket. Eg: `gsutil cp -r source_dir gs://some-bucket`
1. If is a single file like zip, tar files then copy will be only 1 level deep. Eg: `gsutil cp source.tar gs://some-bucket.tar`. This will not trigger copy of sub directories in source directory.
Member

2. ? (and same below, number ain't right 🙃 )

@shashwathi shashwathi force-pushed the gcs-support branch 2 times, most recently from 13029ef to f7b74f7 Compare December 13, 2018 14:34
@shashwathi (Contributor Author)

@vdemeester I have addressed your comments. Let me know 👍

- Add implementation for only gcs for now, resource declaration
includes bucket location (url), type params to configure credentials for the
bucket.
- If storage resource is added as input resource then updated build steps to pull
the blob storage. Params of resources are set as environment variables
for private buckets.
- If storage is added as output resource then step is added to
update the blob to the bucket after execution of build steps.
- Add documentation how to use gcs resource
- Refactor design for output
- Add e2e test with a gcs resource that has a secret param set. Test asserts
that volume is mounted for secrets, lists the fetched file.
- This will allow users to provide another base image if they do not
want to use "busybox" for open source compliance reasons.
- Simplify logic in taskrun to copy from previous task output or not.
current implementation: taskrun input resource checks whether there is
dependency path set, then it executes copy from pvc instead of fetching
fresh copy
- update gcs resource to do 1 level copy if `--dir` param is not set
else `-r`(recursive) copy flag is set.
- update messed up indentation in docs
- add docs for fetching objects from private buckets
- add .ko.yaml file to override base image with bash shell
@shashwathi (Contributor Author)

/test pull-knative-build-pipeline-integration-tests

@vdemeester (Member) left a comment

sgtm 👼

@knative-prow-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: shashwathi, vdemeester

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@bobcatfish (Collaborator) left a comment

This is looking great @shashwathi ! 😍

I've swung in with a few last-minute comments, main things:

  • I still don't understand why we need to be building and releasing the bash image - but this might just be me not reading carefully or some missing docs!
  • I think we need to add the new image to our release scripts? (or maybe just update this comment if this happens automatically...)
  • Can you start a doc that explains the internals of how this all works? i.e. something that talks about how we pass outputs + inputs - not directed at users of build-pipeline, but instead at maintainers (no worries if you want to do this in a separate PR, this one has been open a while! also feel free to put in the bare minimum, I just want to get something started that we can build on as this gets more complex, and we've got the init container changes coming too...)

I'm happy to LGTM this asap so we can get it merged, don't want to leave it open for too long :D

p.s. I am especially stoked about the awesome user docs in this PR 💃


**Note**: If task is relying on output resource functionality then they cannot mount anything in file path `/workspace/output`.

If resource is declared in both input and output then input resource, then destination path of input resource is used instead of `/workspace/output/resource_name`.
Collaborator

when skimming i thought this was saying the path would still have output - could we include an example of the actual path, or link to the section in the docs on it?

value: "world"
```

**Note**: If task is relying on output resource functionality then they cannot mount anything in file path `/workspace/output`.
Collaborator

why do we change the path when a resource is only an output? i'm wondering why we decided not to use the same path in both cases?

Contributor Author

We could use the same path in both cases; @pivotal-nader-ziada had a similar suggestion too. If we take that route then it becomes more of a burden for the operator to have a copy of the resource present in the path /workspace/output. If the resource is already declared in input then we know where the resource is already placed, so we are re-using that information.

Collaborator

If we take that route then it becomes more of a burden for the operator to have a copy of the resource present in the path /workspace/output.

Is the idea that inputs are always in /workspace/ and outputs are always in /workspace/outputs?

An alternative could be to have unique names for all inputs + outputs, and put all of them in /workspace/

Contributor Author

True. We could do this.

docs/using.md Outdated

If resource is declared in both input and output then input resource, then destination path of input resource is used instead of `/workspace/output/resource_name`.

In the following example Task `tar-artifact` resource is used both as input and output so input resource is downloaded into directory `customworkspace`(as specified in `targetPath`). Step `untar` extracts tar file into `tar-scratch-space` directory , `edit-tar` adds a new file and last step `tar-it-up` creates new tar file and places in `/workspace/customworkspace/` directory. After execution of task steps, (new) tar file in directory `customworkspace` will be uploaded to the bucket defined in `tar-artifact` resource definition.
Collaborator

can the docs show the complete path to customworkspace, i.e. is it /customworkspace or is it /workspace/customworkspace or /workspace/output/customworkspace?
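
To make the example above concrete, a hedged sketch of how such a Task's resource declarations might look; the apiVersion is assumed from the project's v1alpha1 API at the time and the field values are illustrative:

```yaml
apiVersion: pipeline.knative.dev/v1alpha1
kind: Task
metadata:
  name: tar-artifact-example
spec:
  inputs:
    resources:
      - name: tar-artifact
        type: storage
        targetPath: customworkspace   # downloaded under /workspace/customworkspace
  outputs:
    resources:
      - name: tar-artifact
        type: storage                 # uploaded from the input's destination after steps run
```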

2. `type`: represents the type of blob storage. Currently there is implementation for only `gcs`.
3. `dir`: represents whether the blob storage is a directory or not. By default storage artifact is considered not a directory.
* If artifact is a directory then `-r`(recursive) flag is used to copy all files under source directory to GCS bucket. Eg: `gsutil cp -r source_dir gs://some-bucket`
* If artifact is a single file like zip, tar files then copy will be only 1 level deep(no recursive). It will not trigger copy of sub directories in source directory. Eg: `gsutil cp source.tar gs://some-bucket.tar`.
Collaborator

👍

@@ -65,7 +65,6 @@ func ValidateResolvedTaskResources(params []v1alpha1.Param, rtr *resources.Resol
}
}
}
// TODO: make sure there aren't missing???
Collaborator

🤔

Contributor Author

Should I leave this in? There was no name or context. I was cleaning up all my TODOs so I might have accidentally done too much 😆

Collaborator

hahaha no, feel free to remove it. I'm not sure what it is either, though I suspect I wrote it 😅

Base image expects gsutil binary in $PATH
gsutil command uses `-m` flag(default) for faster download and uploads.
This file executes cp sub-command like `cp source destination`
*/
Collaborator

i think it would be helpful for all of our main.go's to have some info about how to use the resulting binaries, e.g. something like:

This tool is for _______

To use it, run ____
For help, run ______

For example:

______

Contributor Author

@bobcatfish: I have updated the comments to include the new style. Let me know if that's good.

}

// GetParams get params
func (s *GCSResource) GetParams() []Param { return []Param{} }
Collaborator

should this be returning Location and Dir?

Contributor Author

I looked at the git, image, and cluster resources as examples and all of them return empty params, but I can update this to return Location. Dir is like a runtime param; I am not sure it fits the case here.

Collaborator

kk!

"name": s.Name,
"type": string(s.Type),
"location": s.Location,
}
Collaborator

what about dir?

Contributor Author

Dir is decided by the controller, not the replacement template, so I did not add that key.

Collaborator

hm kk - it looks in the docs like ppl can specify dir explicitly:

dir: represents whether the blob storage is a directory or not. By default storage artifact is considered not a directory.

so i feel like it would make sense to provide it here? but i can't imagine why anyone would want to template with it so maybe it's fine!

}

// GetDownloadContainerSpec returns an array of container specs to download gcs storage object
func (s *GCSResource) GetDownloadContainerSpec() ([]corev1.Container, error) {
Collaborator

oooo i like this UploadContainer + DownloadContainer pattern! do you see us doing this for other resource types as well? (e.g. git)

Contributor Author

Yes. I did not want to do the git container spec in this PR, but it fits the case well. The s3 resource would be a great fit too.

Collaborator

kk sounds great!

@shashwathi (Contributor Author) commented Dec 18, 2018

I think we need to add the new image to our release scripts? (or maybe just update this comment if this happens automatically...)

There is nothing to add to the release script.

  • In the case of creds-init and git-init a separate base image is used, so the comment is about building the base image for those.
  • .ko.yaml declares busybox and google/cloud-sdk:alpine as base images (no custom image to be built), so this should just work (from my understanding)

I will update .ko.yaml.release to include these new images

- In .ko.yaml gsutil base could be replaced with any other image with
gsutil binaries for license compliance reasons
- tests are updated to verify override-gsutil-image is used instead of
"google/cloud-sdk" image directly.
- gsutil main.go file logs command output to diagnose failures/success
- Add documentation in main.go files on how they are being used in
pipeline crd
- Update documentation to include examples about configuring private GCS buckets
- Fix merge conflicts
@shashwathi (Contributor Author)

/test pull-knative-build-pipeline-integration-tests

Helm deploy is flaky. Retesting, as the new changes were doc-related.

@shashwathi (Contributor Author)

@bobcatfish: I have added developer docs. At this point I am dumping my thoughts about the inputs/outputs contract. It is obviously not well framed. We can iterate on the docs in further PRs if you don't mind.

@bobcatfish (Collaborator)

Helm deploy is flaky

/me shakes fist

@bobcatfish (Collaborator)

At this point I am dumping my thoughts about the inputs/outputs contract. It is obviously not well framed. We can iterate on the docs in further PRs if you don't mind.

That's great!! Thanks so much for getting these started :D :D :D

@@ -0,0 +1,54 @@
## Developer docs
Collaborator

my very last request is that we have a link to these docs somewhere - maybe from the DEVELOPMENT.md and/or from the main README 🙏

i promise to stop the scope creep after this ❤️

@bobcatfish (Collaborator)

I added another question about the bash image and another doc request 😇 🙏 😇 thanks for your patience @shashwathi !!

- Add docs explaining how different parts work in conjunction. In this initial attempt I am adding the inputs/outputs contract in the controllers.
- Link developer docs in README
- Add comment to main.go file about the file to update for base image override.
@knative-metrics-robot

The following is the coverage report on pkg/.
Say /test pull-knative-build-pipeline-go-coverage to re-run this coverage report

| File | Old Coverage | New Coverage | Delta |
|---|---|---|---|
| pkg/apis/pipeline/v1alpha1/gcs_resource.go | Do not exist | 86.7% | |
| pkg/apis/pipeline/v1alpha1/pipelineresource_validation.go | 68.2% | 80.6% | 12.4 |
| pkg/apis/pipeline/v1alpha1/storage_resource.go | Do not exist | 100.0% | |
| pkg/apis/pipeline/v1alpha1/taskrun_types.go | 55.6% | 63.6% | 8.1 |
| pkg/reconciler/v1alpha1/taskrun/resources/input_resources.go | 87.9% | 91.5% | 3.6 |
| pkg/reconciler/v1alpha1/taskrun/resources/output_resource.go | Do not exist | 93.1% | |
| pkg/reconciler/v1alpha1/taskrun/resources/volume.go | Do not exist | 100.0% | |

@@ -36,6 +36,7 @@ We are so excited to have you!

- See [CONTRIBUTING.md](CONTRIBUTING.md) for an overview of our processes
- See [DEVELOPMENT.md](DEVELOPMENT.md) for how to get started
- [Deep dive](./docs/developers/README.md) into demystifying the inner workings (advanced reading material)
Collaborator

😍

@bobcatfish (Collaborator) left a comment

/lgtm
/meow space

@knative-prow-robot

@bobcatfish: cat image

In response to this:

/lgtm
/meow space

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@knative-prow-robot knative-prow-robot added the lgtm Indicates that a PR is ready to be merged. label Dec 19, 2018
@knative-prow-robot knative-prow-robot merged commit 9d7e6d3 into tektoncd:master Dec 19, 2018
solsson added a commit to solsson/knative-training that referenced this pull request Dec 19, 2018
because it adds two more containers of which the latter, source-copy, fails with
{"level":"error","ts":1545214117.0201285,"logger":"fallback-logger","caller":"bash/main.go:62","msg":"Error executing command \"cp -r /workspace/output/builtImage/. /pvc/export-npmlib/builtImage\" with arguments cp: can't stat '/workspace/output/builtImage/.': No such file or directory\n","stacktrace":"main.main\n\t/go/src/github.com/knative/build-pipeline/cmd/bash/main.go:62\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:201"}
and I'd rather not figure out why just now.