feat: run multidim-interop.yml on self-hosted runners #154
Conversation
Force-pushed from a2cd76d to 0360dee
env:
  AWS_BUCKET: ${{ inputs.s3-cache-bucket }}
  AWS_REGION: ${{ inputs.aws-region }}
  AWS_ACCESS_KEY_ID: ${{ inputs.s3-access-key-id }}
  AWS_SECRET_ACCESS_KEY: ${{ inputs.s3-secret-access-key }}
This way, we modify the env only for this step.
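For context, a minimal sketch of what step-scoped env means in a reusable workflow (file layout and step names below are placeholders, not the actual workflow): the credentials are visible to the cache-setup step and to nothing else, whereas a job-level env block would expose them to every step.

# Sketch only, assuming the workflow is invoked via workflow_call:
name: multidim-interop (sketch)
on:
  workflow_call:
    inputs:
      s3-cache-bucket:
        type: string
        required: false

jobs:
  interop:
    runs-on: self-hosted
    steps:
      - name: Set up S3 cache
        env:
          AWS_BUCKET: ${{ inputs.s3-cache-bucket }}   # step-scoped
        run: echo "configuring cache for $AWS_BUCKET"
      - name: Run tests
        run: echo "AWS_BUCKET is not set here ($AWS_BUCKET)"   # empty, because the env was scoped to the previous step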
with:
  config: ${{ steps.buildkit.outputs.config }}
On standard, hosted GitHub Actions runners this will be an empty string, which is the default.
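To illustrate why (a sketch under the assumption that the output feeds docker/setup-buildx-action, which is how the names read): if the step that generates the buildkitd config only runs on the self-hosted path, its output evaluates to an empty string everywhere else, and an empty config input is simply the action's default behaviour.

# Illustrative sketch, not the exact change in this PR:
- name: Generate buildkitd config for the S3 cache
  id: buildkit
  if: inputs.s3-cache-bucket != ''   # skipped on hosted runners
  run: |
    cat > /tmp/buildkitd.toml <<'EOF'
    # buildkitd settings for the shared cache would go here
    EOF
    echo "config=/tmp/buildkitd.toml" >> "$GITHUB_OUTPUT"

- name: Set up Docker Buildx
  uses: docker/setup-buildx-action@v2
  with:
    # empty string when the step above was skipped, which is the action's default
    config: ${{ steps.buildkit.outputs.config }}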
Half an hour is still pretty slow. This job usually finishes in 15 min on PRs for us? Is that because we filter? Will this mean we will also be able to have a cache for the PR docker builds that use
m5.large has 2 vCPUs and 8 GB of RAM. Isn't that roughly the same as the hosted runners? https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners#supported-runners-and-hardware-resources
I moved you to c5.4xlarge, but ultimately you can pick any instance type that suits your workload characteristics best - https://instances.vantage.sh/. I requested the job that runs on self-hosted (the one with cache) to use 16 workers here - 255238e. I also had to request the build part to remain sequential - 0626f93. The job - https://github.com/libp2p/test-plans/actions/runs/4465794780/jobs/7843205397?pr=154 - took 20 minutes to finish. It spent ~7 minutes building and then ~13 minutes running the tests.
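Roughly, the parallelism bump could look like the snippet below; the worker-count input and the WORKER_COUNT variable are illustrative names for the knob, not necessarily what 255238e actually changes.

# Hypothetical sketch: raise test parallelism only where the runner has the cores for it.
- name: Run multidim interop tests
  env:
    WORKER_COUNT: ${{ inputs.worker-count || 2 }}   # e.g. 16 on c5.4xlarge, 2 on hosted runners
  run: npm run test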
That is an amazing improvement! With the filtering that we do in the pull-request workflows, it is likely to be even quicker!
Is anything blocking this from merging? It'd be great to improve the runtime of this job.
defer until after ipfs thing
Thanks for the work here! I'd suggest we split this into two PRs so we can land parts of it quicker and discuss if we can perhaps avoid introducing the "input".
Force-pushed from f4b07dd to bc94d5f
rebased, fyi
26 minutes on this branch versus 54 minutes on master. @galargh What's the cost of running this? It's nice that it's twice as fast, but it's also currently free.
One open question around cost, but changes look good. Please request another review when the question has been answered.
@galargh Friendly ping on my open question :)
I totally missed the question about cost! Let me do a quick calculation for you. The cost of moving forward with this should be around $38/month (see the AWS resource calculation - https://calculator.aws/#/estimate?id=9a5a201262a70d9f491015a5aa81a4aeaf1ae966). I based this on:
With the recently added implementations, the interop tests are now by far our longest CI job at >20 minutes: https://github.com/libp2p/rust-libp2p/actions/runs/5387398505/jobs/9778664726. What is the process for deciding whether the spend is worth it?
Co-authored-by: Piotr Galar <[email protected]>
Will merge after CI passes. Thank you so much @galargh! ❤️
This reverts commit 7e868ea.
This PR revives the proposal from #92 (c53be4c).
Please note that the first build on self-hosted runners had to populate the cache: https://github.com/libp2p/test-plans/actions/runs/4447196185/attempts/1. This suggests that some part of the machine config influences the cache keys. It might need some tweaks to make it interoperable with hosted GitHub Actions runners, or it might be a good idea to have another job that populates the cache for hosted GitHub Actions runners.
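For illustration only (a sketch, not the cache setup this PR uses): any key component derived from the machine, such as runner.os, runner.arch, or tool versions baked into the AMI, will split hosted and self-hosted runners into separate cache entries, which is why a dedicated job that populates an entry for hosted runners could be one way around the miss.

# Sketch only: a key like this misses across runner types whenever
# runner.os / runner.arch (or the hashed files) differ between machines.
- uses: actions/cache@v3
  with:
    path: ~/.cache/multidim-interop
    key: interop-${{ runner.os }}-${{ runner.arch }}-${{ hashFiles('**/package-lock.json') }}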
The subsequent runs were using cache as expected:
I did not try to increase the worker count or modify the docker run parameters to accommodate the more powerful environment.
This is the self-hosted runner definition: https://github.com/pl-strflt/tf-aws-gh-runner/blob/main/runners.tf#L28
This is the packer config the AMI was built from: https://github.com/pl-strflt/tf-aws-gh-runner/blob/main/images/ubuntu-jammy/kubo.pkrvars.hcl
We can, of course, modify the runners however you wish.