Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update bash scripting, remove jq requirement, add nvme mounting #648

Merged
merged 5 commits into from
Mar 25, 2024

Conversation

jlester-msft
Copy link
Contributor

@jlester-msft jlester-msft commented Mar 15, 2024

fixes #616
fixes #454

This code change makes 3 main changes to TesApi.Web/BatchScheduler.cs:

  1. Changes the definition of the bash script to use more C# @strings. Which makes it much easier to read and work with. Note that set -e makes it so any command which fails during bash script execution will trigger an exit. This means some commands like the grep in nvme_devices=$(nvme list -o json | grep -oP ""\""DevicePath\"" : \""\K[^\""]+"" || true) needs a || true so that if no match is found the script doesn't end. However, most of the other commands don't require && chaining anymore.
  2. Removes the jq install requirement for ubuntu and uses a "here doc" style python3 replacement (see Azure Batch start-up script optimization #616)
  3. Looks for nvme devices and mounts them as a single RAID0 array onto ${AZ_BATCH_NODE_ROOT_DIR}/tasks/workitems (/mnt/batch/tasks/workitems). Software RAID seems performant and lvm and mdadm RAID0 devices seem to have similar performance. mdadm is used as it is the easier/more mature solution.

These changes have been tested on a few large example workflows. And on some L-series test workloads. Could require additional hardening but seems to be working well enough at the moment.

@BMurri + @MattMcL4475

@BMurri
Copy link
Collaborator

BMurri commented Mar 16, 2024

I recall this being discussed. I don't recall if an issue was opened. I'll go over this (and merge with main, since I'm the last to touch the start-task and I'm certain that's where the merge conflicts mainly are) on Monday.

@jlester-msft could you please respond here with the issue number? Thank you!

@jlester-msft
Copy link
Contributor Author

jlester-msft commented Mar 18, 2024

The L-series NVME issue is #454. That issue was asking for tes-runner to do this; however, we can have bash do this for now. I don't think we need to spend the dev time to code this into tes-runner right now.

@BMurri
Copy link
Collaborator

BMurri commented Mar 19, 2024

@jlester-msft This has been merged with main and will definitely be more maintainable. It does substantially alter the 1st main change mentioned in the description, though.

@BMurri BMurri removed their assignment Mar 19, 2024
@jlester-msft jlester-msft marked this pull request as ready for review March 25, 2024 20:21
@BMurri BMurri merged commit 22651c0 into main Mar 25, 2024
9 checks passed
@BMurri BMurri deleted the jlester/bash_startup_plus_nvme branch March 25, 2024 20:44
BMurri added a commit that referenced this pull request Mar 25, 2024
giventocode pushed a commit that referenced this pull request Mar 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants