Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

roachprod: fix VM creation and disk type detection in grow command #141835

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

shailendra-patel
Copy link
Contributor

@shailendra-patel shailendra-patel commented Feb 21, 2025

This commit fixes two bugs in the roachprod grow command:

  1. Fixed VM creation failure due to hostname collisions. Previously, we were using the same VM name multiple times across different zones, causing "RESOURCE_ALREADY_EXISTS" errors. The fix extracts the hostname allocation logic into a new helper function (computeHostNamesPerZone) that properly distributes unique hostnames across zones.

  2. Fixed incorrect disk type detection in instance group templates. We were checking d.InitializeParams.DiskType for "SCRATCH" instead of d.Type, which prevented proper identification of local SSD volumes.

The bug is likely to be introduced in #137956

Epic: none

Release note: None

@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link

github-actions bot commented Feb 21, 2025

🟢 Sysbench [SQL, 3node, oltp_read_write]
Metric Old Commit New Commit Delta Note Threshold
sec/op 9.683m ±3% 9.653m ±3% ~ p=0.912 n=10 3.0%
errs/op 0.000 ±0% 0.000 ±0% ~ p=1.000 n=10 0.0%
🟢 allocs/op 10.16k ±0% 10.15k ±0% -0.13% p=0.005 n=10 2.0%
🟢 B/op 2.183Mi ±0% 2.179Mi ±0% -0.20% p=0.023 n=10 2.0%
Reproduce

benchdiff binaries:

mkdir -p benchdiff/3393c69/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/3393c699d77ba04fa4eef871fea0a5b6179d0c2d/bin/pkg_sql_tests benchdiff/3393c69/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/3393c69/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
mkdir -p benchdiff/b0b0e69/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/b0b0e695f4623659574ebf45a5a140bb58673082/bin/pkg_sql_tests benchdiff/b0b0e69/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/b0b0e69/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests

benchdiff command:

benchdiff --run=^BenchmarkSysbench/SQL/3node/oltp_read_write$ --old=b0b0e69 --new=3393c69 ./pkg/sql/tests
⚪ Sysbench [KV, 1node, local, oltp_read_only]
Metric Old Commit New Commit Delta Note Threshold
sec/op 687.7µ ±1% 690.8µ ±1% ~ p=0.075 n=10 2.0%
errs/op 0.000 ±0% 0.000 ±0% ~ p=1.000 n=10 0.0%
allocs/op 439.0 ±0% 439.0 ±0% ~ p=1.000 n=10 1.5%
B/op 254.2Ki ±0% 254.2Ki ±0% ~ p=0.631 n=10 1.5%
Reproduce

benchdiff binaries:

mkdir -p benchdiff/3393c69/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/3393c699d77ba04fa4eef871fea0a5b6179d0c2d/bin/pkg_sql_tests benchdiff/3393c69/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/3393c69/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
mkdir -p benchdiff/b0b0e69/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/b0b0e695f4623659574ebf45a5a140bb58673082/bin/pkg_sql_tests benchdiff/b0b0e69/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/b0b0e69/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests

benchdiff command:

benchdiff --run=^BenchmarkSysbench/KV/1node_local/oltp_read_only$ --old=b0b0e69 --new=3393c69 ./pkg/sql/tests
🟢 Sysbench [KV, 1node, local, oltp_write_only]
Metric Old Commit New Commit Delta Note Threshold
sec/op 1.374m ±1% 1.376m ±1% ~ p=0.971 n=10 2.5%
errs/op 0.000 ±0% 0.000 ±0% ~ p=1.000 n=10 0.0%
allocs/op 1.397k ±0% 1.397k ±0% ~ p=0.745 n=10 1.8%
🟢 B/op 290.5Ki ±0% 289.8Ki ±0% -0.23% p=0.003 n=10 1.8%
Reproduce

benchdiff binaries:

mkdir -p benchdiff/3393c69/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/3393c699d77ba04fa4eef871fea0a5b6179d0c2d/bin/pkg_sql_tests benchdiff/3393c69/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/3393c69/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
mkdir -p benchdiff/b0b0e69/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/b0b0e695f4623659574ebf45a5a140bb58673082/bin/pkg_sql_tests benchdiff/b0b0e69/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/b0b0e69/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests

benchdiff command:

benchdiff --run=^BenchmarkSysbench/KV/1node_local/oltp_write_only$ --old=b0b0e69 --new=3393c69 ./pkg/sql/tests
Artifacts

download:

mkdir -p new
gcloud storage cp gs://cockroach-microbench-ci/artifacts/3393c699d77ba04fa4eef871fea0a5b6179d0c2d/13460265651-1/\* new/
mkdir -p old
gcloud storage cp gs://cockroach-microbench-ci/artifacts/b0b0e695f4623659574ebf45a5a140bb58673082/13460265651-1/\* old/
Legend
  • Neutral: No significant performance change.
  • 🟡 Warning: Slight degradation, likely due to variance, but still within thresholds.
  • 🔴 Regression: Likely performance regression, requiring investigation.
  • 🟢 Improvement: Possible performance gain.

No regressions detected!

built with commit: 3393c699d77ba04fa4eef871fea0a5b6179d0c2d

@shailendra-patel shailendra-patel force-pushed the fix-grow-cmd branch 2 times, most recently from 29afb2e to 3c52b2e Compare February 21, 2025 15:43
This commit fixes two bugs in the roachprod grow command:

1. Fixed VM creation failure due to hostname collisions. Previously, we were
  using the same VM name multiple times across different zones, causing
  "RESOURCE_ALREADY_EXISTS" errors. The fix extracts the hostname allocation
  logic into a new helper function (computeHostNamesPerZone) that properly
  distributes unique hostnames across zones.

2. Fixed incorrect disk type detection in instance group templates. We were
  checking d.InitializeParams.DiskType for "SCRATCH" instead of d.Type,
  which prevented proper identification of local SSD volumes.

The bug is likely to be introduced in cockroachdb#137956

Epic: none

Release note: None
@shailendra-patel shailendra-patel marked this pull request as ready for review February 21, 2025 15:46
@shailendra-patel shailendra-patel requested a review from a team as a code owner February 21, 2025 15:46
@shailendra-patel shailendra-patel requested review from srosenberg, golgeek and nameisbhaskar and removed request for a team February 21, 2025 15:46
@BabuSrithar
Copy link
Contributor

Please explore adding test cases.

Copy link
Contributor

@nameisbhaskar nameisbhaskar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants