improve update.sh to reduce manual steps #86

Open
wants to merge 6 commits into base: master

Conversation

0xdevalias
Contributor

While making #85 to solve #84 I noticed there were some manual steps required to use update.sh, and wanted to try and automate some of them.

Comment on lines +100 to +117
# This script appends a new version to versions.txt and ensures it remains sorted.
# Steps:
# 1. Append the new version to versions.txt.
# 2. Temporarily transform stable versions (those with no dash, e.g., "3.0.0")
# by appending "-zzzzzz". This ensures they sort *after* any lines containing
# suffixes like "-preview", "-rc", or "-pXYZ".
# 3. Sort the file uniquely (-u) with "-" as the field separator (-t-):
# -k1,1V sorts the main version (e.g., "3.0.0") as a version,
# -k2,2V sorts suffixes (e.g., "preview1", "rc2") as versions,
# so the transformed stable lines appear last in their version group.
# 4. Remove the temporary "-zzzzzz" suffix from stable versions.
# 5. Replace the original file with the sorted result.
echo "$version" >> "$ruby/versions.txt"
sed 's/^\([0-9][^-]*\)$/\1-zzzzzz/' "$ruby/versions.txt" \
| sort -u -t- -k1,1V -k2,2V \
| sed 's/-zzzzzz$//' \
> "$ruby/versions.txt.tmp"
mv "$ruby/versions.txt.tmp" "$ruby/versions.txt"
Contributor Author

The -zzzzzz hack here ensures that proper release versions sort after their -preview/etc. versions, which seemed to be the main pattern in the file.
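
To make the effect of the placeholder concrete, here is a minimal sketch that runs the same sed/sort/sed pipeline from the diff over sample input instead of versions.txt. The function name `sort_with_placeholder` and the sample versions are made up for illustration, and it assumes a sort that supports -V.

```shell
# Hypothetical wrapper around the pipeline from the diff; reads stdin, writes stdout.
sort_with_placeholder() {
  sed 's/^\([0-9][^-]*\)$/\1-zzzzzz/' |   # tag lines that have no "-" suffix
    sort -u -t- -k1,1V -k2,2V |           # version-sort base, then suffix
    sed 's/-zzzzzz$//'                    # drop the temporary tag again
}

# Sample input includes a duplicate 3.0.0 to show that -u collapses it.
sorted=$(printf '%s\n' 3.1.0 3.0.0 3.0.0-rc1 3.0.0-preview1 3.0.0 | sort_with_placeholder)
printf '%s\n' "$sorted"
```

With this input the plain 3.0.0 should land after 3.0.0-preview1 and 3.0.0-rc1 but before 3.1.0.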

Owner

This seems a little obtuse. Is there an easier way to sort versions by length or emulate GNU's sort -V, so that 1.2.3-preview1 comes before 1.2.3?

Contributor Author

There no doubt is, though this was the best I could come up with at the time while sticking to sort and keeping the same sort order that currently exists.

Ideally I would just use sort -V for a version sort, but that results in the plain 3.0.0 version being sorted before the -preview/etc. versions:

echo -e "3.0.0-preview\n3.0.0\n3.0.0-rc1" | sort -V
3.0.0
3.0.0-preview
3.0.0-rc1

Iterating with Claude in an attempt to simplify this came up with:

echo "$version" >> "$ruby/versions.txt"
awk '{ 
    # For versions without a suffix, assign a high-value suffix for sorting
    # but only for comparison purposes
    split($0, parts, "-")
    sortkey = (parts[2] == "") ? $0 "~" : $0
    print sortkey, $0
}' "$ruby/versions.txt" |
    sort -uV |           # Sort by the sortkey
    cut -d' ' -f2 |      # Keep only the original version string
    tee "$ruby/versions.txt"

echo -e "3.0.0\n3.0.0-rc1\n3.0.0-preview1\n3.1.0" |
    awk '{ split($0,parts,"-"); sortkey = (parts[2]=="") ? $0"~" : $0; print sortkey, $0 }' |
    sort -uV |
    cut -d' ' -f2
3.0.0
3.0.0-preview1
3.0.0-rc1
3.1.0
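
One possible explanation for why the plain 3.0.0 still comes out first in the test run above: GNU sort's version ordering treats "~" as sorting before every other character, even before the end of the string, so appending it moves a line earlier, not later. A minimal check, assuming GNU coreutils sort (other sort implementations may order this differently):

```shell
# Assumes GNU sort: in -V ordering "~" sorts before everything else,
# so the "3.0.0~" key ends up ahead of the -preview/-rc keys.
tilde_sorted=$(printf '%s\n' '3.0.0-rc1' '3.0.0~' '3.0.0-preview1' | sort -V)
printf '%s\n' "$tilde_sorted"
```

Separately, piping into tee with the same file that awk is still reading risks truncating the input before it has been fully read; writing to a temporary file and moving it into place avoids that.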

Iterating with o1-mini came up with this variation, which seems far less hacky:

echo -e "3.0.0-preview\n3.0.0\n3.0.0-rc1" | sort -t- -k1,1V -k2,2r -u
3.0.0-rc1
3.0.0-preview
3.0.0

But with a more complete test case, it still fails to sort some of the versions correctly:

echo -e "3.0.0-p547\n3.0.0-p551\n3.0.0-p550\n3.0.0-preview\n3.0.0\n3.0.0-rc1" | sort -t- -k1,1V -k2,2r -u
3.0.0-rc1
3.0.0-preview
3.0.0-p551
3.0.0-p550
3.0.0-p547
3.0.0

Removing the r (reverse) flag from the second key makes the plain release version sort before the -preview/etc. versions again:

echo -e "3.0.0-p547\n3.0.0-p551\n3.0.0-p550\n3.0.0-preview\n3.0.0\n3.0.0-rc1" | sort -t- -k1,1V -k2,2 -u
3.0.0
3.0.0-p547
3.0.0-p550
3.0.0-p551
3.0.0-preview
3.0.0-rc1

If you have any ideas of paths to explore, or alternative tools you're open to being included here, happy to try and iterate/refine further.
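
One path that might be worth exploring is a decorate/sort/undecorate pipeline: instead of rewriting the version string, prepend explicit sort keys and cut them off afterwards. The sketch below is only an illustration of that idea, not what the PR implements; the function name `sort_versions` is hypothetical, and it assumes a sort with -V support.

```shell
# Hypothetical helper: reads versions on stdin, prints them sorted so that
# pre-release lines (anything with a "-" suffix) come before the plain release.
sort_versions() {
  tab=$(printf '\t')
  # Decorate each line as: base version <TAB> stable flag (0/1) <TAB> original line
  awk -F- -v OFS='\t' '{ print $1, ($2 == "" ? 1 : 0), $0 }' |
    sort -u -t "$tab" -k1,1V -k2,2n -k3,3V |
    cut -f3    # undecorate: keep only the original line
}

result=$(printf '%s\n' 3.0.0-p547 3.0.0-p551 3.0.0-p550 3.0.0-preview 3.0.0 3.0.0-rc1 | sort_versions)
printf '%s\n' "$result"
```

The numeric flag in the second key is what pushes the plain release to the end of its group, so no magic suffix has to be chosen to out-sort every possible label.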

Contributor Author

@0xdevalias 0xdevalias Jan 22, 2025

GNU sort seemingly has the same semantics, with the labelled versions sorting after the plain release version:

echo -e "3.0.0-p547\n3.0.0-p551\n3.0.0-p550\n3.0.0-preview\n3.0.0\n3.0.0-rc1" | gsort -t- -k1,1V -k2,2 -u
3.0.0
3.0.0-p547
3.0.0-p550
3.0.0-p551
3.0.0-preview
3.0.0-rc1

@0xdevalias 0xdevalias mentioned this pull request Jan 19, 2025
@postmodern postmodern self-requested a review January 20, 2025 19:32
Owner

@postmodern postmodern left a comment

Noticed a bunch of things. Think there's two problems here that could become two separate PRs: sorting versions.txt and updating stable.txt.


# 2) Remove duplicates (keeping the first occurrence) without sorting
awk '!seen[$0]++' "$cwd/$ruby/checksums.$algorithm" \
> "$cwd/$ruby/checksums.$algorithm.tmp"
Owner

Is it really necessary to remove duplicates from the checksum files?

Contributor Author

I mostly added this with the idea of making running update.sh idempotent when the version had already been added.

I originally thought about doing a version of sort for it instead, but that came with a bunch of extra caveats, as I highlighted in:

> Instead of using awk here to avoid duplicates, I considered using a similar sorting method as below, but it changed the default sort order of the file extensions etc., and I didn't want to create too much diff churn here.
>
> Also, this file currently seems to just have versions appended as they were added, not sorted by version number, so I wasn't sure whether that was just because it was easier or because we wanted this file to be roughly ordered by 'release date' rather than version number.

Originally posted by @0xdevalias in #86 (comment)
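
For reference, a small sketch of what the awk '!seen[$0]++' idiom from the diff does: it prints a line only the first time it is seen, so the existing order of the file is preserved. The checksum lines below are placeholder sample data, not real checksums.

```shell
# Placeholder sample data: the first and third lines are identical.
deduped=$(printf '%s\n' \
  'aaa  ruby-3.4.1.tar.gz' \
  'bbb  ruby-3.4.1.tar.xz' \
  'aaa  ruby-3.4.1.tar.gz' |
  awk '!seen[$0]++')   # seen[$0] is 0 on first sight, so the line prints once
printf '%s\n' "$deduped"
```

Running the same append-then-dedupe step twice leaves the file unchanged, which is the idempotency property mentioned above.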

update.sh Outdated
echo "$version" > "$ruby/stable.txt"
if [[ -f "$ruby/stable.txt" ]]; then
stable_file="$ruby/stable.txt"
version_family="${version%.*}" # Extract major.minor from version (e.g., 3.3)
Owner

Will this ignore preview/rc versions? stable.txt should only contain stable release versions for each version family.

Contributor Author

@0xdevalias 0xdevalias Jan 22, 2025

In its current form, seemingly not. It also looks like it may even downgrade the version listed there.

⇒ ./update.sh ruby 3.4.0-rc1
Already downloaded ruby-3.4.0-rc1.tar.gz
Already downloaded ruby-3.4.0-rc1.tar.xz
Already downloaded ruby-3.4.0-rc1.zip
Updated ruby/stable.txt to 3.4.0-rc1
- 3.4.1
+ 3.4.0-rc1


# Use sed to replace the version for the major.minor family or append it if not found
if grep -qE "^${version_family}\." "$stable_file"; then
sed -i '' -E "s/^(${version_family}\.).*/$version/" "$stable_file"
Owner

  • First time I've seen -i '' instead of just -i.
  • -E might not be necessary here?
  • The . in the version_family will be interpreted as a regex any-character, instead of \..

Contributor Author

@0xdevalias 0xdevalias Jan 22, 2025

> First time I've seen -i '' instead of just -i

When I first ran it with just -i, it seemed to use the -E part as its backup filename suffix. Not sure if I was doing something else wrong, or if this is a quirk of the macOS flavour of sed.

> -E might not be necessary here?

Valid. I can definitely test without it and remove it if it's not needed.

> The . in the version_family will be interpreted as a regex any-character, instead of \..

Makes sense 👌🏻

Contributor Author

@0xdevalias 0xdevalias Jan 22, 2025

With sed -i -E "s/^(${version_family}\.).*/$version/" "$stable_file", it creates the file: ruby/stable.txt-E

With (GNU sed) gsed -i -E "s/^(${version_family}\.).*/$version/" "$stable_file" it seems to work fine.

With (GNU sed) gsed -i '' -E "s/^(${version_family}\.).*/$version/" "$stable_file" it gets an error gsed: can't read s/^(3.3\.).*/3.3.6/: No such file or directory

So I think that weird syntax of -i '' is a macOS compatibility intricacy; but seemingly not universal.
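
Since the -i syntax differs between the two sed flavours, one option is to avoid -i altogether and write to a temporary file instead. The following is a minimal sketch under that assumption, not the PR's implementation; `update_stable` and its variable names are hypothetical. It also escapes the dots in the version family so they match literally.

```shell
# Hypothetical helper: replace the stable.txt entry for a version family
# without relying on sed -i, so it behaves the same with GNU and BSD sed.
update_stable() {
  stable_file=$1
  new_version=$2
  family=${new_version%.*}                                # e.g. 3.3.6 -> 3.3
  family_re=$(printf '%s' "$family" | sed 's/\./\\./g')   # 3.3 -> 3\.3
  sed "s/^${family_re}\..*/${new_version}/" "$stable_file" > "$stable_file.tmp" &&
    mv "$stable_file.tmp" "$stable_file"
}

# Demo on a throwaway copy rather than the real ruby/stable.txt.
demo_dir=$(mktemp -d)
printf '%s\n' 3.2.6 3.3.5 3.4.1 > "$demo_dir/stable.txt"
update_stable "$demo_dir/stable.txt" 3.3.6
cat "$demo_dir/stable.txt"
```

Only the 3.3.x line is rewritten; the other families are left untouched.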

Contributor Author

sed -i '' "s/^(${version_family}\.).*/$version/" "$stable_file" seems to work fine without the -E

gsed -i "s/^(${version_family}\.).*/$version/" "$stable_file" also seems to work fine without the -E
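
One caveat worth checking before dropping -E: in POSIX basic regular expressions an unescaped ( is a literal parenthesis, so without -E the pattern above can run without an error and still never match. A small sketch of the difference, using sample data and plain stdin/stdout rather than -i:

```shell
stable=$(printf '%s\n' 3.2.6 3.3.5)
family='3.3'
new='3.3.6'

# Without -E the parentheses are literal characters, so no line matches.
bre=$(printf '%s\n' "$stable" | sed "s/^(${family}\.).*/$new/")

# With -E the parentheses form a group and the 3.3.x line is replaced.
ere=$(printf '%s\n' "$stable" | sed -E "s/^(${family}\.).*/$new/")

printf 'without -E:\n%s\nwith -E:\n%s\n' "$bre" "$ere"
```

If -E is dropped, the parentheses would also need to go (the group is not referenced in the replacement anyway).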

Owner

This is because -i takes an optional inline suffix.

       -i[SUFFIX], --in-place[=SUFFIX]

              edit files in place (makes backup if SUFFIX supplied)

Maybe try -E -i instead, or try without -E.

update.sh Outdated
echo "$version" > "$ruby/stable.txt"
if [[ -f "$ruby/stable.txt" ]]; then
stable_file="$ruby/stable.txt"
version_family="${version%.*}" # Extract major.minor from version (e.g., 3.3)
Owner

version_family is already set above, but only for ruby="ruby". I like version_family="${version%.*}" and think it should be done for all rubies, not just ruby.

Contributor Author

Seemingly version_family="${version%.*}" has an edge case, given 3.3 will only return 3. If that's fine, then can leave it as is; otherwise this would make it more robust:

if [[ "$version" =~ ^[0-9]+\.[0-9]+\.[0-9]+ ]]; then
  version_family="${version%.*}"
else
  version_family="$version"
fi
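
For illustration, this is how the ${version%.*} expansion behaves on a few inputs, including the shortened 3.3 edge case. The `family_of` wrapper is a hypothetical name used only for this demo.

```shell
# Hypothetical wrapper around the parameter expansion used in update.sh:
# "%.*" strips the shortest trailing match of ".<anything>".
family_of() {
  printf '%s\n' "${1%.*}"
}

for version in 3.3.6 3.4.0-rc1 24.1.2 3.3; do
  printf '%s -> %s\n' "$version" "$(family_of "$version")"
done
```

The first three inputs yield 3.3, 3.4, and 24.1, while the shortened 3.3 collapses to 3.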

Another more robust alternative (at the risk of some complexity) would be to use regex to extract the version number parts directly:

if [[ $version =~ ^([0-9]+)\.([0-9]+)\.([0-9]+)(-([a-zA-Z0-9]+))?$ ]]; then
  version_major="${BASH_REMATCH[1]}"
  version_minor="${BASH_REMATCH[2]}"
  version_patch="${BASH_REMATCH[3]}"
  version_label="${BASH_REMATCH[5]}" # This is optional and might be empty

  echo "Major: $version_major"
  echo "Minor: $version_minor"
  echo "Patch: $version_patch"
  echo "Label: ${version_label}"
else
  echo "Invalid version format"
fi

Contributor Author

@0xdevalias 0xdevalias Jan 22, 2025

Pushed the simple but less robust version of this change in d2cf5e7 (#86)

Owner

> Seemingly version_family="${version%.*}" has an edge case, given 3.3 will only return 3. If that's fine, then can leave it as is; otherwise this would make it more robust:

update.sh expects a fully qualified version, not a shortened version (3.3).

Contributor Author

Makes sense. We should then add an explicit check for that at the start and give the user feedback if anything else is passed. Then we can use the simpler check, knowing the edge case won't be hit.
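
A rough sketch of what such an up-front check could look like. The helper name `is_full_version` is hypothetical, and the pattern (MAJOR.MINOR.PATCH with an optional -label) mirrors the regex proposed earlier in this thread; it is an assumption and may need loosening for rubies whose version numbers have a different shape.

```shell
# Hypothetical check: accept only MAJOR.MINOR.PATCH with an optional -label.
# Assumption: every supported version fits this shape.
is_full_version() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+(-[A-Za-z0-9]+)?$'
}

for candidate in 3.3.6 3.4.0-rc1 3.3; do
  if is_full_version "$candidate"; then
    echo "ok: $candidate"
  else
    echo "rejected: $candidate (expected a full version such as 3.3.6)" >&2
  fi
done
```

In update.sh the rejection branch would presumably print the message and exit non-zero before any download or file update happens.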

@eregon
Collaborator

eregon commented Jan 28, 2025

Please check that it still works for other Rubies than CRuby after your changes, e.g. with ./update.sh truffleruby 24.1.2
