-
Notifications
You must be signed in to change notification settings - Fork 198
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kernel-install: integration #5135
Conversation
Skipping CI for Draft Pull Request. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this!
74d2f7e
to
04dcac2
Compare
todo here:
|
Joseph and I were looking at #4950 - I'd forgotten about that change when looking at this one. This is all quite messy because we need to handle 2 different cases with the updated rpm-ostree (i.e. after this PR).
In the first case, we must just go into our existing kernel handling and not assume the real
I would lean to unconditionally doing the first...we shouldn't try to have it two different ways in the second. Basically let's test stuff well, and then pull the trigger on the base image side and always go through the real |
04dcac2
to
3a4f740
Compare
52ec043
to
ca3d005
Compare
It looks like I need to add a conditional to the scripts module too. |
That output looks like we're missing the |
9857c0f
to
a365259
Compare
a365259
to
2a30ea2
Compare
OK tested with coreos which does not have layout changes and with a build of bootc image with my MR with the layout=ostree changes and this current iteration works on both images, generates the initramfs and removes the old kernel. |
I had messed up the core.rs changes by checking for OSTREE_BOOTED on top of the layout=ostree presence. This made the changes Jonathan did to not require users to set the cliwrap not work at all. |
c232c3b
to
a560bfa
Compare
rust/src/kernel_install.rs
Outdated
pub fn is_ostree_layout() -> Result<bool> { | ||
let install_conf = Path::new(KERNEL_INSTALL_CONF); | ||
if !install_conf.is_file() { | ||
println!("can not read /usr/lib/kernel/install.conf"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's keep this as a tracing::debug
rust/src/cliwrap.rs
Outdated
@@ -30,6 +32,9 @@ static WRAPPED_BINARIES: &[&str] = &["usr/bin/rpm", "usr/bin/dracut", "usr/sbin/ | |||
/// Binaries we will wrap, or create if they don't exist. | |||
static MUSTWRAP_BINARIES: &[&str] = &["usr/bin/yum", "usr/bin/dnf"]; | |||
|
|||
/// Binaries we will wrap only if ostree layout is not specified. | |||
static NON_OSTREE_LAYOUT_WRAP_BINARIES: &[&str] = &["usr/bin/kernel-install"]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm ok with this but it seems unlikely we ever add more to this array too; we could just reference const KERNEL_INSTALL: &str = "usr/bin/kernel-install"
We were chatting about testing this with kernel-rt and crafted this hacky invocation: |
If I understand correctly, I should now close https://gitlab.com/cki-project/kernel-ark/-/merge_requests/2743 ? |
Yep |
This introduced a bug where at compose time we will always wrap kernel-install. Since it was at compose time the testing I did layering the new rpm-ostree before installing a kernel did not catch this. Will double check this fixes the issue and do a new release shortly. |
Also fallout in FCOS build which is probably the same but we should investigate
Same symptoms. |
Right. Just noting here that the |
It's worth noting that the rawhide run Tuesday succeeded and the one today failed, but Also worth noting that since Monday COSA has had |
Hmm in that tuesday run though I don't see the |
It's in there in the
|
This has definitely leaked into fcos rawhide prod builds:
And yeah rpm-ostree-2025.1 is definitely in cosa, so it's going to leak into all builds |
To explain a bit I think the problem is that the first time we leak the wrapper it's not going to fail. It's only going to fail the next build |
I don't fully grok why that is true, but that would explain why the Tuesday |
@dustymabe mind sanity checking for both of us that pulling in the latest rpm-ostree (e.g. from copr) fixes this for you i.e. the target system doesn't have wrapped kernel-install? |
I can tomorrow, but I really need to understand the problem better so I can properly test it. i.e. the first build success/second build fails thing I'll have to dig into. Should I put the new rpm-ostree in COSA? Can you link me to the copr repo to pull from? |
Yeah
https://copr.fedorainfracloud.org/coprs/g/CoreOS/continuous/ |
I tested rpm-ostree-2025.2-1.fc41 in a local COSA and the |
Thanks! The extra reassurance is helpful because Joseph wasn't seeing the fix work; we were pretty sure we traced that to an incorrect caching, but this one is so subtle that extra sanity checking is helpful. |
TBH I'm still not completely sure I understand all the pieces to the puzzle here. One really confusing part is that f42 bodhi tests have been passing the last day or two and these should be using all the same tools (i.e. latest COSA), but just freezing on:
I just don't understand how those are passing, so the problem is somehow more nuanced and I think we need to get to the bottom of it so that we can fully ensure that our prod streams (which were built with the "bad rpm-ostree in COSA" on Tuesday) don't see regressions. |
Some more info that might or might not be useful.. between a "good" and "bad" build here's the diff:
|
Everything except the kernel-install are files that are unreproducible today. Though after #5244 the rpmdb may drop out...and we could probably fix cacerts with https://github.com/keszybz/add-determinism/ and that would just leave the initramfs which needs some special handling. |
Right. I know some files are expected (for now) to change between successive builds. I'm more interested in the ones we aren't expecting to change. Does that info help the investigation? |
I tested this again this morning with the bootc rawhide image that has the new rpm-ostree and we no longer are always wrapping kernel-install. The change this feature added is a new path for when ostree=layout is set, we won't wrap kernel-install instead rely on kernel-install deferring the logic to rpm-ostree. |
@dustymabe @jmarrero and I got together to discuss this, and the reason the Tuesday run passed but not the Wednesday one is that we inherited https://gitlab.com/fedora/bootc/base-images/-/merge_requests/62/diffs in between. Which means that the Tuesday run was using the old path, and the Wednesday run the new path (with the bug). We also determined that while we did leak wrappers in the latest production releases, it's harmless without the |
Circling back to this. The above statements are true, but what did happen between Tuesday and Thursday was that this change merged into FCOS |
Should be kept in sync with [1] until we can rebase on the bootc manifests/images. [1] https://gitlab.com/fedora/bootc/base-images/-/blob/main/tier-0/kernel-install.yaml See: https://gitlab.com/fedora/bootc/base-images/-/merge_requests/62 See: coreos/rpm-ostree#5135
incorporates #5097
closes: #4726
This will pair with: https://gitlab.com/fedora/bootc/base-images/-/merge_requests/62 which makes the integration work.
This builds and works on the following Containerfile.
Need to figure out how to properly deal with the commented out code still.