Ran into two issues, we'd like to offer mainline kernels and device-specific kernels for the Ally
We can talk whenever... but I have some thoughts...
I think you just need a copr no?
Pre-conditions I assume...
whatever custom kernels are required are available in some yum repo already (out of scope for akmods)
i'm sure i have other assumptions but my brain... not sure
i don't know how many custom configs you are wanting, but...
i could imagine building custom fedora-ostree-desktops/base images with the various swapped out kernels... as a matrix (parallel jobs) where said "base" images all get proper image/kernel version metadata assigned
then, build existing kmods similar to how we do now, but with the new kernel/base-image stuff as an input for the Containerfile to build against the proper kernel
... that's kind of the basic idea which I figure is already in mind
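A rough sketch of the matrix idea described above, in Python rather than workflow YAML so it can be run directly. The flavor names, the version list, and the tag layout are placeholders for illustration, not anything the repos actually define.

```python
from itertools import product

# Hypothetical inputs: flavor names and Fedora versions are placeholders.
KERNEL_FLAVORS = ["stock", "ally", "surface"]
FEDORA_VERSIONS = [37, 38]


def base_image_jobs(flavors, versions):
    """Return one job description per (kernel flavor, Fedora version) pair."""
    jobs = []
    for flavor, version in product(flavors, versions):
        jobs.append({
            "kernel_flavor": flavor,
            "fedora_version": version,
            # Hypothetical tag layout that records both inputs on the image.
            "base_tag": f"{flavor}-{version}",
        })
    return jobs


jobs = base_image_jobs(KERNEL_FLAVORS, FEDORA_VERSIONS)
for job in jobs:
    print(job["base_tag"])
```

Each entry would correspond to one parallel job that builds a "base" image, and the kmod builds would then take `base_tag` as an input.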
The problems I'm already starting to think about (and @akdev ) even yesterday/last night...
i think our process is inefficient at best. both for akmods and main repos
so i think we need some architectural/process improvements for the builds before tackling more akmods (and kernel flavors of them) seems reasonable
maybe the problems of main and akmods are a bit different, but my head is in main at the moment, so here's my thought there, and i think it could be relevant to how we could solve multiple kernel-images here
context is main repo, which now builds both *-main and *-nvidia images:
each build in the (massive) matrix does:
1. calculate tags for inputs
2. build foo-main image
3. build foo-nvidia image
4. publish, sign, etc (if not PR)
this is true for EACH runner, so there's a lot of double effort, and... it's because we don't publish the intermediate images unless we are NOT in a PR build... which makes it a bit tricky for foo-nvidia to depend on foo-main in a distinct job/step of the build workflow
an idea for solution:
split the push-ghcr job into 2... build-main and build-nvidia given they'd need distinct Containerfiles
i wonder if we could in build-main steps:
1. calculate tags
2. build foo-main image
3. export the image to a tarball (podman export) and store in a defined path (details fuzzy here)
4. use github cache to cache that path with unique key
5. publish, sign, etc (if not PR)
then... in build-nvidia steps:
1. calculate tags
2. podman import the foo-main image from cached tarball
3. build foo-nvidia image
4. publish, sign, etc (if not PR)
IF that approach is viable, i think we could also use it in akmods for custom-kernel builds which would need to get used for akmods...
and there's probably some more stuff that could be done to streamline there as well.
</crazy idea>
more crazy idea...
in akmods this approach could be helpful even today, without custom kernels, as we could stage a cached "prepared base" image for use by the 3 "common" "nvidia-470" and "nvidia-535" builds so they don't all have to do that work.
all this said, I really think attempting to implement that last use case first might be easiest.
akmods builds are much smaller than main now...
the relevant changes to workflow model can be tested in a PR branch build.
and if it works, it could be immediately extended for custom kernels with some matrix magic
and... i'm going back to my day job... but i'll probably reply to a ping in here 😉
I’m gonna need time to process this 😂
Same here, but this is great
Thanks for sharing your thoughts on this
(edited my list of steps above to include podman export and podman import and to say tarball since that matches up with man pages)
Okay I’m going to put my first thoughts here, but I haven’t analyzed Sherman’s thoughts too deeply.
1. Ideally we would have one container per kmod rpm package to avoid download overhead from the consumers
2. Custom kernels would be out of scope for main imo - it’s supposed to be close to fedora. I don’t really think we need to create base images for custom kernels necessarily, I think we can rpm-ostree override the kernel package no?
3. akmods repo could build a matrix of kmod + kernel versions - this would work as long as we support pulling rpms from custom repos which seems easy to do
4. From client perspective they would do stuff like:
This is 100% complementary to my idea.
If we had per-kmod images (eg ghcr.io/ublue-os/akmods/xpadneo) we could even tag them in a way that is easy to consume by consumer
eg,
xpadneo:37, xpadneo:38, xpadneo:39 would be latest stock fedora kernels from the upstream images
xpadneo:ally123 latest build for some kernel specific to ally
xpadneo:customkern2 latest build for some other custom kernel
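A small sketch of how the tag scheme above could be composed, assuming the per-kmod images live under the `ghcr.io/ublue-os/akmods/` prefix from the example. The helper name `kmod_image_ref` is made up for illustration.

```python
# Registry prefix taken from the example in the thread.
REGISTRY = "ghcr.io/ublue-os/akmods"


def kmod_image_ref(kmod: str, kernel_tag: str) -> str:
    """Build a full image reference for one kmod built against one kernel."""
    return f"{REGISTRY}/{kmod}:{kernel_tag}"


# Stock Fedora kernels use the major version as the tag; custom kernels
# use a name such as the "ally123" and "customkern2" examples.
refs = [
    kmod_image_ref("xpadneo", tag)
    for tag in ("37", "38", "39", "ally123", "customkern2")
]
for ref in refs:
    print(ref)
```

A consumer would then only need to know the kmod name and which kernel they are running to pick the right image.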
all that said, i see splitting kmod builds into a matrix and splitting up the resulting images to be the easy bit which first requires the hard work of building cached "pre-prepared, ready-to-build a kmod base" images with distinct kernels
I think the process for that is very simple no? Install the akmod stuff and then swap the kernel to the custom one
Oh I see
I think the kernel base images should be in the repo where the custom kernel is built?
hmm... i'm trying to think how to explain this better
I think I got your point
We need to build many kmods for any given kernel version
And downloading the kernel each time is wasteful
We could prepackage these images once before the builds start
and installing even build deps each time is wasteful (there is a build-prep.sh in akmods repo which already does this per build today)
yes, or pre-package a "ready-to-build-kmods-on-kernel-XYZ" image per kernel version at the start of each 'akmods' run
i think saying same thing
Yeah
Two things to figure out then
1: how to make it as easy as possible for people to swap the kernel version, we need to find some way to either reinstall the proper kmods or easily uninstall all and reinstall what's needed
2: how to make it easy for people to fork the akmods repo if they need to add additional kernels
Downstream that is
i don't see this as being easy, in fact the more we add to akmods and main repos, the more complicated it becomes
i wrote "rebase" before, but i deleted to write a more complete response.
i do think swapping kernels for bazzite specifically can be a rebase which can be incorporated into yafti/installer/whatever-you-want
uninstall/reinstall... other than rebase... uh... that's a different problem...
if we want users to be able to layer kmods which are built in akmods, that's a problem which we already have, so i think it's just a question of what to tackle first
Ugh... the cache idea may not work
"A repository can have up to 10GB of caches. Once the 10GB limit is reached, older caches will be evicted based on when the cache was last accessed."
per: https://github.com/actions/cache#cache-limits
I think instead of "cache" what I really want is "artifacts"
ok, yep... artifacts work a treat, this is exactly what i wanted when i was talking about cache earlier and it's what they are meant for 🤦‍♂️ https://docs.github.com/en/actions/using-workflows/storing-workflow-data-as-artifacts#passing-data-between-jobs-in-a-workflow
i did a test run where i build an image, save it to a tar file, upload, download, load it, and build FROM it https://github.com/bsherman/cache-tests/actions/runs/6032455190/job/16367660838
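To make the cache concern concrete, here is a back-of-the-envelope check against the 10GB limit quoted above. The tarball size and job counts in the usage lines are invented numbers; the thread only says the images are multi-gigabyte.

```python
CACHE_LIMIT_GB = 10  # per-repository cache limit quoted from the actions/cache docs


def caches_that_fit(tarball_gb: float, limit_gb: float = CACHE_LIMIT_GB) -> int:
    """Number of tarballs of the given size that fit before eviction starts."""
    return int(limit_gb // tarball_gb)


def exceeds_limit(matrix_jobs: int, tarball_gb: float,
                  limit_gb: float = CACHE_LIMIT_GB) -> bool:
    """True if caching one tarball per matrix job would overflow the limit."""
    return matrix_jobs * tarball_gb > limit_gb


# Hypothetical numbers: 2.5 GB per exported image, 12 matrix jobs.
print(caches_that_fit(2.5))
print(exceeds_limit(12, 2.5))
```

With anything like those sizes, only a handful of images fit, so a large matrix would evict its own entries before the dependent jobs could read them.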
this is NOT ready yet, but i'm working on it: https://github.com/ublue-os/main/pull/325
GitHub
feat: use artifacts to streamline image builds by bsherman · Pull R...
The goal of this PR is to streamline image builds.
The idea is to split *-main and *-nvidia image builds into 2 jobs.
The *-main job will run first, and each matrix job instance will build the foo-...
ugh, the artifacts is feeling pretty painful, these tarballs are so big
This is a different attempt ... still not happy
https://github.com/ublue-os/main/pull/326
GitHub
feat: split main/nvidia into distinct steps within job by bsherman ...
The goal of this PR is to streamline image builds.
Current problem:
both *-main and *-nvidia images are built once each for every permutation of a matrix comprised of: fedora major version (37, 38)...
I think we may just need to use the registry for this. and then my artifacts PR https://github.com/ublue-os/main/pull/325 (but using registry push instead of artifacts) and then we'd have much cleaner builds here.
but... there's the whole security concern... how can we mitigate the security concern?
- push, but don't sign, a PR build?
- use a distinct-from-normal image name (that would get challenging, I think)
- use PR tags only (this already happens, except in the case of bugs in workflow)
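A minimal sketch of two of the mitigations listed above (PR-only tags, and pushing without signing). The tag format and the `publish_plan` helper are hypothetical and do not describe what the workflows currently do.

```python
from typing import Optional


def publish_plan(fedora_version: int, pr_number: Optional[int] = None) -> dict:
    """Decide which tags to push and whether to sign, based on build type."""
    if pr_number is not None:
        # PR builds: push under a PR-specific tag only, and never sign,
        # so they cannot be mistaken for a release image.
        return {"tags": [f"pr-{pr_number}-{fedora_version}"], "sign": False}
    # Non-PR builds: normal tag, signed.
    return {"tags": [str(fedora_version)], "sign": True}


print(publish_plan(38, pr_number=325))
print(publish_plan(38))
```

The point of the split is that a later job could still pull the unsigned PR-tagged image from the registry as its base, while anyone verifying signatures would reject it.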
Using artifacts will probably be very slow
this may sound unrelated to the original akmods question, but it's not because it's trying to solve the basic problem of large-ish matrix builds with multi-gig shared artifacts (container images)
that's what i found above 😉 yes
At least in my testing with isogenerator it was very slow
I'm going to talk about the recent question in #💾ublue-dev here... @j0rge and @KyleGospo
he's talking about the akmods change you were making for building them against different kernels
keeping this in context 🙂 https://discord.com/channels/1072614816579063828/1072617059265032342/1154807480531030067
So, Jorge is asking about various kernel support for kmods, and I had started work to reduce the amount of duplicated build time on main & nvidia as a proof of how we'd do this for akmods.
I got stuck because:
1) we can't use cache (our images would have to be exported and are too big and I think it would be too slow anyway)
2) we can't use artifacts (again, our images would have to be exported and it ends up being crazy slow)
this leaves us needing to build and push intermediate images to a registry (ghcr) as the only viable "intermediate cache" solution, but... there has seemed to be a consensus that this is insecure/unsafe.
I think the only way to move forward is to allow at least SOME intermediate images to be built and pushed to ghcr ... we can use some more obscure name and not sign them maybe? though that may be confusing...
essentially, i think what @EyeCantCU was trying to accomplish with his PRs around this topic is more or less required to move forward here
what about rolling them as officially supported releases? ASUS and Surface Kernel are both pretty common and we have ublue-os/asus and ublue-os/surface now
then it's just a matter of running the build 3x with a different kernel installed at the beginning
i'm concerned about the dependency sprawl
not sure how you are envisioning this
matrix:
image_flavor: [main, surface, asus]
and then a conditional in the containerfile to either do nothing, or install the desired kernel prior to building the kmods
and then that'd change the built image name, so downstream you'd just pull the one you need
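The matrix-plus-conditional idea above could be sketched like this. The flavor list comes from the matrix in the thread; the image naming and the `build_plan` helper are assumptions for illustration.

```python
# Flavor list copied from the matrix above; everything else is hypothetical.
IMAGE_FLAVORS = ["main", "surface", "asus"]


def build_plan(flavor: str) -> dict:
    """Describe one matrix job: which kernel to install and the image name."""
    if flavor not in IMAGE_FLAVORS:
        raise ValueError(f"unknown image_flavor: {flavor}")
    # "main" keeps the stock kernel, so the conditional step does nothing.
    custom_kernel = None if flavor == "main" else flavor
    # Hypothetical naming: the flavor becomes a suffix so downstream
    # images can pull exactly the one they need.
    image_name = "akmods" if flavor == "main" else f"akmods-{flavor}"
    return {"flavor": flavor, "custom_kernel": custom_kernel,
            "image_name": image_name}


plans = [build_plan(flavor) for flavor in IMAGE_FLAVORS]
for plan in plans:
    print(plan)
```

In a real Containerfile the `custom_kernel` value would drive the "do nothing or install the desired kernel" branch before the kmods are built.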
i see, and it seems fair... but wow... another massive explosion in the number of images
and so much duplication of effort
build time effort, that is, time
yeah, though these are different kernel versions than main so no matter what that build time is duplicated
asus is a couple point versions out of date
yeah I do want to start trimming some fat, once f39 is out we'll remove 37, and then ones like budgie can go away
no one picked up kera so I'm going to archive it
i'm trying to wrap my head around this:
so, why not have a new "base" (hate the name since its overloaded, but...)
do all (or most) of the existing build work there... for all variants (silverblue, kinoite, etc...)
then, for the flavors (main[default], surface, asus), we matrix for custom kernels and kmod installations
though, i don't think we can submatrix
GitHub
GitHub - ublue-os/surface: WIP - OCI Images derived of ublue-os mai...
WIP - OCI Images derived of ublue-os main images for Surface hardware - GitHub - ublue-os/surface: WIP - OCI Images derived of ublue-os main images for Surface hardware
GitHub
GitHub - ublue-os/asus: OCI Images derived of ublue-os main images ...
OCI Images derived of ublue-os main images for ASUS hardware - GitHub - ublue-os/asus: OCI Images derived of ublue-os main images for ASUS hardware
these are main + those kernels
just lacking the akmods
little different I suppose, these build numerous different images with the asus/surface kernel added
kind of like what nvidia was
yeah, so i'm feeling like this is pushing us back to nvidia in a distinct repo/build
and we'd want a new pre-main repo for the base upon which main/asus/surface are built
i'm not opposed to re-splitting... i think we prematurely moved nvidia into main repo, hoping for gains, but not realizing how annoying it actually would be
tbf, it's annoying to have all those builders fire off for everything, but also it's nice to NOT have to remember to go click on nvidia. And now we're not on 20 builders so at least it builds them all in one pass, and that'll go away in november when we remove 37, so we're kind of in the worst case scenario right now
but we just added asus and surface... so have to remember to click those builds, why not also nvidia?
yeah
having config and akmods as distinct repos is one thing... but all these actual, runnable image builds ... i feel like they should be split repos or a mono repo, but not both
but triggering builds is a thing we need to figure out anyway
I don't have a strong opinion on split or mono, but from an org perspective it would be nice to have the delineation for division of work
yeah, that's exactly how i feel
like, if surface breaks then we're not worried about "main is red!"
and like, after setup the build trigger thing can be cronned, we just happen to be real busy right now
like I adjust the crons to be "in order", and that's like, probably fine?
and to be clear, i'm not upset we merged nvidia into main, i think it was a good experiment, from which we have learned... but i think the lesson is we probably want the delineation
wait
until we can trigger, yes
wasn't it your idea?
lol
that's ok I'm not upset when bazzite breaks either
no, you thought it would be more efficient for builders 😄
at least, i think
😄
yeah, but I also thought all the builder stuff would be done a while back
i DID want to move nvidia kmod into akmods, and was/am very happy with that part
CI is non-trivial... especially with all these permutations of dependencies, and also trying not to change image names 😄
ok so an iterative win. SGTM heh
<:beagle_love:1087436386036092950>
ok, so, if our consensus here is we should move nvidia back to its own thing...
i think that takes us back to how to build asus/surface from "pre-main" or "base" or something
hmm...
Catching up. Would base just be like main but without akmods?
let's not call it base even now because we'll confuse ourselves. 😄
what if we had main images as they are today but with NO kmod additions, to minimize changes for downstream users
but added a new hwe image set that's main+kmods on the default kernel
asus and surface are main+kmods from their specific kernels
We could do base -> {main, surface, asus, framework}
Oh that works too
i think the hwe idea also has some merit because we've had some gray area around when kmods should end up in main or not... so potential downstream users could have main as a base even if they don't want our kmods
I like it
I like that idea
ok, this also seems much more realistic to accomplish than some mega-mono-repo ... and I think i can tackle it today
That's awesome. Thank you for looking into it
One thing though
that's a great idea
to be clear, i'm talking about tackling the reorg of main and hwe and nvidia and i'm assuming nvidia is built on hwe for now
That means other images will need to also produce hwe images
and it would stop people bugging us about "what if I want a image with less bloat that doesn't work?"
Which I have no problem with. But 2x the images for Bazzite and Bluefin
nope
we would only do hwe for bluefin I mean
Okay
Oh WAIT
I SEE
maybe? i'm suggesting that our main is really a base for use by anyone... but opinionated downstreams, even asus/surface and definitely bluefin/bazzite will use hwe
They'd be built out of asus, framework, and surface
So it wouldn't matter
or do it the other way around, don't call the ones with the goodies hwe, call the plain ones something else
base and main?
raw
silverblue-main-suckless lol
base
lol
nokmods
yeah, that's fine
nokmods
ok
let's not overthink it
ok, so I'll tackle the reorg of main and nokmods and nvidia and i'm assuming nvidia is still built on main (not nokmods) for now...
then downstreams should have no changes to expectations
i can do this today, but this does NOT actually make changes to akmods repo to support other kernels... that's a distinct, though related, effort
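One way to picture the hierarchy settled on above is as a dependency graph with a derived build order. The `nokmods -> main -> nvidia` chain is stated in the thread; placing asus and surface directly on nokmods is one reading of the earlier messages and should be treated as an assumption.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each image maps to the set of images it is built FROM.
PARENTS = {
    "nokmods": set(),        # plain images, no kmods
    "main": {"nokmods"},     # nokmods + kmods for the default kernel
    "nvidia": {"main"},      # still built on main, per the thread
    "asus": {"nokmods"},     # assumption: custom kernel + matching kmods
    "surface": {"nokmods"},  # assumption: same as asus
}

# static_order() yields every image after all of its parents.
build_order = list(TopologicalSorter(PARENTS).static_order())
print(build_order)
```

Any order this produces builds nokmods first and nvidia only after main, which is the constraint the workflow triggers (or crons "in order") would need to respect.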
Nvidia but with no Nvidia kmods ;_;
boneless
I feel like this at least needs an issue
yeah just whack a paste into github so we don't lose this context
100%
and i'll create the related issue for akmods too
If I get another nouveau question in bazzite, I swear...
does the rog ally work yet?
j/k
ugh
i can't shake the feeling that peeps are rebasing a lot and not knowing what image they're running, and sometimes that's why they are on nouveau not nvidia
I ask for rpm-ostree status for that alone
And don't get it half the time
it just does NOT fail to load randomly on a reboot
ok, i gotta drive home, then i'll do github issues and start work
Sounds great!
I keep this around, when people ask in chat, send them to the github docs:
That's actually a good idea
I want to improve hw compatibility with all these devices but sometimes word of mouth isn't enough
well, I'm hoping to get on the nerd nest podcast
that should be fun
That'll be cool
@KyleGospo actually if you know GE already ask them if they'd be down to do that podcast with you together
@j0rge I do not
could try via discord
framework, asus (ally), and surface are always intel/amd not nvidia, right?
I believe that's currently true, yes
Only case otherwise would be egpu
that's pretty edge case... i don't want to worry about egpu right now LOL
Hold on I take that back, I'm looking at Asus from a handheld standpoint
They have plenty of laptops with Nvidia graphics that need those same kernel patches
yeah, that's what i thought
I think that's the only case here
oh crap
ok, well, it doesn't change the image build hierarchy
i like what we've laid out here... i'm getting it down in an issue now
GitHub
Reorganize main builds to better support image flavors: main, nokmo...
We previously completed #313 which merged *-nvidia image workflow into this main repo. But since then, we've noticed some excessive build times and duplication of builds due to the chained natu...
and for akmods: https://github.com/ublue-os/akmods/issues/68
GitHub
Create build matrix to support distinct builds for kernel variants ...
In order to support custom kernels (eg for asus and surface images) we need those kernels to be used as part of the kmods build process. Perhaps we do something like this: matrix: kernel_flavor: [d...
Surface has laptops with Nvidia
The Surface Book 2 and 3 iirc
cool
@EyeCantCU / @KyleGospo related to nokmods i think there's some details other than just our kmods... maybe?
won't packages like intel-media-driver, kernel-tools and kernel-devel which are currently in the main packages.json need to be omitted?
Unsure about Intel media driver but we should be able to omit the other two
my point is, i think we'd need to, otherwise the asus/surface images will possibly fail to replace those packages as they are part of an oci image not just the ostree layers
Ah, okay. That makes sense. We'd likely have to
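A tiny sketch of the omission discussed above, assuming the package list can be treated as a flat list of names. The real packages.json layout is not shown in the thread, so the structure and the `nokmods_packages` helper are hypothetical.

```python
# The two packages the thread agrees can be omitted; intel-media-driver
# was left undecided, so it is not included here.
KERNEL_TIED_PACKAGES = {"kernel-tools", "kernel-devel"}


def nokmods_packages(packages):
    """Drop kernel-tied packages so kernel-swapping images can add their own."""
    return [name for name in packages if name not in KERNEL_TIED_PACKAGES]


# Hypothetical flat list using only package names mentioned in the thread.
current = ["intel-media-driver", "kernel-tools", "kernel-devel"]
print(nokmods_packages(current))
```

The asus/surface builds would then install the versions of those packages that match their own kernel instead of trying to replace ones already baked into the image.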