Nvidia drivers stop loading after any sort of system update
Sooo i have this weirdo bug where my nvidia drivers stop loading after system updates
I have Secure Boot set up, and the kargs are there multiple times, as you can see in the pastebin below
https://paste.centos.org/view/079f3638
So to get the nvidia gpu working, after every update I have to run rpm-ostree kargs --append=rd.driver.blacklist=nouveau --append=modprobe.blacklist=nouveau --append=nvidia-drm.modeset=1 in a tty, otherwise the nvidia drivers do not load
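For anyone who wants to check whether their own kargs are duplicated like in the pastebin, here is a minimal sketch. duplicate_kargs is a made-up helper name, not something shipped by rpm-ostree or Bazzite; it only inspects the kernel command line string you pass in and changes nothing on the system.

```shell
# Hypothetical helper (not part of rpm-ostree or Bazzite).
# Prints every kernel argument that occurs more than once in the string $1.
duplicate_kargs() {
    # Put one argument per line, group identical ones together,
    # then keep only the arguments that repeat.
    printf '%s\n' "$1" |
        tr -s ' ' '\n' |
        sort |
        uniq -d
}
```

Running duplicate_kargs "$(cat /proc/cmdline)" would list the repeated arguments of the currently booted system; empty output means nothing is duplicated.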
Image is bazzite-nvidia, hardware specs:
CPU: Intel Core i5-12400F
GPU: Nvidia GeForce RTX 3050
RAM: 16 GB DDR4
2x 512 GB SSD, 2x 1 TB SSD
rpm-ostree status output
Looks like you are on a several week old deployment, can you do rpm-ostree update?
Oh nvm, I read the date wrong
Year-month-day
All good
soo what could be the cause of this bug where the Nvidia drivers stop loading after an update?
I dunno
Are you appending them with rpm-ostree?
have to do it after each update
hmm
otherwise it doesn't load Nvidia drivers
kargs persist for me
yeah i dunno why they don't on my end , probably a bug
might be something with rpm-ostree itself, i'd probably dig around the issues there
it just lost the kargs again 😩
time to add them again
and after doing that + a reboot its back to working again
but having to do it basically after any update is a really annoying thing....
@EyeCantCU https://github.com/ublue-os/bazzite/pull/398
GitHub
feat: Always check kargs by KyleGospo · Pull Request #398 · ublue-o...
Based on user reports of nvidia kargs sometimes disappearing, let's just test them every boot and run less important stuff only on update
hopefully this PR helps :) gonna see when it lands in an update
should be landed now
if you update
alright will update tomorrow, thanks
already got pc off for today
Awesome. Really hoping this addresses it. Albeit... this is very weird behavior. It's like the check we have isn't working yet again
just ran the update and yea seemingly this fixed it
kargs now stay there
thanks
maybe install akmod-nvidia and it will work. it worked for me for some weird reason
I'm seeing the kargs persist, but for some reason nvidia-fallback is running anyway. Under kinoite-nvidia nvidia-smi shows the GPU, but with bazzite-nvidia nvidia-smi fails because nouveau loaded due to nvidia-fallback
I'll take a closer look here in a moment. Bewildered because it's different for everyone across different systems
I wouldn't suggest installing the akmod
yeah, I tried and it fails due to some package conflicts. I'm doing something crazy instead and disabling nvidia-fallback to see whether it does eventually load the nvidia modules. Later in the boot process I can see it try a few different times, but it never detects the nvidia card because nouveau had already bogarted it
crazy, nouveau actually loaded EARLIER in the dmesg vs when the fallback service was trying to run
I think I fixed it, or at least found a way for it to properly switch from the kernel embedded nouveaudrmfb to Nvidia without getting stuck. Appending nouveau.modeset=0 to the kargs let me open Konsole without getting the warning about running nouveau, and nvidia-smi works
all credit to this post, https://askubuntu.com/a/1256640
Ask Ubuntu
How do I disable the "Nouveau Kernel Driver"?
I'm trying to install proprietary nvidia graphics driver I downloaded from nvidia website. It will not install because it says that the "Nouveau kernel driver" needs to be disabled first.
I opened
Interesting - I discovered this is the case with my bazzite-nvidia box as well. It's falling back and I have to modprobe nvidia-drm to get the resolution going. I'll follow up - but I did just do an rpm-ostree update post the 9/16 pause (unlucky sync date) and will try to get the system back in alignment with this thread's help.
so the really weird thing is it isn't doing it when I installed with SecureBoot enabled, and reinstalling with SecureBoot disabled it didn't fail to start nvidia again. Going to try deleting the enrollment keys and see if installing without SecureBoot causes failures until the SecureBoot stuff is set up
mine wasn't allowing modprobe nvidia or modprobe nvidia-drm because nouveau had already claimed the device, but this fix has it working when it was broken, after multiple fresh installs I'm not seeing the issue again which is weird
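A quick way to see which driver got loaded, as a rough sketch. loaded_gpu_driver is a hypothetical helper name; it just reads lsmod-style text (module name in the first column) and reports whether nouveau, nvidia, both, or neither shows up. It does not tell you which driver actually bound the card, only which modules are loaded.

```shell
# Hypothetical helper: reports which GPU kernel modules appear in lsmod-style text.
# $1 is the text to inspect, e.g. the output of `lsmod`.
loaded_gpu_driver() {
    printf '%s\n' "$1" | awk '
        $1 == "nouveau" { nouveau = 1 }   # open source driver is loaded
        $1 == "nvidia"  { nvidia = 1 }    # proprietary driver is loaded
        END {
            if (nouveau && nvidia) print "both"
            else if (nouveau)      print "nouveau"
            else if (nvidia)       print "nvidia"
            else                   print "none"
        }'
}
```

On a live system this would be called as loaded_gpu_driver "$(lsmod)". Getting "nouveau" back while the blacklist kargs are set would match the situation described here.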
oh fun, after a fresh reinstall with SecureBoot disabled and attempting to manually reset all the SecureBoot keys via the BIOS, it boots but shortly after the Bazzite Portal launched after logging in the first time, the screen blanked and only occasionally comes out of it to give me the unlock prompt, but unlocking just goes to a blank screen again.......
full poweroff and boot and it finally let me log in again, but the Bazzite Portal didn't launch because it thought it was done or had been "seen" even though I never got to select any options
ugh, and multiple reinstalls and I can't reproduce the problem, going to try installing the supergfxswitcher as that might be the only thing I added the very first time with Bazzite Portal that I hadn't done in these last couple of reinstalls
I have a 1070 and it simply refuses to use anything but nouveau.
This is even still present after a fresh install of bazzite-nvidia (latest branch and 38 both act the same)
If I blacklist nouveau it works fine, but any sort of update requires me to blacklist it again.
Something is clearing out the kargs I set with the rpm-ostree kargs --append command.
Should I just script it to redo the kargs on every boot?
Should work that way now
What kargs are you applying?
rd.driver.blacklist=nouveau modprobe.blacklist=nouveau nvidia-drm.modeset=1
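On the idea of scripting it, a boot-time check could look roughly like the sketch below. REQUIRED_KARGS, missing_kargs and build_append_cmd are names made up for illustration, and this is not how bazzite-hardware-setup does it. The sketch compares a kernel command line against the three kargs from this thread and only prints the rpm-ostree kargs --append command that would restore the missing ones; it never runs it.

```shell
# Sketch only: these names are illustrative, not part of Bazzite or rpm-ostree.
REQUIRED_KARGS="rd.driver.blacklist=nouveau modprobe.blacklist=nouveau nvidia-drm.modeset=1"

# Print each required karg that is absent from the kernel command line in $1.
missing_kargs() {
    local cmdline karg
    cmdline=" $1 "                       # pad so whole-word matching works
    for karg in $REQUIRED_KARGS; do
        case "$cmdline" in
            *" $karg "*) ;;              # already present, nothing to do
            *) printf '%s\n' "$karg" ;;  # absent, report it
        esac
    done
}

# Build (but do not run) the command that would re-add the missing kargs.
# Prints nothing when all required kargs are already present.
build_append_cmd() {
    local args karg
    args=""
    for karg in $(missing_kargs "$1"); do
        args="$args --append=$karg"
    done
    if [ -n "$args" ]; then
        printf 'rpm-ostree kargs%s\n' "$args"
    fi
}
```

Calling build_append_cmd "$(cat /proc/cmdline)" prints the command for the running boot. Printing instead of executing is deliberate, since appending blindly on every boot is one way to end up with the same karg listed multiple times.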
Can you give me...
systemctl status bazzite-hardware-setup
Just a sec, gotta get discord up and running on that system.
[t9999clint@fedora t9999clint]$ systemctl status bazzite-hardware-setup.service
● bazzite-hardware-setup.service - Configure Bazzite for current hardware
Loaded: loaded (/usr/lib/systemd/system/bazzite-hardware-setup.service; enabled; preset: disabled)
Drop-In: /usr/lib/systemd/system/service.d
└─10-timeout-abort.conf
Active: active (exited) since Sun 2023-10-15 14:48:45 MDT; 2h 42min ago
Process: 1039 ExecStart=/usr/bin/bazzite-hardware-setup (code=exited, status=0/SUCCESS)
Main PID: 1039 (code=exited, status=0/SUCCESS)
CPU: 136ms
Oct 15 14:48:45 fedora systemd[1]: Starting bazzite-hardware-setup.service - Configure Bazzite for current hardware...
Oct 15 14:48:45 fedora bazzite-hardware-setup[1039]: Current kargs: rd.driver.blacklist=nouveau modprobe.blacklist=nouveau nvidia-drm.modeset=1 rd.luks.options=disca>
Oct 15 14:48:45 fedora bazzite-hardware-setup[1039]: Checking for needed karg changes (Nvidia)
Oct 15 14:48:45 fedora bazzite-hardware-setup[1039]: No karg changes needed
Oct 15 14:48:45 fedora bazzite-hardware-setup[1039]: Hardware setup has already run. Exiting...
Oct 15 14:48:45 fedora systemd[1]: Finished bazzite-hardware-setup.service - Configure Bazzite for current hardware.
lines 1-15/15 (END)
kargs: rd.driver.blacklist=nouveau modprobe.blacklist=nouveau nvidia-drm.modeset=1
No karg changes needed
how about on a boot where the kargs reset?
I'm copying a bunch of games over atm, when it's done I'll reboot a few more times and get the log from when it breaks
ty
it did it again, but the kargs haven't been changed. The hardware service gave me the same output as before.
Each time it does this it seems to lock up during reboot and I have to hard power it off.
maybe it's ignoring my kargs because it saw the driver crash or something, and won't allow the nvidia drivers to run again till I regenerate the initramfs or something
Might have a potential fix for this
I'll ping you on it
@t9999clint check the fix I mentioned with the nouveau.modeset=0 as an additional karg, that fixed it for me when I got into that hell where it falls back to nouveau even with everything blocked
@t9999clint this one and see the link on the next line to why it works
When was this change made? Cause I haven't had it happen for a few days now...
we moved the nvidia stuff to initramfs
should be a lot more reliable
that was last week(ish)
Okay, then that probably did fix it.
It's still locking up on reboot sometimes, but that might just be a hardware issue. I'll keep troubleshooting it.
Thanks for the hard work
where was it before?
like what makes it more reliable?