Nvidia dgpu laptop problems
Hi! I am having issues with nvidia-powerd.service failing to start after rebasing to bazzite-nvidia from silverblue-nvidia. I attached the output of systemctl status nvidia-powerd.service. Any ideas how I should start with trying to diagnose this problem? It's a mobile RTX 2060.
An nvidia-powerd.service error message was also appearing on my laptop's screen every boot, which was how I knew to check, if you are wondering.
If you're using secure boot, have you enrolled the key?
Powerd failing can be normal, not all GPUs support it
Make sure the Nvidia kernel module is loaded
not currently using secure boot, would you recommend enabling it? also when looking for how to check if the Nvidia kernel module is loaded, I discovered this
cat: /proc/driver/nvidia/version: No such file or directory
Hmm, are there any issues besides powerd? What's your output for
modinfo nvidia
here you go
That looks good at least
Let's try uhh
systemctl status nvidia-powerd | fpaste
https://paste.centos.org/view/ea541a07
woah, fpaste is new to me! that's pretty neat
rpm-ostree kargs | fpaste
There we go, you appear to be missing all the kargs
Which is weird, those should auto apply
There's a just command to do it
nvidia-set-kargs?
Yea
ok, just a minute
https://paste.centos.org/view/e1bdfc08
same output after running the commands and rebooting. hmm
Okay that is extremely weird
You could try running the karg command manually
See if there's any difference
yeah, I ran it, it told me to systemctl reboot, and nothing happened
All right, give me just a bit here I'll poke around and see if I can find anything
I don't believe we ever remove those settings, we only ever add them
This is all just starting to look like I did one too many rebases. Especially since I was coming from the fedora 39 beta.
if you think it could be relevant and hard to fix, I could do a reinstall instead of just trying to make rebasing work. Also, if all the secure boot stuff is working, I could reenable that. I only had it disabled because a while back ublue was having secure boot issues.
Yeah feel free to, it should all just work
I’m confused. I’m still getting TPM issues when trying to install from the main image. should I keep TPM and secure boot stuff off during install?
If you can disable one and not the other I would just turn TPM off
Otherwise there's nothing wrong with both off at all
System 76 scheduler can't change your CFS parameters if secure boot is on, so it's probably objectively better on a laptop anyway
Cool. Also, I’m not seeing bazzite in the main installer, is that normal or did I just miss the right menu?
Well, I reinstalled, and the issues aren’t over yet

If that's the first boot that's normal, it should set the kargs and then reboot again with Nvidia loading
We can't set kargs in the installer yet
OK, I’ll get back to you afterwards
[leaf@fedora leaf]$ just enroll-secure-boot-key
sudo mokutil --import /etc/pki/akmods/certs/akmods-ublue.der
input password:
input password again:
echo 'Enter password "ublue-os" if prompted'
Enter password "ublue-os" if prompted
so... not only does the nvidia gpu not seem to work, I can't get the secure boot script to work
I keep entering ublue-os as password, but it just doesn't like me.
That looks right, don't see any errors
When you reboot it'll boot to your bios and enroll the key
Or at least should
@bsherman might need you on this one if you have time
OK, I just thought the message was broken since after entering the password twice it told me what I should’ve entered and then shut down the script.
I just rebooted, and it came back to the desktop. Should I run just bios after running the script again? Or just bios without running the script again
Script again I think
Since once you leave that screen it's done trying to add the key
So, I think the timeline here is:
1. system was running ublue-os/silverblue-nvidia
2. rebased to ublue-os/bazzite-nvidia
3. nvidia-powerd service is failing
4. then enabled secureboot and disabled dtpm in bios?
5. then attempted to enroll secureboot keys?
Fresh install now
With secure boot enabled
so, if user does not need SecureBoot and TPM, I'd at least start with them disabled in order to rule them out
but before changing anything...
I'd do a few debugging things:
at least with that info i can try to help
gotta step away for a few, but i'll be back
I just disabled secure boot to just eliminate any relevant issues, here are the outputs
First
No text to send
Second
https://paste.centos.org/view/d2c9dcc1
Third
No text to send.
Fourth
https://paste.centos.org/view/00c7514d
Fifth
https://paste.centos.org/view/490a3c72
Sixth
https://paste.centos.org/view/7caeb3ef
I am in no rush, I have another working computer for non-gaming things, so feel free to de-prioritise this issue as needed
Just tell me if you think another clean install with secure boot off could help, cause I am down to do it
First ("No text to send"): I messed up, this should have been
rpm-ostree status|fpaste
😄
Second (https://paste.centos.org/view/d2c9dcc1): not yet showing nvidia specific kargs, but i think that makes sense when i read the chat above, as i think you were having trouble with the mok enrollment
Third ("No text to send."): expected since apparently mok enrollment was not successful
Fourth (https://paste.centos.org/view/00c7514d): this is good, shows you are running some derivative of ublue-os/silverblue-nvidia or have installed our akmods built nvidia kmod
this shows that 1) the driver is on the filesystem 2) it's signed by ublue's key
Fifth (https://paste.centos.org/view/490a3c72): expected nvidia-powerd failure given the nvidia driver is not loaded
Sixth (https://paste.centos.org/view/7caeb3ef): ok, dmesg boot output shows 1) intel CPU and nvidia card are present 2) secureboot is already disabled ... this MAY be related to failure to enroll the MOK... 3) TPM is likely disabled in bios because kernel can't find TPM chip
at this stage, I'd do:
just nvidia-set-kargs
please report any output from that here... then reboot... i'd expect the nvidia driver to be loaded
after rebooting could repeat some debugging
ok, I can't for the life of me get notifications to work on this specific chat, so I just saw this. ran the 'just nvidia-set-kargs' command, rebooted and ran the debugging stuff
https://paste.centos.org/view/27729c4d
https://paste.centos.org/view/bee6cee2
https://paste.centos.org/view/20678332
https://paste.centos.org/view/1d2dc929
nvtop still isn't seeing the rtx gpu, and the nvidia-powerd issues persist. I know I turned off secure boot before these last two rounds of diagnostics, but I am thinking something might have gone wrong because of me having it enabled during os install. also yeah, tpm is off to avoid weird installer bugs I was having.
I'll ping you more directly 🙂 @leaferiksen
https://paste.centos.org/view/bee6cee2: this shows the kargs are not set to blacklist nouveau. did you run just nvidia-set-kargs ?
i mean, you say you did 😕
was there any output?
NVK can't arrive soon enough
was there any output from trying to set the nvidia kargs?
maybe try just -v nvidia-set-kargs ?
oh shit just saw your message
yes, it built the whole image, told me to reboot, and nothing had changed
just nvidia-set-kargs should not build an image
now i'm curious if that recipe has a bug
oh I just saw staging deployment, in my head connected the two concepts
i'm removing the kargs from my system... i'll do some rebooting and test that recipe
ran this, doing a reboot as it suggested now
also, I am unplugging the monitor in hopes of less variables to this puzzle
i'm pretty convinced the problem is you don't have the proper kargs
i don't think you can mis-type just nvidia-set-kargs 🙂 so i'll test here
i mean, i can mis-type it :-D, but it wouldn't give the output you mention
lol never underestimate the power of users with terrible typing and weird computers
same nvidia-powerd issues after a reboot
it should have done something like this:
ok, i've manually removed my nvidia/nouveau kargs... rebooted... my nvidia drivers were not loading as expected... then run the command above (with that output) and now it's working again...
what are your kargs now?
rpm-ostree kargs
[leaf@fedora leaf]$ just nvidia-set-kargs
/usr/bin/nvidia-smi
Staging deployment... done
Changes queued for next boot. Run "systemctl reboot" to start a reboot
first time I've ever seen kargs not apply
this is core ostree stuff
right... that's why i tested manually myself... wanted to rule out a problem with the recipe
[leaf@fedora leaf]$ rpm-ostree kargs
rd.luks.options=discard rhgb quiet root=UUID=a3b3db8c-d31e-4c5e-9791-8c2b7773c562 rootflags=subvol=root rw ostree=/ostree/boot.0/default/b3b5fe2f2136685150875a7fda59f73000e92bdad229b8205e65be2909dc4c19/0 rd.driver.blacklist=nouveau modprobe.blacklist=nouveau nvidia-drm.modeset=1
that looks better
yeah, and I didn't get the nvidia-powerd.service errors on boot (not that they have been there with any consistency) but the service is still failing, and nvtop still doesn't see the gpu.
lspci -k | fpaste
plz
and
cat /proc/cmdline|fpaste
?
I don't know how to reconcile your debugging here...
Here: https://discord.com/channels/1072614816579063828/1155551944555888760/1155646717761433671 it's clear you are booting WITHOUT the kargs which blacklist nouveau
Here: https://discord.com/channels/1072614816579063828/1155551944555888760/1155637710309040158 you show them present in ostree...
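One way to see the mismatch described above (kargs present in ostree, but missing from the running kernel) is to compare the two argument lists directly. This is only a minimal sketch, assuming a POSIX shell; the function name kargs_missing_from is made up for illustration and is not part of just, rpm-ostree, or bazzite.

```shell
# Hypothetical helper (not part of any ublue/bazzite tooling).
# Prints each kernel argument from the first list that is absent
# from the second list. Both lists are space-separated strings.
kargs_missing_from() {
    configured=$1
    booted=$2
    for arg in $configured; do
        case " $booted " in
            *" $arg "*) ;;                 # argument is present, nothing to report
            *) printf '%s\n' "$arg" ;;     # argument is missing from the booted list
        esac
    done
}

# Possible usage on the affected machine (not run here):
#   kargs_missing_from "$(rpm-ostree kargs)" "$(cat /proc/cmdline)"
# or, to check only the three nvidia-related kargs seen in this thread:
#   kargs_missing_from "rd.driver.blacklist=nouveau modprobe.blacklist=nouveau nvidia-drm.modeset=1" "$(cat /proc/cmdline)"
```

Empty output would mean every configured argument made it into the running kernel's command line; any lines printed are arguments that the current boot did not pick up.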
This is all very weird. I can’t re-create the successful blacklisting of nouveau drivers. Thank you for all the help, i’ll be trying another clean installation, with secure boot off from the start
you can try it manually
I ran that, rebooted when it told me to, and came back to no blacklist added to the rpm-ostree kargs. My system is broken in a really special sort of way.
Btw, I did a clean install, and things are still being broken. I’ll get back in contact if I ever figure out a fix, in case it could be useful to others on particularly weird Nvidia laptops.
Just wanted to chime in to say that I have that Nvidia powerd message on reboot too, but in my case the drivers load, so I think that can be safely disregarded.
which nvidia card are you running?
i think nvidia-powerd fails by design (thanks for nothing Nvidia, Inc) on older cards ... and by old I mean 10-series and 16-series... even though they work with the current driver...
i'm pretty sure 20-series and newer is where it should succeed
but yeah, regardless, the real test is if the drivers load... and nvidia-smi is one of the simplest ways to validate that
What’s the trick to force an app to run on the discrete gpu on kde? I’ve been just checking to see if nvtop can see the card like it used to on silverblue-nvidia, but I honestly haven’t tried launching an app on the card since the install just cause I don’t know how
You need those kargs sorted first
No point even trying to use your gpu until they work
That would be a 1070, so that makes sense
terminal or try these:
https://github.com/ValveSoftware/steam-for-linux/issues/10026
https://github.com/ValveSoftware/steam-for-linux/issues/9383
https://github.com/ValveSoftware/steam-for-linux/issues/9940
but yeah i think you need the kargs sorted out
Thanks
Try updating to the latest version
Your issue may be solved
holy shit yall fixed it, thank you so much
i'm gonna do one more reinstall probably, just cause I want to play around with whether I have consistency with all this, but it is working right now! So happy rn
Happy to report that the Nvidia stuff continues to work after another reinstall. The bazzite setup portal didn’t auto start on the first boot, which was a little weird, but nothing else is misbehaving. (Other than all my usual issues, like reboots turning into shutdowns when the lid is closed and other dumb laptop nonsense) Thanks for all y’all’s hard work to make this a great user experience!
I'm still having this problem on the latest version of bazzite-nvidia desktop.
The kernel arguments are present and secure boot is set up.
Nvidia 1060 6GB
If you're just seeing a message in terminal with no other issues the latest update may correct that
I just ran rpm-ostree upgrade and rebooted.
I'm on 38.20230930
What are your symptoms?
The terminal tells me the Nvidia driver is not loaded
Nvtop also says that there is no GPU to monitor
Even though neofetch lists the GPU
lsmod | grep nvidia
Returns nothing
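A slightly stricter variant of the check above, in case it is useful: grep nvidia also matches related modules such as nvidia_drm, so this sketch looks for a module named exactly nvidia in lsmod-style output. The function name nvidia_module_listed is an assumption made up for this example, not an existing tool.

```shell
# Hypothetical helper: reads lsmod-style text on stdin and exits 0
# only if a module named exactly "nvidia" is listed in the first column.
nvidia_module_listed() {
    awk '$1 == "nvidia" { found = 1 } END { exit !found }'
}

# Possible usage on the affected machine (not run here):
#   lsmod | nvidia_module_listed && echo "nvidia module is loaded" || echo "nvidia module is NOT loaded"
```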
dmesg | grep nvidia | fpaste
I'm not sure what's going on here, this shows the correct kargs and that the driver is loading
Weirdo
Firmware Security in the Info Center says Secure Boot is enabled
Says Linux kernel is tainted 🤔
That's normal with NVIDIA
The license itself taints it
Ah
And your dmesg doesn't show the errors you'd get if it was being blocked by lack of key for secure boot
Probably a quirk of dual GPU?
Which is to say I'd check the arch wiki page on it and try some of their suggestions
Mainly the GPU switching stuff
This is a system with a Xeon, so no iGPU
Ah I missed that
This is REALLY weird then
I was in Hybrid mode, so I just switched to fully integrated
We'll see if that fixes it
Wait, no
It should be fully discrete
Wtf
Okay, back to hybrid
Says dGPU power active