StardustXR Server causes Monado to segfault immediately

Not sure where the problem lies, but xrgears runs fine. I'm running the latest Garuda Linux on basically a fresh install, and I've tested this on the Xfce and KDE/Wayland flavors. Both behave the same. This is with the newest Monado (v24.0.0) and the latest Stardust. This is inside a VFIO VM, but I've tested SteamVR inside it, so I suspect that's not relevant. Should I be using an earlier version of Monado? I tried getting v21, but it was far too old to compile easily on Garuda.
26 Replies
Nova
Nova6mo ago
what stardust server version are you using? is it main or dev branch?
Lost
LostOP6mo ago
Ah, the main branch. Let me try dev
Nova
Nova6mo ago
dev is... very broken :p anyway, this is so weird. can you run monado using a debugger? envision has one
Lost
LostOP6mo ago
Haha, okay, I won't try dev. I can grab envision. How would I open the debugger? Or could you point me to some docs?
Nova
Nova6mo ago
oh if you don't use envision then just run gdb on it
Lost
LostOP6mo ago
[New Thread 0x7fffc29f36c0 (LWP 13293)]
INFO [client_loop] Client 1 connected
INFO [ipc_handle_instance_describe_client] Client info:
id: 1
application_name: 'Stardust XR'
pid: 13280
extensions:
ext_hand_tracking_enabled: true
ext_eye_gaze_interaction_enabled: true
ext_hand_interaction_enabled: false
[New Thread 0x7fffc21f26c0 (LWP 13294)]

Thread 16 "monado-service" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffc21f26c0 (LWP 13294)]
0x00007fffe6a07a87 in ?? () from /usr/lib/libnvidia-glcore.so.550.90.07
(gdb) backtrace
#0 0x00007fffe6a07a87 in ?? () from /usr/lib/libnvidia-glcore.so.550.90.07
#1 0x00007fffe6e22d99 in ?? () from /usr/lib/libnvidia-glcore.so.550.90.07
#2 0x00007fffe6e07790 in ?? () from /usr/lib/libnvidia-glcore.so.550.90.07
#3 0x00007fffe6d237ee in ?? () from /usr/lib/libnvidia-glcore.so.550.90.07
#4 0x00005555556b73d3 in fence_wait (xcf=<optimized out>, timeout=<optimized out>) at /usr/src/debug/monado/monado-v24.0.0/src/xrt/compositor/util/comp_sync.c:64
#5 0x00005555556c841f in xrt_compositor_fence_wait (xcf=0x7fffb4029d80, timeout=100000000) at /usr/src/debug/monado/monado-v24.0.0/src/xrt/include/xrt/xrt_compositor.h:753
#6 wait_fence (xcf_ptr=<optimized out>) at /usr/src/debug/monado/monado-v24.0.0/src/xrt/compositor/multi/comp_multi_compositor.c:132
#7 run_func (ptr=0x7fffb4000bf0) at /usr/src/debug/monado/monado-v24.0.0/src/xrt/compositor/multi/comp_multi_compositor.c:291
#8 0x00007ffff38a6ded in start_thread (arg=<optimized out>) at pthread_create.c:447
#9 0x00007ffff392a0dc in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78
(gdb)
Let me know if you need more and thanks for the help!
Nova
Nova6mo ago
ugh nvidia. uhhh, so i think this is a monado problem. https://discord.gg/dTdPVDVN you should probably ask here since it's on the service side. stardust shouldn't be able to cause a crash regardless of how broken it might be
Lost
LostOP6mo ago
Fair enough! I'll bring it up there. Thanks again
Nova
Nova6mo ago
no problem
Schmarni
Schmarni6mo ago
this looks like the same issue i was having btw, it's something in stereokit that triggers the issue
Nova
Nova6mo ago
hmm, well if it's stereokit, maybe the dev branch on the server will help you then. i updated it there, but don't expect any clients to work yet (except maybe flatland on dev branch), just the hands
Schmarni
Schmarni6mo ago
i don't think it will, since even building stereokit has the same issue
Nova
Nova6mo ago
wait building stereokit has an issue?
Schmarni
Schmarni6mo ago
ye but i meant after I tested stardust, a clean stereokit-rs example, a stereokitC app with precompiled libs, and the StereokitC test app completely compiled on my machine. they all crash the runtime the same way, and i have had no problems like this with any other vr app
Nova
Nova6mo ago
huh that's very strange
Schmarni
Schmarni5mo ago
i should probably just bite the bullet and make a github issue on stereokit, since i have been trying to fix this issue on and off for months, and trying to debug a codebase i don't know in a language i don't know doesn't seem like the best use of my time
Nova
Nova5mo ago
i wish i could help, but i'm so out of practice with that codebase. you could ask in the stereokit server tho
Schmarni
Schmarni5mo ago
made an issue here: https://github.com/StereoKit/StereoKit/issues/1033 i made a "fix" for this. it's a problem with the nvidia drivers not being able to cope with the use of a fence. i had to patch monado
Nova
Nova5mo ago
oh thank!!!!
Schmarni
Schmarni5mo ago
in my testing this didn't break anything, but i didn't test it that much yet
Skull
Skull5mo ago
Awesome! Huge thanks. Think stereokit will take this upstream?
Schmarni
Schmarni5mo ago
this is a monado patch?
Skull
Skull5mo ago
Oh! Are you able to add an nvidia vendor check and PR monado? This would solve a giant class of issues for a large group of users currently looking to sample the wares
Nova
Nova5mo ago
ah hi
Skull
Skull5mo ago
It will apparently cause graphical artifacts if you use this patch. "Look for insert_fence_func in comp_egl_client.c, set it to NULL." is what I was told to relay
Schmarni
Schmarni5mo ago
yeah i was expecting issues with it, it was just an extremely hacky solution that worked on my machine. for an upstream solution i'll look into a better approach
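For anyone skimming the thread: the workaround discussed above (set insert_fence_func to NULL) combined with the suggested nvidia vendor check could look roughly like the sketch below. This is not Monado's actual code. The type insert_fence_fn and the functions insert_fence_default, choose_insert_fence and submit_frame are all hypothetical names made up for illustration, and the vendor string is passed in as a plain string (in a real program it would likely come from something like glGetString(GL_VENDOR)). As relayed above, skipping the fence reportedly causes graphical artifacts, so treat this as a stopgap at best.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type: "insert a fence after the client's rendering".
 * Returns true if a fence was inserted. Not Monado's real signature. */
typedef bool (*insert_fence_fn)(int *fence_count);

/* Hypothetical default implementation: just counts inserted fences. */
static bool
insert_fence_default(int *fence_count)
{
	(*fence_count)++;
	return true;
}

/* Hypothetical vendor check: return NULL (no fence) when the GL vendor
 * string mentions NVIDIA, otherwise return the default callback. */
static insert_fence_fn
choose_insert_fence(const char *gl_vendor)
{
	if (gl_vendor != NULL && strstr(gl_vendor, "NVIDIA") != NULL) {
		return NULL; /* workaround path: skip the fence entirely */
	}
	return insert_fence_default;
}

/* The caller has to tolerate a NULL callback, otherwise the workaround
 * just moves the crash somewhere else. Returns true if a fence was used. */
static bool
submit_frame(insert_fence_fn fn, int *fence_count)
{
	assert(fence_count != NULL);
	if (fn == NULL) {
		return false; /* no fence: nothing for the service to wait on */
	}
	return fn(fence_count);
}
```

With this shape, a vendor string containing "NVIDIA" selects the no-fence path and everything else keeps the default behaviour, so users on other drivers would not be affected by the workaround.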