GPU pod's performance is inconsistent

I am using a pod (RTX 4090 with a 100 GB network volume) to generate images. Normally a task takes around 5-6 s to finish, as expected, but sometimes performance drops to 30 s per task. Can anyone explain what's going on? Thank you so much.
6 Replies
digigoblin (3w ago)
Run nvidia-smi to check whether the host has enabled a power cap.
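For anyone who wants to script that check, here is a minimal sketch in Python. It assumes nvidia-smi is on the PATH and that the driver supports the power.draw, power.limit, power.default_limit, utilization.gpu and clocks.sm query fields. The helper names (parse_gpu_line, query_gpus) and the "limit below default means capped" rule are illustrative assumptions, not an official procedure.

```python
import subprocess

# Fields requested from nvidia-smi; all values come back as plain numbers
# because of the "nounits" format option.
QUERY_FIELDS = "power.draw,power.limit,power.default_limit,utilization.gpu,clocks.sm"


def _to_float(text):
    """Convert one CSV cell to float; return None for values such as [N/A]."""
    try:
        return float(text.strip())
    except ValueError:
        return None


def parse_gpu_line(line):
    """Parse one CSV row (hypothetical helper) into a dict of readings."""
    draw, limit, default_limit, util, sm_clock = [_to_float(x) for x in line.split(",")]
    capped = None
    if limit is not None and default_limit is not None:
        # Assumption: an enforced limit below the default limit indicates a power cap.
        capped = limit < default_limit
    return {
        "power_draw_w": draw,
        "power_limit_w": limit,
        "default_limit_w": default_limit,
        "util_pct": util,
        "sm_clock_mhz": sm_clock,
        "power_capped": capped,
    }


def query_gpus():
    """Run nvidia-smi once and return one dict per GPU."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY_FIELDS}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [parse_gpu_line(row) for row in out.strip().splitlines()]
```

Calling query_gpus() on the pod returns one dict per GPU; a power_capped value of False would match a reading where the limit is still at the card's default.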
LisT_99 (3w ago)
I watched nvidia-smi while an inference task was running; power usage is around 40-70 W out of 450 W.
digigoblin (3w ago)
A 450 W limit means it's not power capped.
LisT_99 (3w ago)
I don't know, but I am running the same task (everything is the same): if power usage is 70 W/450 W, it takes 30 s, and if power usage is 200 W/450 W, it takes 5 s. Why is it so inconsistent? How can I configure it to be more stable?
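One way to put numbers on this is to record power draw and GPU utilization for each task alongside its duration, so slow and fast runs can be compared side by side. The sketch below is only an illustration: run_with_sampling is a hypothetical helper, and sample_fn is assumed to be any function you supply that returns a (power_watts, utilization_percent) pair, for example one that reads nvidia-smi.

```python
import threading
import time
from statistics import mean


def run_with_sampling(task, sample_fn, interval_s=0.5):
    """Run task() while a background thread polls sample_fn().

    sample_fn is assumed to return a (power_watts, utilization_percent) tuple.
    Returns (task_result, stats_dict).
    """
    samples = []
    stop = threading.Event()

    def sampler():
        # Poll until the main thread signals that the task has finished.
        while not stop.is_set():
            samples.append(sample_fn())
            stop.wait(interval_s)

    thread = threading.Thread(target=sampler, daemon=True)
    start = time.perf_counter()
    thread.start()
    try:
        result = task()
    finally:
        stop.set()
        thread.join()
    elapsed = time.perf_counter() - start

    stats = {
        "seconds": elapsed,
        "n_samples": len(samples),
        # Averages are None if the task ended before any sample was taken.
        "avg_power_w": mean(s[0] for s in samples) if samples else None,
        "avg_util_pct": mean(s[1] for s in samples) if samples else None,
    }
    return result, stats
```

If the slow runs show both low power and low utilization, that would suggest the GPU is spending much of the task waiting on something else rather than being throttled, though this sketch cannot tell you what it is waiting on.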
nerdylive (3w ago)
I'm not sure what's causing the instability.
digigoblin (3w ago)
Probably some bug in the application