Stephen
RunPod
•Created by Stephen on 7/11/2024 in #⛅|pods
🆘 We've encountered a serious issue with the machines running in our production environment
thx
28 replies
OK
how to report to runpod?
This is going to take up more of our time, and we are short-staffed. I just want to know if Runpod has technical personnel who can help us troubleshoot this issue. We have checked the code logic and found no issues.
GPU utilization fluctuates wildly, sometimes even dropping to zero, and we haven't changed anything!
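A minimal sketch for capturing this kind of fluctuation while it happens, assuming the NVIDIA driver tools (`nvidia-smi`) are available inside the pod:

```python
import subprocess
import time

def parse_utilization(nvidia_smi_output):
    """Parse `nvidia-smi --query-gpu=utilization.gpu --format=csv,noheader,nounits`
    output: one integer percentage per line, one line per GPU."""
    return [int(line.strip()) for line in nvidia_smi_output.splitlines() if line.strip()]

def sample_gpu_utilization(samples=30, interval=1.0):
    """Poll GPU 0's utilization every `interval` seconds; returns a list of percentages."""
    readings = []
    for _ in range(samples):
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=utilization.gpu",
             "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=True,
        )
        readings.append(parse_utilization(out.stdout)[0])
        time.sleep(interval)
    return readings
```

Running this alongside the serving process during a slow request shows whether utilization really sits at zero (kernels not being launched at all) or oscillates, which would point at host-side or input-pipeline stalls rather than the GPU itself.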
yes
inference speed is extremely low
but gpu kernels are not running at all
yeah
Despite all other conditions remaining unchanged, sometimes the inference speed is fast, and at other times it is very slow, even though the model has already been loaded into the GPU memory.
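One way to quantify this fast/slow variance is to time repeated identical inference calls and compare the median latency to the worst case; `fn` below is a stand-in for the actual model call, which is not shown in the thread:

```python
import statistics
import time

def time_calls(fn, n=20):
    """Time n calls of fn; returns a list of per-call latencies in seconds."""
    latencies = []
    for _ in range(n):
        start = time.perf_counter()
        fn()
        latencies.append(time.perf_counter() - start)
    return latencies

def summarize(latencies):
    """A worst/median ratio far above 1 indicates intermittent stalls
    rather than uniformly slow inference."""
    med = statistics.median(latencies)
    worst = max(latencies)
    return {"median": med, "worst": worst, "ratio": worst / med}
```

A log of these summaries over time would make it easy to show the provider exactly when the slowdowns occur and how severe they are.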
During the inference process, we received feedback from users that the inference speed was particularly slow. Upon checking, we confirmed that the issue was indeed related to the inference, but the GPU utilization was either zero or very low.
The reason we're using SOS is that we've encountered this issue in a production environment, which directly affects the user experience, but I don't know who to turn to for help.