andypotato
RunPod
•Created by andypotato on 3/3/2025 in #⚡|serverless
Async workers not running
hey that's interesting. thanks for pointing that out. I wasn't even aware there is an option for "allowed cuda versions"
I can confirm this is now working as expected, thank you for that 🫶
I will further investigate this issue on my local system. It could be related to CUDA versions too. If that's the case, maybe a note in the docs or an error message to the user would be helpful. It is otherwise pretty much impossible to debug this issue as no logs will be generated on the worker console
10 replies
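For checking the CUDA theory on a local system, a minimal sketch follows. It assumes `nvidia-smi` is available inside the container and relies on its header line reporting the highest CUDA version the driver supports. The helper names and the `required` version are hypothetical; the version the image actually needs has to come from its own base image or build notes.

```python
import re
import subprocess


def parse_cuda_version(smi_output):
    """Extract (major, minor) from the 'CUDA Version: X.Y' field of nvidia-smi output."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", smi_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def driver_supports(required, driver_max):
    """True if the driver's maximum CUDA version is at least the required one."""
    return driver_max is not None and driver_max >= required


def read_nvidia_smi():
    """Return raw nvidia-smi output, or an empty string if the tool is missing."""
    try:
        result = subprocess.run(
            ["nvidia-smi"], capture_output=True, text=True, check=False
        )
    except FileNotFoundError:
        return ""
    return result.stdout


def report(required):
    """Print a one-line diagnostic; `required` is an assumed (major, minor) tuple."""
    found = parse_cuda_version(read_nvidia_smi())
    ok = driver_supports(required, found)
    print(f"driver max CUDA: {found}, image needs: {required}, compatible: {ok}")
    return ok
```

Calling `report((12, 1))` at the top of the handler script, before the worker starts, would at least put one line in the console even if nothing else is logged.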
RunPod
•Created by blue whale on 2/11/2025 in #⚡|serverless
Job stuck in queue and workers are sitting idle
This is probably the same issue as I have described in a separate report https://discord.com/channels/912829806415085598/1345960498478321735
- Jobs are simply never executed despite the worker running and even other workers being available
- Querying the job status will make it run immediately
- Only happens with jobs started via run but not runsync
- Exact same behavior in local testing environment
48 replies
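The observations above (an async job sits in the queue until its status is queried) can be reproduced with a small script along these lines. This is a sketch under assumptions, not a confirmed repro: it assumes the serverless REST routes `https://api.runpod.ai/v2/<endpoint_id>/run` and `/status/<job_id>` with a Bearer API key, and assumes the status names listed in `TERMINAL`. The helper names and the `fetch` and `sleep` parameters are hypothetical and exist only so the polling logic can be exercised without network access.

```python
import json
import time
import urllib.request

API_BASE = "https://api.runpod.ai/v2"  # assumed base URL of the serverless API
TERMINAL = {"COMPLETED", "FAILED", "CANCELLED", "TIMED_OUT"}  # assumed status names


def build_url(endpoint_id, action, job_id=None):
    """Build the URL for 'run', or for 'status' when a job_id is given."""
    url = f"{API_BASE}/{endpoint_id}/{action}"
    return f"{url}/{job_id}" if job_id else url


def http_json(url, api_key, payload=None):
    """POST the payload as JSON when given, otherwise GET; return the decoded JSON."""
    data = json.dumps(payload).encode() if payload is not None else None
    request = urllib.request.Request(
        url,
        data=data,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST" if data is not None else "GET",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def submit_and_poll(endpoint_id, api_key, job_input, fetch=http_json,
                    interval=2.0, max_polls=30, sleep=time.sleep):
    """Submit an async job, then poll its status; return every status seen."""
    job = fetch(build_url(endpoint_id, "run"), api_key, {"input": job_input})
    history = [job.get("status")]
    for _ in range(max_polls):
        status = fetch(build_url(endpoint_id, "status", job["id"]), api_key)
        history.append(status.get("status"))
        if status.get("status") in TERMINAL:
            break
        sleep(interval)
    return history
```

Comparing a run that polls right away against one that waits a long time before the first status call should show whether the status query is what triggers execution.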
RunPod
•Created by andypotato on 3/3/2025 in #⚡|serverless
Async workers not running
This is a serious issue because it completely breaks running workers async - I really hope you can look into this as soon as possible. If you need any support from my end with testing I am happy to provide.
10 replies
RunPod
•Created by andypotato on 3/3/2025 in #⚡|serverless
Async workers not running
@yhlong00000 @Dj I have tried the same endpoint again and spawned a worker atjrfyd1c9zzgz - The result is exactly the same, the worker will start running and simply waste credits without ever executing rp_handler.py
10 replies
RunPod
•Created by andypotato on 3/3/2025 in #⚡|serverless
Async workers not running
@yhlong00000 I don't think this is a single machine issue. If you read the observations that I shared in my issue report, this issue has occurred on every worker, including on my own local machine when testing the container.
The only reason why I deployed this container to Runpod is to check if the issue can be reproduced on the cloud, and it can.
10 replies
RunPod
•Created by andypotato on 3/3/2025 in #⚡|serverless
Async workers not running
Here is my endpoint ID:
os1z7gv7hgacgd
10 replies
RunPod
•Created by andypotato on 2/22/2024 in #⚡|serverless
Returning error, but request has status "Completed"
Thank you Askley, it's working now 🙂
5 replies