Issue with a worker hanging at start
No code changes. I'm also hitting an issue where workers are stuck with "loading image from cache"
dont see anything awry on docker hub
can you take a screenshot of your template / max / min workers?
if ur max is 1, i generally recommend it to be at least 2, usually helps clear it out
im also getting charged for this. Worker ID for the stuck job (active but not starting) : wb5t7bw3m6gwp4
Kill the job too?
im at 2/2
Oh interesting
I think just kill them all right now
Just wondering
is this a template that previously worked?
yeah
no changes
I see
ill kill it - wanted to keep it up in case you guys wanted to debug
Yeah, just set it to 0, share the pod ID
i think it's best for the staff to look at it, im just a community member
interesting tho 😮
Pod ID in your description will be best for the staff to look at it when they wake up.
ah, thanks for helping though!
hey runpod staff, it's endpoint
3og40zffhp6irh
and worker ID wb5t7bw3m6gwp4
for more context, I ran this endpoint yesterday with 0 issues
Which region is this? By the way, you are also running a pretty old version of the SDK. If you are having issues, it's best to upgrade to the latest SDK first and check whether that resolves them.
Is this SE region by any chance?
where can i tell what region? ill upgrade the python SDK version to check
It's all regions if you're not using a network volume
upgraded to 1.5.1. worker booted fine, and then on a subsequent worker boot same issue where it's stuck on this:
worker id: g1yx4w87tncbwb
@ashleyk ^
What kind of worker are you running?
i cant pull up what card it was using - when this happens again ill get a full dump
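For anyone following the "upgrade the SDK first" advice above, here is a minimal sketch of checking which SDK version a worker image actually has installed. It assumes the SDK is installed under the PyPI package name `runpod` and uses 1.5.1 only because that is the version mentioned in this thread. `version_tuple` and `installed_at_least` are hypothetical helper names, not part of the SDK.

```python
import re
from importlib import metadata


def version_tuple(version):
    """Turn '1.5.1' into (1, 5, 1); stops at the first non-numeric piece."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def installed_at_least(package, minimum):
    """Return True if `package` is installed at version `minimum` or newer."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        # Package is not installed in this environment at all.
        return False
    return version_tuple(installed) >= version_tuple(minimum)


if __name__ == "__main__":
    # Assumption: the SDK's distribution name is "runpod".
    print(installed_at_least("runpod", "1.5.1"))
```

Running this inside the container before the handler starts is one way to confirm the image really picked up the upgrade rather than a cached older layer.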