Runpod queue not processing
Hey,
I deployed Kandinsky 2.1 as a serverless application, then hit the run endpoint and the request was queued. When I check the status by ID, it is still in_queue.
Can anyone help resolve this issue?
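For anyone checking the same thing from a terminal, here is a minimal sketch of polling a queued job. It assumes RunPod's serverless REST API is served under https://api.runpod.ai/v2/ with health and status/<job id> paths and Bearer-token auth. The helper name runpod_get and the ENDPOINT_ID, JOB_ID and RUNPOD_API_KEY variables are placeholders, not something taken from this thread.

```shell
# Hypothetical helper: GET a path on a RunPod serverless endpoint.
# Assumes RUNPOD_API_KEY is exported in the environment.
runpod_get() {
  # $1 = endpoint ID, $2 = path such as "health" or "status/<job id>"
  curl -s -H "Authorization: Bearer ${RUNPOD_API_KEY}" \
    "https://api.runpod.ai/v2/$1/$2"
}

# Example usage (placeholders, fill in your own values):
#   runpod_get "$ENDPOINT_ID" "status/$JOB_ID"   # status of one job
#   runpod_get "$ENDPOINT_ID" health             # worker and queue counts
```

If the job stays queued while the health output shows no idle or ready workers, the problem is likely on the worker side rather than with the request itself.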
78 Replies
Works for me, did you deploy it yourself or using the RunPod managed one?
He is using 3.0, not 2.1... deployed by himself
You guys work together?
yes
Are all your workers throttled?
look
Why do you only have 1 worker? You should leave it at the default of 3
The 1 worker is throttled, so that's why your requests are stuck in the queue.
It can only accept requests from the queue when it shows as Ready, like this.
Changed here to 3
maybe the problem is the worker code...
Are you using network storage?
It's not a problem with the worker code; all your workers are throttled again.
Try changing to a different GPU tier instead of 24GB, such as 24GB PRO or 48GB. You need to scale your workers down to zero and back up again for it to take effect.
Look bro
Is that your own image or someone else's image?
Looks the same name as yours so I assume its yours?
If you made the repository private on Dockerhub, you need to add the registry credentials to RunPod as well as to your endpoint template.
Then select the credentials here in your serverless template.
Then scale max workers down to zero and back again for the changes to take effect.
it is my image and repo is currently public
This AMD64 says: Image may have poor performance, or fail, if run via emulation
Didn't you build it for Linux? How did you build it?
my computer is macOS
Also don't use the latest tag, that is bad practice for serverless. It's fine for Pods, but not for Serverless. For Serverless you should use version tags and releases.
sure, thanks
Basically use buildx and --platform linux/amd64.
i should put the tag like: version1 ?
bro, do you know if I can do git clone before that?
I used these commands when I created the image:
git clone https://github.com/rafaelvmfranco/repo
cd repo
docker build . -t rafaelfranco21/kandinsky:latest
docker push rafaelfranco21/kandinsky:latest
Oh yeah, your docker command is wrong for Mac
Something like that should work.
The git command is not really relevant, just the docker build command.
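As a rough sketch of the kind of command meant here, assuming Docker Desktop on a Mac with buildx available: build for linux/amd64, use a version tag instead of latest, and push in one step. The helper name build_and_push and the 1.0.0 version are illustrative assumptions, not the exact command that was posted.

```shell
# Hypothetical helper: cross-build the image in the current directory
# for linux/amd64 and push it to the registry in one step.
build_and_push() {
  # $1 = full image reference including a version tag
  docker buildx build --platform linux/amd64 -t "$1" --push .
}

# Example usage, run from inside the cloned repo:
#   build_and_push rafaelfranco21/kandinsky:1.0.0
```

The --push flag replaces the separate docker push step, so the image goes straight to the registry after the build.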
I got this error: ERROR: failed to solve: process "/bin/bash -c python /cache_models.py" did not complete successfully: exit code: 1
Your disk is out of space
my computer or docker?
Your computer probably
Should show, or you can also check in Finder
Bottom one is full
actually I don't understand because it says size is 0Bi
But google docker prune commands, you can probably prune to free up space
I will delete some apps
Looks like the / disk is fine though
Maybe your buildx has a size limit or something, I don't know
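A minimal sketch of the disk checks discussed above, wrapped in a hypothetical helper named check_and_free_space. One possibility, offered as an assumption rather than a diagnosis, is that the Mac's own disk is fine but the space Docker itself manages (images and build cache) is full, which docker system df would show.

```shell
# Hypothetical helper: show disk usage on the host and inside Docker,
# then offer to clear the build cache.
check_and_free_space() {
  df -h /                # free space on the root volume
  docker system df       # space used by images, containers, build cache
  docker builder prune   # asks for confirmation, then removes build cache
}

# Example usage:
#   check_and_free_space
# If that is not enough, "docker system prune" also removes stopped
# containers, unused networks and dangling images (again with a prompt).
```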
crazy thing!
Im not even a developer
hahah
Me neither, at least not anymore
what are you now?
I have zero code knowledge... im just trying like crazy with your help
By the way, you can also try using https://depot.dev to build your image. They used to have a free plan, but apparently it's been removed and you have to pay now. It may be worth paying to save yourself a lot of headaches and time; as they say, "time is money".
They probably had to make it paid because of the large number of RunPod people who started using it
I dont think I would know how to use it
but thanks!
Bro, I created the image!
Already created the new runpod endpoint using the new image
now, I will see if it gets the ready status
still initializing
too good to be true
Click on one to view the logs
there is an error
What is the image set to in your template?
Does it have a tag?
this
rafaelfranco21/kandinsky3.0:1:0:0
maybe it is because the docker repo is private
but I already created the credentials like you said
mistake is here
Yeah should probably be dots not colons
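To catch this kind of typo before pasting the image name into a template, here is a small sketch of a sanity check. It assumes the usual Docker tag rules (letters, digits, underscores, dots and hyphens, not starting with a dot or hyphen, at most 128 characters). The function name is_valid_image_ref is made up for illustration, and it deliberately rejects references with no tag at all.

```shell
# Hypothetical check: does the image reference end in one well-formed tag?
is_valid_image_ref() {
  last=${1##*/}                # part after the last slash
  case $last in
    *:*) tag=${last#*:} ;;     # everything after the first colon
    *) return 1 ;;             # no tag given
  esac
  case $tag in
    ''|[.-]*|*[!A-Za-z0-9_.-]*) return 1 ;;   # empty, bad start, or bad character
  esac
  [ "${#tag}" -le 128 ]
}

# Example usage:
#   is_valid_image_ref "rafaelfranco21/kandinsky3.0:1.0.0" && echo ok
#   is_valid_image_ref "rafaelfranco21/kandinsky3.0:1:0:0" || echo "bad tag"
```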
yes
now I got another error
2024-03-01T18:16:35Z error pulling image: Error response from daemon: Head "https://registry-1.docker.io/v2/rafaelfranco21/kandinsky3.0/manifests/1.0.0": unauthorized: incorrect username or password
Looks like your registry credentials are incorrect, double check them or make your image public
when I run docker login
in the terminal it says:
Login Succeeded
Yeah, but that's probably with a username and password. You should use an access token for pulling your images and make it read-only: https://hub.docker.com/settings/security
Like this
There is one of this here
so I need to create one only for runpod
I generated it
what do I do with it bro?
Edit your docker auth credentials on RunPod and put it in there
username is the docker username
and password is the token that I just generated
Yes
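Before saving the token on RunPod, it can be checked locally. The sketch below is one way to do that, assuming a reasonably recent Docker CLI where docker manifest inspect is available. The helper name verify_docker_token and the DOCKER_TOKEN variable are placeholders for illustration.

```shell
# Hypothetical helper: log in with an access token instead of the account
# password, then fetch only the image manifest to confirm read access.
# Assumes the token is exported as DOCKER_TOKEN.
verify_docker_token() {
  # $1 = Docker Hub username, $2 = full image reference with tag
  printf '%s' "$DOCKER_TOKEN" | docker login -u "$1" --password-stdin &&
    docker manifest inspect "$2"
}

# Example usage:
#   verify_docker_token rafaelfranco21 rafaelfranco21/kandinsky3.0:1.0.0
```

Fetching the manifest avoids downloading the whole image just to test the credentials. If it fails with "unauthorized", the same username and token will most likely fail on RunPod too.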
Ok, thanks
I will now scale workers to zero
to reset it
now it says: 2024-03-01T18:41:52Z 14ef900132c4 Downloading [=============================================> ] 18.45GB/20.07GB
several lines of this
so it seems that it is working
Yeah, it's pulling the image
these extra workers are ready
the latest workers are not
What do they say when you click on them?
Now, all say worker is ready
but status is initializing
now, ready
Yeah it takes a while for all the workers to become ready the first time and when you do a new release.
Thanks
The "Container Registry Auth" explanation should mention that we need to generate a "Read-only" Docker token