Incredibly long queue for CPU Compute on Toy Example
I am trying to run the code in this blog post: https://blog.runpod.io/serverless-create-a-basic-api/. The wait time for this simple function to execute has been above 5 minutes. There are two items in the queue that are not moving, and there are no errors in the logs. It appears stuck in the "Initializing" state with no workers spinning up. How can I fix this?
Also when I tried to create the endpoint, the UI would not allow me to select the template I created earlier.
RunPod Blog
Serverless | Create a Custom Basic API
RunPod's Serverless platform allows for the creation of API endpoints that automatically scale to meet demand. The tutorial guides you through creating a basic worker and turning it into an API endpoint on the RunPod serverless platform. For this tutorial, we will create an API endpoint that helps us accomplish…
36 Replies
It's still initializing; maybe it's still pulling the image from the registry
Click one of the workers on your endpoint and there will be logs
Are you building on mac?
Just wondering
Not allowing? Any errors?
Sometimes people build on Mac, and that blog unfortunately doesn't mention that you need a:
--platform linux/amd64 flag when building your image
Then RunPod still pulls that image, but nothing can run and it can get caught in a loop
Ahh, will that cause an error on RunPod?
Wow in a running state?
I am building on a mac
@Merrell / @Papa Madiator Is it possible to update that blog either way with a note/warning about the --platform flag? Whether or not it's the cause of this particular problem, it's something that comes up all the time
Yeah, just kill your current endpoint and add:
--platform linux/amd64
https://docs.docker.com/build/building/multi-platform/#:~:text=The%20%2D%2Dplatform%20flag%20informs,the%20images%20to%20Docker%20Hub.
Not sure if you need buildx or if it's just built into the base Docker tooling now
but you can build cross-platform
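For reference, a cross-platform build from an Apple Silicon Mac typically looks something like this (the image name `youruser/yourimage` is a placeholder; substitute your own repo and tag):

```shell
# Build an amd64 image from an arm64 Mac with buildx and push it to Docker Hub.
docker buildx build --platform linux/amd64 -t youruser/yourimage:latest --push .

# Newer Docker versions also accept --platform on the classic builder:
docker build --platform linux/amd64 -t youruser/yourimage:latest .
docker push youruser/yourimage:latest
```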
There are no workers to click on
Ah, I mean, set your max workers to 0
and delete it
It shows one that was throttled
And then when you make a new one in the future, set max workers to 3 (RunPod acts weird when max workers is 1, I find), and point it at the new Docker Hub image built with the right platform flag
Yup! Just set max workers to 0, delete the endpoint
rebuild your image
repush it
and then try again 🙂
ok will try that now. hold up
yup! In the future, just make sure that you always have that --platform flag
Personally, the workflow I use because of that (since I'm also on a Mac):
1) I spin up a GPU Pod with the PyTorch template
2) I manually go through the installation steps on the GPU Pod and write them down
3) I then try to translate that into a Dockerfile, build the image, and test it on my GPU Pod as a new template pointed at my new build. If it's a serverless thing, you can just remove the runpod.serverless.start() call when you open the file on the GPU Pod, and manually call the handler function for testing.
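Step 3 can be sketched roughly like this, following the handler shape from the basic-API blog post (the handler name and input keys here are illustrative, not from the post):

```python
# Minimal sketch of a RunPod-style serverless handler.
# RunPod passes the request payload to the handler under job["input"].
def handler(job):
    name = job["input"].get("name", "World")
    return {"greeting": f"Hello, {name}!"}

# On a GPU Pod, comment out the serverless start call and invoke the
# handler directly to test the logic without the serverless runtime:
# import runpod
# runpod.serverless.start({"handler": handler})

if __name__ == "__main__":
    # Simulate an incoming job for local testing
    print(handler({"input": {"name": "RunPod"}}))
```

Once the handler behaves as expected when called by hand, you re-enable the start call and build the image.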
A more advanced example repo I built using the above methodology, for future reference:
https://github.com/justinwlin/Runpod-OpenLLM-Pod-and-Serverless
adding the
--platform linux/amd64
flag causes the Docker build to spin endlessly
ah never mind, it just succeeded, but it was slow as hell xD. The cold start can be a pain in the butt, but after the first download of the Docker image
it should be cached on runpod's end
maybe try to send a few more requests and see what happens
also, usually after the first request is done, the worker has a bit of idle time (which you can adjust) before it shuts down
to keep processing requests
so it can be a bit of a balancing act: the first request gets hit with the cold start, then for all subsequent requests the machine is already on and can respond fast
this instantly disappears as soon as I finish typing. This is on the endpoints page.
That's unusual, did you add your template yet?
Maybe refreshing the page works
I think that is a dropdown?
not sure why it would disappear tho
maybe make a template separately first, then use the dropdown button instead?
might be a ui bug
I did that and it didn't work. The endpoint does work now though after setting max workers to 3
hm weird. feel free to record and file a bug if needed.
I see the template available for a Pod but not for the serverless endpoint
Ohh maybe you created it for pods?
When I tried to edit it, it wouldn't let me change it. I'll try again
btw with the serverless endpoints, if I set the max workers to 3 but the minimum to 0, will I pay if the workers are idle?
Hmm maybe just create a new one for serverless
Minimum? You mean active workers?
max to 3, minimum to 0: you don't pay anything
unless they go active and are working
(which is when they are green)
No, as long as no active workers exist; active workers run all the time (and bill) until you turn them off
when they are grey, you aren't paying for them
they are just sitting there ready to pick up
Yeah, lower cold boot times
Usually RunPod will add a couple extra grey workers, even if your max is 3, in case some of your workers get throttled (meaning the GPU is being used by someone else), but it will always respect your max number and never exceed it
What are they for? Why do they cycle sometimes (the extra workers)?
Oh, for extra safety in case of throttling
Nvm then
Throttling used to be insanely bad: even with the extra workers, everything could get timed out. But I think they fixed it... (?) maybe a couple months back. It caused a pretty bad issue for a week
because almost all the availability was sucked up by a bigger player
but flash fixed it... or at least made it a lot better
https://discord.com/channels/912829806415085598/1210229810668765234
Yep yep, I felt that before
Unknown channel to me
Might just be Discord lagging, but it's just the #📌|roadmap channel
thread called: serverless allocation optimization
if curious
I got the template to work, FYI. You can't change a Pod template to a serverless template once it's created; that was the issue