RunPod•7mo ago
garnizzle

Incredibly long queue for CPU Compute on Toy Example

I am trying to run the code in this blog post: https://blog.runpod.io/serverless-create-a-basic-api/. The wait time for this simple function to execute has been above 5 minutes. There are two items in the queue that are not moving, and there are no errors in the logs. It appears stuck in the "Initializing" state with no workers spinning up. How can I fix this? Also, when I tried to create the endpoint, the UI would not let me select the template I created earlier.
RunPod Blog
Serverless | Create a Custom Basic API
RunPod's Serverless platform allows for the creation of API endpoints that automatically scale to meet demand. The tutorial guides you through creating a basic worker and turning it into an API endpoint on the RunPod serverless platform. For this tutorial, we will create an API endpoint that helps us accomplish
36 Replies
nerdylive
nerdylive•7mo ago
It's still initializing; maybe it's still pulling the image from the registry. Click one of the workers on your endpoint and there will be logs
justin
justin•7mo ago
Are you building on mac? Just wondering
nerdylive
nerdylive•7mo ago
Not allowing? Any errors?
justin
justin•7mo ago
Sometimes people build on mac, and that blog badly fails to mention that you need a --platform amd64 flag when building your image. RunPod, for some reason, still pulls that image, but nothing can run / it can get caught in a loop
nerdylive
nerdylive•7mo ago
Ahh, that causes an error on RunPod? Wow, even in a running state?
garnizzle
garnizzleOP•7mo ago
I am building on a mac
justin
justin•7mo ago
@Merrell / @Papa Madiator Is it possible to update that blog either way with a note / warning about the --platform flag? Whether or not it's the cause of this problem, it is something that comes up all the time. Yeah, just kill your current endpoint and rebuild with: --platform linux/amd64 https://docs.docker.com/build/building/multi-platform/#:~:text=The%20%2D%2Dplatform%20flag%20informs,the%20images%20to%20Docker%20Hub. Not sure if you need buildx or if it's just built into the base Docker tooling now, but you can build for cross platforms
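A minimal sketch of the build-and-push step being discussed (the image name `yourname/basic-api:latest` is a placeholder, not from the thread); on an Apple Silicon Mac you need the platform flag so RunPod gets a linux/amd64 image:

```shell
# Build for linux/amd64 even when the host is arm64 (Apple Silicon).
# Modern Docker ships buildx, so this works without extra setup.
docker buildx build --platform linux/amd64 -t yourname/basic-api:latest .

# Push the image so the serverless endpoint can pull it.
docker push yourname/basic-api:latest
```

Without `--platform linux/amd64`, a Mac builds an arm64 image that RunPod's workers can pull but cannot run, which matches the stuck "Initializing" symptom.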
garnizzle
garnizzleOP•7mo ago
There are no workers to click on
justin
justin•7mo ago
Ah, i mean, set ur max workers to 0 and delete it
garnizzle
garnizzleOP•7mo ago
It shows one that was throttled
justin
justin•7mo ago
And then when u make a new one in the future, set the max worker to 3 (runpod acts weird when the max workers is 1 i find), and then set it to that new dockerhub image with the right platform flag
garnizzle
garnizzleOP•7mo ago
justin
justin•7mo ago
Yup! Just set max workers to 0, delete the endpoint, rebuild your image, repush it, and then try again 🙂
garnizzle
garnizzleOP•7mo ago
ok will try that now. hold up
justin
justin•7mo ago
yup! In the future, just make sure that you always have that --platform flag. Personally, the workflow I use because of that (since I'm also on mac):
1) I spin up a GPU Pod with the pytorch template
2) I manually go through the installation steps on the GPU Pod and write them down
3) I translate that into a Dockerfile, build the image, and then test that on my GPU Pod as a new template pointed at my new docker build.
If it's a serverless thing, you can just remove the runpod.serverless.start() call on the GPU Pod when you open the file, and manually call the handler function for testing. A more advanced example repo that I built using the above methodology, for future reference: https://github.com/justinwlin/Runpod-OpenLLM-Pod-and-Serverless
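The "call the handler directly" trick above can be sketched like this. The handler body is a toy stand-in modeled on the blog's greeting example, not the exact blog code; the point is that you invoke it with a fake job dict instead of starting the serverless loop:

```python
# Sketch: test a RunPod-style serverless handler locally by calling it
# directly, skipping runpod.serverless.start() entirely.

def handler(job):
    # RunPod delivers requests as a dict with an "input" key.
    name = job["input"].get("name", "World")
    return {"greeting": f"Hello, {name}!"}

if __name__ == "__main__":
    # Simulate the payload RunPod would hand to the worker.
    fake_job = {"id": "local-test", "input": {"name": "garnizzle"}}
    print(handler(fake_job))
```

On the real worker you would restore `runpod.serverless.start({"handler": handler})`; locally, the plain function call gives you a fast feedback loop with no endpoint involved.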
garnizzle
garnizzleOP•7mo ago
adding the --platform linux/amd64 flag made the docker build spin endlessly... ah nevermind, it just succeeded, but it was slow AF
justin
justin•7mo ago
xD. the cold start can be a pain in the butt, but after the first download of the docker image it should be cached on runpod's end. maybe try sending a few more requests and see what happens. also, after the first request is done, the worker has a bit of idle time (which you can adjust) before it shuts down, so it can keep processing requests. so it can be a bit of a balancing act: the first request gets hit with the cold start; for all subsequent requests the machine is already on and can handle them fast
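A sketch of sending a few test requests to warm the endpoint, assuming RunPod's synchronous `/runsync` route (the endpoint ID and API key are placeholders):

```shell
# Replace ENDPOINT_ID and export RUNPOD_API_KEY before running.
curl -s -X POST "https://api.runpod.ai/v2/ENDPOINT_ID/runsync" \
  -H "Authorization: Bearer $RUNPOD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"input": {"name": "garnizzle"}}'
```

The first call eats the cold start; repeated calls within the idle-timeout window should return much faster because the worker stays warm.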
garnizzle
garnizzleOP•7mo ago
this instantly disappears as soon as I finish typing. This is on the endpoints page.
nerdylive
nerdylive•7mo ago
That's unusual, did you add your template yet? Maybe refreshing the page will work
justin
justin•7mo ago
I think that is a dropdown? Not sure why it would disappear though. Maybe make a template separately first, then use the dropdown button instead? Might be a UI bug
garnizzle
garnizzleOP•7mo ago
I did that and it didn't work. The endpoint does work now though after setting max workers to 3
justin
justin•7mo ago
hm, weird. feel free to record it and file a bug if needed.
garnizzle
garnizzleOP•7mo ago
I see the template available for a pod but not for the serverless endpoint
nerdylive
nerdylive•7mo ago
Ohh maybe you created it for pods?
garnizzle
garnizzleOP•7mo ago
When I tried to edit it, it wouldn't let me change it. I'll try again. Btw, with the serverless endpoints, if I set the max workers to 3 but the minimum to 0, will I pay while the workers are idle?
nerdylive
nerdylive•7mo ago
Hmm, maybe just create a new one for serverless. Minimum? You mean active workers?
justin
justin•7mo ago
max to 3, minimum to 0; you don't pay anything unless they go active and are working (which is when they show green)
nerdylive
nerdylive•7mo ago
No, as long as active workers don't exist, because active workers run all the time until you turn them off
justin
justin•7mo ago
when they are grey, you aren't paying for them; they are just sitting there ready to pick up work
nerdylive
nerdylive•7mo ago
Yeah, for lower cold boot times
justin
justin•7mo ago
Usually, runpod will add a couple more grey workers, even if your max is 3, in case some of the workers get throttled (meaning the GPU is being used by someone else), but it will always respect your max number and won't ever exceed it
nerdylive
nerdylive•7mo ago
What are they for? Why are they cycled sometimes (the extra workers)? Oh, for extra safety in case of throttling. Nvm then
justin
justin•7mo ago
Throttling used to be insanely worse, where even with the extra workers, all of them could get timed out. But I think they fixed it... (?) maybe a couple months back. It caused a pretty bad issue for a week because almost all the availability was sucked up by a bigger player, but flash fixed it... or at least made it a lot better https://discord.com/channels/912829806415085598/1210229810668765234
nerdylive
nerdylive•7mo ago
Yep yep, I've felt that before. That channel is unknown to me
justin
justin•7mo ago
Might just be discord lagging, but it's the #📌|roadmap channel thread called "serverless allocation optimization", if curious
garnizzle
garnizzleOP•7mo ago
I got the template to work, FYI. You can't change a pod template to a serverless template once it's created; that was the issue