IN-QUEUE Indefinitely
I am attempting to deploy a model from HF Spaces on RunPod Serverless, using the ByteDance/SDXL-Lightning Docker image. I started by selecting 'Run with Docker' for the ByteDance/SDXL-Lightning Space on HF and copied the Docker image tag: registry.hf.space/bytedance-sdxl-lightning:latest.
Next, in RunPod, I set up a serverless template by entering the Docker image tag into the 'Container Image' field and inputting 'bash -c "python app.py"' as the container start command. I allocated 50 GB of disk space to the container and finalized the template. Subsequently, I used this template to create an API endpoint in the 'Serverless' section. However, whenever I try to run the model, my requests remain indefinitely in the 'in-queue' state. Could you help identify what I might be doing wrong?
Do you have access to any of the Worker logs to see what's going on?
can't open worker logs
What happens when you click this, then Logs?
When I click on it, it tries launching logs for worker but then fails to do so. Basically shows nothing
@singhtanmay345 are you able to see any workers running? Should look like that green box in the top left of Patrick's screenshot above. You may be getting unlucky and trying to send requests when all of our GPUs are already in use
No, that isn't the issue
Is my guess
Because his workers look idle
I think the bigger issue is the expectation that you can just run an HF model on RunPod directly
It doesn't sound like you have a proper handler.py file set up
And a Hugging Face Docker container probably doesn't have the runpod package installed
You can read this for a basic understanding of how to write a Python file for RunPod serverless
RunPod Blog
Serverless | Create a Custom Basic API
RunPod's Serverless platform allows for the creation of API endpoints that automatically scale to meet demand. The tutorial guides you through creating a basic worker and turning it into an API endpoint on the RunPod serverless platform. For this tutorial, we will create an API endpoint that helps us accomplish
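To give a rough idea of what that article covers, a minimal RunPod handler looks something like this (a sketch, not the article's exact code; the "prompt" field is just an illustrative input name):

```python
import runpod  # the RunPod SDK the image needs installed (pip install runpod)

def handler(job):
    # Serverless requests arrive as a dict; the request payload sits under "input".
    job_input = job["input"]
    prompt = job_input.get("prompt", "a photo of a cat")
    # ... run your model here and return something JSON-serializable ...
    return {"echoed_prompt": prompt}

# Start the worker loop that pulls jobs from the endpoint's queue.
runpod.serverless.start({"handler": handler})
```

Without something like this running in the container, nothing ever picks the jobs up, which is why they sit in the queue.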
@Merrell is it possible to update this blog to say the image must be built for amd64? It's actually a huge issue people run into
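(For anyone who hits that: on an arm64 machine you would typically build the image explicitly for amd64, something like the command below; the image name is a placeholder.)

```bash
# Build for linux/amd64 even on an Apple Silicon / arm64 host, then push.
docker buildx build --platform linux/amd64 -t <your-user>/sdxl-lightning-worker:dev --push .
```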
Facing the same issue
What issue? Are all your workers throttled? What do the worker logs say etc? You need to provide more detail.
You say the same issue, but the issue I pointed out was that you can't just point to any random Docker image lol. You've got to prepare it to work the way RunPod expects, which is why I linked the article
Not sure if you're saying you made the same mistake? Then read the article I linked
@singhtanmay345 were you able to run the model successfully? I also want to do the same; apparently you need to set up a worker on RunPod somehow, which I'm not sure how to do.
I reread this thread, and yes Justin is correct.
You can't just grab any random docker image, it needs to pass through the RunPod Handler to send in the correct requests.
@inventionsbyhamid We don't have SDXL Lightning as a dedicated worker, but we do have a tutorial on running SDXL Turbo here: https://docs.runpod.io/tutorials/serverless/gpu/generate-sdxl-turbo#deploy-a-serverless-endpoint
You don't need to build your Docker image for this, as we have prebuilt templates for SDXL turbo.
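Once an endpoint is live, you can sanity-check it from Python. This is a sketch against RunPod's /runsync endpoint, with the endpoint ID, API key, and input fields as placeholders (the exact input shape depends on the worker you deploy):

```python
import requests

ENDPOINT_ID = "your-endpoint-id"   # from the Serverless endpoint page
API_KEY = "your-runpod-api-key"

resp = requests.post(
    f"https://api.runpod.ai/v2/{ENDPOINT_ID}/runsync",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"input": {"prompt": "a cinematic photo of a lighthouse at dusk"}},
    timeout=120,
)
print(resp.json())  # status stays IN_QUEUE if no worker ever picks the job up
```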
Managed to do it with help from a friend. I wanted to deploy SDXL Lightning (ByteDance); it's the most-run image-gen model on Replicate. You guys should add it.
Do you have a repo I can check out? I would be interested in seeing what we can add to help support this.
Let me ask internally if we can open source it
Maybe this is because of improper handler code: no worker took the job, so it stays in the queue...
I think it's easy enough to write your own serverless code if it's only for SDXL Lightning. Try it out.
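If anyone wants a starting point, here is a rough, untested sketch that wires the SDXL-Lightning loading code from the ByteDance model card into a RunPod handler (model and checkpoint names follow the model card; the base64 response format is just one reasonable choice):

```python
import base64, io

import runpod
import torch
from diffusers import EulerDiscreteScheduler, StableDiffusionXLPipeline, UNet2DConditionModel
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file

BASE = "stabilityai/stable-diffusion-xl-base-1.0"
REPO = "ByteDance/SDXL-Lightning"
CKPT = "sdxl_lightning_4step_unet.safetensors"  # 4-step UNet checkpoint

# Load the pipeline once at container start so every job reuses it.
unet = UNet2DConditionModel.from_config(BASE, subfolder="unet").to("cuda", torch.float16)
unet.load_state_dict(load_file(hf_hub_download(REPO, CKPT), device="cuda"))
pipe = StableDiffusionXLPipeline.from_pretrained(
    BASE, unet=unet, torch_dtype=torch.float16, variant="fp16"
).to("cuda")
# SDXL-Lightning expects trailing timestep spacing.
pipe.scheduler = EulerDiscreteScheduler.from_config(pipe.scheduler.config, timestep_spacing="trailing")


def handler(job):
    prompt = job["input"].get("prompt", "a photo of a cat")
    # Lightning checkpoints run with very few steps and no CFG.
    image = pipe(prompt, num_inference_steps=4, guidance_scale=0).images[0]
    buf = io.BytesIO()
    image.save(buf, format="PNG")
    return {"image_base64": base64.b64encode(buf.getvalue()).decode()}


runpod.serverless.start({"handler": handler})
```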