RunPod · 6d ago
Lattus

Serverless deepseek-ai/DeepSeek-R1 setup?

How can I configure a serverless endpoint for deepseek-ai/DeepSeek-R1?
18 Replies
nerdylive · 6d ago
Does vLLM support that model? If not, you can build your own worker that runs inference for that model.
Lattus (OP) · 6d ago
Basic config, 2 GPU count:
[two image attachments, not preserved]
Lattus (OP) · 6d ago
Once it is running, I try the default hello-world request and it just sits IN_QUEUE for 8 minutes.
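A job that sits in IN_QUEUE can be watched from a script instead of the console. The sketch below is a minimal example, assuming the RunPod serverless REST layout `https://api.runpod.ai/v2/<endpoint_id>/status/<job_id>` with a Bearer API key (neither is stated in this thread, so check both against the current RunPod docs). `fetch_status` and `wait_for_job` are hypothetical helper names.

```python
import json
import time
import urllib.request

# Assumed base URL for RunPod serverless endpoints; verify against the docs.
API_BASE = "https://api.runpod.ai/v2"


def fetch_status(endpoint_id, job_id, api_key):
    """GET the job status and return the decoded JSON body as a dict."""
    req = urllib.request.Request(
        f"{API_BASE}/{endpoint_id}/status/{job_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def wait_for_job(get_status, max_wait_s=600.0, poll_s=5.0, sleep=time.sleep):
    """Poll get_status() until the job leaves IN_QUEUE/IN_PROGRESS.

    Returns the final status string, or a TIMED_OUT_WHILE_<status> marker
    if the job is still pending after max_wait_s seconds of polling.
    """
    waited = 0.0
    while True:
        status = get_status().get("status", "UNKNOWN")
        if status not in ("IN_QUEUE", "IN_PROGRESS"):
            return status
        if waited >= max_wait_s:
            return f"TIMED_OUT_WHILE_{status}"
        sleep(poll_s)
        waited += poll_s
```

A call would look like `wait_for_job(lambda: fetch_status("ENDPOINT_ID", "JOB_ID", "API_KEY"))`. A timeout while still IN_QUEUE usually means no worker ever became ready, which is the cue to read the worker logs.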
nerdylive · 6d ago
Can you check the logs? Maybe it's still downloading, or it hit OOM. Wait, how big is the model? R1 is a really huge model, isn't it?
Lattus (OP) · 6d ago
Yes, but I even tried just following along with the YouTube tutorial here and got the same IN_QUEUE problem: https://youtu.be/0XXKK82LwWk?si=ZDCu_YV39Eb5Fn8A
[YouTube, RunPod: "Set Up A Serverless LLM Endpoint Using vLLM In Six Minutes on RunPod"]
nerdylive · 6d ago
Any logs, in your workers or in the endpoint?
Lattus (OP) · 6d ago
Oh, wait!! I just ran the 1.5B model and got this response:
[image attachment, not preserved]
Lattus (OP) · 6d ago
When I tried running the larger model, I got errors about not enough memory: "Uncaught exception | <class 'torch.OutOfMemoryError'>; CUDA out of memory. Tried to allocate 3.50 GiB. GPU 0 has a total capacity of 44.45 GiB of which 1.42 GiB is free"
nerdylive · 6d ago
Yeah, seems like you hit OOM.
Lattus (OP) · 6d ago
So how do I configure it?
nerdylive · 6d ago
R1 is such a huge model; it seems like you'd need 1 TB+ of VRAM. I don't know how to calculate it exactly, but my estimate is at least somewhere in the range of 700 GB+ of VRAM.
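The two figures in that estimate can be reproduced with simple arithmetic: weight memory is roughly parameter count times bytes per parameter. The sketch below assumes DeepSeek-R1's published total of 671B parameters (from the model card, not from this thread), uses the 44.45 GiB per-GPU capacity from the OOM log above, and applies a hypothetical 1.2x overhead factor for KV cache and activations. `weights_gib` and `gpus_needed` are illustrative names, and the result is a rough lower bound, not a sizing guarantee.

```python
import math

GIB = 1024 ** 3  # bytes per GiB


def weights_gib(n_params, bytes_per_param):
    """Memory needed for the weights alone, in GiB."""
    return n_params * bytes_per_param / GIB


def gpus_needed(n_params, bytes_per_param, gpu_gib, overhead=1.2):
    """Rough GPU count: weights times an assumed overhead factor, rounded up.

    The overhead factor is a placeholder for KV cache and activations;
    real usage depends on context length and batch size.
    """
    total_gib = weights_gib(n_params, bytes_per_param) * overhead
    return math.ceil(total_gib / gpu_gib)


R1_PARAMS = 671e9   # assumption: total parameter count from the model card
GPU_GIB = 44.45     # per-GPU capacity reported in the OOM message

for label, bytes_per_param in (("FP8", 1), ("BF16", 2)):
    size = weights_gib(R1_PARAMS, bytes_per_param)
    count = gpus_needed(R1_PARAMS, bytes_per_param, GPU_GIB)
    print(f"{label}: ~{size:.0f} GiB of weights, at least {count} GPUs")
```

Under these assumptions, one byte per parameter lands near the "700 GB+" figure and two bytes per parameter lands above 1 TB, so a 2-GPU endpoint of this class is far too small either way.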
Lattus (OP) · 6d ago
Wow, so it's not really an option to deploy?
nerdylive · 6d ago
Not sure, it depends on your use case, haha.
Lattus (OP) · 6d ago
I mean, DeepSeek offers their own API. I thought it could be more cost-effective to just run a serverless endpoint here, but...
nerdylive · 6d ago
Only if you have enough volume, especially for bigger models, IMO.
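The "enough volume" point can be made concrete as a break-even calculation: self-hosting only wins once the tokens processed per hour exceed the hourly GPU cost divided by the hosted API's per-token price. The sketch below is illustrative only. Both prices are made-up placeholders, not real RunPod or DeepSeek rates, and `break_even_tokens_per_hour` is a hypothetical helper.

```python
def break_even_tokens_per_hour(gpu_cost_per_hour, api_price_per_million_tokens):
    """Tokens per hour at which self-hosting costs the same as the hosted API."""
    return gpu_cost_per_hour / api_price_per_million_tokens * 1_000_000


def self_hosting_is_cheaper(tokens_per_hour, gpu_cost_per_hour,
                            api_price_per_million_tokens):
    """True when the hourly volume exceeds the break-even point."""
    threshold = break_even_tokens_per_hour(
        gpu_cost_per_hour, api_price_per_million_tokens
    )
    return tokens_per_hour > threshold


# Placeholder numbers for illustration only; substitute real quotes.
GPU_COST_PER_HOUR = 40.0        # hypothetical cost of a multi-GPU worker
API_PRICE_PER_MILLION = 2.0     # hypothetical hosted-API price per 1M tokens

threshold = break_even_tokens_per_hour(GPU_COST_PER_HOUR, API_PRICE_PER_MILLION)
print(f"Break-even: {threshold:,.0f} tokens/hour")
```

This ignores cold starts, idle time, and whether the workers can actually sustain that throughput, all of which push the real break-even point higher.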
Lattus (OP) · 6d ago
Hmm, I see. Thanks for your help.
nerdylive · 6d ago
You're welcome, bro.