RunPod•11mo ago
Ryan Witzman

Custom Template Taking Hours To Initialize

I made a custom template using the Docker image https://huggingface.co/spaces/rwitz/go-bruins-v2 and it is taking hours to initialize on serverless.
25 Replies
justin
justin•11mo ago
Wait, I am confused about what you are trying to do. Are you trying to initialize this on serverless? Or is this a Docker build issue?
justin
justin•11mo ago
I'm just confused because this by itself already has issues
(image attachment)
justin
justin•11mo ago
@Ryan Witzman (rwitz) What are your min/max workers? There should be workers. Also: 1) do you have the RunPod template that you used to start your serverless endpoint? 2) what is this? 3) do you have your Dockerfile?
Ryan Witzman
Ryan WitzmanOP•11mo ago
Min workers is 0 and max is 1. Yeah, it's on Hugging Face. Yes:
FROM runpod/pytorch:2.0.1-py3.10-cuda11.8.0-devel

RUN pip install runpod transformers

RUN pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

RUN python -c 'from transformers import pipeline; import torch; pipe = pipeline("text-generation", model="rwitz/go-bruins-v2",device=0,torch_dtype=torch.bfloat16)'

ADD handler.py .

CMD [ "python", "-u", "/handler.py" ]
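The Dockerfile above copies in a handler.py that isn't shown in the thread. As a rough sketch of the shape RunPod's serverless SDK expects (the payload keys and echo logic here are illustrative assumptions, not the actual handler):

```python
# handler.py (sketch) -- a minimal RunPod-style serverless handler.
# The job payload arrives as {"input": {...}}; the echo below is a
# placeholder for the real text-generation pipeline call.
def handler(job: dict) -> dict:
    prompt = job.get("input", {}).get("prompt", "")
    # In the real handler you would run the transformers pipeline here,
    # e.g. pipe(prompt), and return its output instead of echoing.
    return {"generated_text": f"echo: {prompt}"}

# On the worker, handler.py would then register it with the SDK:
# import runpod
# runpod.serverless.start({"handler": handler})
```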
justin
justin•11mo ago
Can I see your Dockerfile? And also... I'm not sure Hugging Face hosts Docker images? I could be wrong. For #1, there is something on RunPod you used to create it
justin
justin•11mo ago
(image attachment)
justin
justin•11mo ago
What I mean is this:
Ryan Witzman
Ryan WitzmanOP•11mo ago
oh
Ryan Witzman
Ryan WitzmanOP•11mo ago
(image attachment)
justin
justin•11mo ago
Ah interesting, I learnt something new. Okay, so just as an FYI: I recommend trying to build a template using:

FROM runpod/pytorch:2.0.1-py3.10-cuda11.8.0-devel

RUN pip install runpod transformers

RUN pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

RUN python -c 'from transformers import pipeline; import torch; pipe = pipeline("text-generation", model="rwitz/go-bruins-v2",device=0,torch_dtype=torch.bfloat16)'

ADD handler.py .

And don't include the CMD line; test it using a GPU Pod 🙂 it's always a great place to test.
2) https://huggingface.co/spaces/rwitz/go-bruins-v2 Your build failed, so I'm guessing that your serverless function is trying to find an image that doesn't exist
Ryan Witzman
Ryan WitzmanOP•11mo ago
Yeah, because the Hugging Face builder doesn't have CUDA. The serverless should have CUDA
justin
justin•11mo ago
Serverless does have CUDA, but if your Docker build failed, it won't produce an image for serverless to use. That means in your Docker build step, you may have to default to loading the model on CPU instead of torch CUDA so that the image builds successfully. Do you see the :latest tag somewhere, though? Just curious; I've never used Hugging Face myself to host an image registry. What is happening is, when you run the build command:
1. It sets up an isolated, repeatable environment.
2. If it fails to set up that isolated environment, it won't produce an "image" for other environments to start from.
So #2 is what is happening.
3. You've got to download the models without using torch.cuda etc.
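One build-safe variant (a sketch based on the Dockerfile earlier in the thread, not a tested fix) is to cache the weights on CPU during the build, since transformers' pipeline() treats device=-1 as CPU:

```dockerfile
FROM runpod/pytorch:2.0.1-py3.10-cuda11.8.0-devel

RUN pip install runpod transformers

RUN pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

# device=-1 keeps the model on CPU, so the build machine needs no GPU;
# the weights still land in the image's Hugging Face cache for runtime use.
RUN python -c 'from transformers import pipeline; pipe = pipeline("text-generation", model="rwitz/go-bruins-v2", device=-1)'

ADD handler.py .

CMD [ "python", "-u", "/handler.py" ]
```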
Ryan Witzman
Ryan WitzmanOP•11mo ago
Ok, so I will have the Docker build download the model instead of instantiating it. Yes
justin
justin•11mo ago
Can probably just use a curl request or something to fetch the models directly. Also, I've never used Hugging Face as a Docker registry before, but just make it public, I guess haha, if not already. I'm guessing there might be issues if it's a private repository, but I've never used it before; I usually just use Docker Hub. gl gl! 🙂
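On the "curl the models directly" idea: Hugging Face serves repo files from a predictable resolve URL, so the download link can be built by hand (the helper name and example filename below are illustrative):

```python
def hf_download_url(repo_id: str, filename: str, revision: str = "main") -> str:
    """Build the direct-download URL Hugging Face serves repo files from.

    The same URL works with curl/wget inside a Dockerfile RUN step.
    """
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{filename}"

# e.g. in a Dockerfile: RUN curl -L -O <this url>
url = hf_download_url("rwitz/go-bruins-v2", "config.json")
```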
Ryan Witzman
Ryan WitzmanOP•11mo ago
yes
justin
justin•11mo ago
I also recommend again:
FROM runpod/pytorch:2.0.1-py3.10-cuda11.8.0-devel

RUN pip install runpod transformers

RUN pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

RUN python -c 'from transformers import pipeline; import torch; pipe = pipeline("text-generation", model="rwitz/go-bruins-v2",device=0,torch_dtype=torch.bfloat16)'

ADD handler.py .
If you end up wanting to test it on a GPU Pod vs serverless, I always find that an easier validation step. Then in your serverless, you can do as you did, where you overrode the CMD command, or you can have a second Dockerfile that does:
# Base: the image you already built and tested on a GPU Pod
FROM your-gpu-pod-image:latest

ADD handler.py .

CMD [ "python", "-u", "/handler.py" ]
Which would make your future iterations of handler.py super fast, since all you've got now is a base image with all the models (no need to rebuild them every time), and all you have to do is add your new handler.py and such
Ryan Witzman
Ryan WitzmanOP•11mo ago
ah i see
justin
justin•11mo ago
Yup yup~ It's a new thing I learnt last week haha, because I have an audio sound effect project where I was playing with some big models, and it was getting painful to keep re-downloading the model on every iteration haha. So, small fun tip 😄
Ryan Witzman
Ryan WitzmanOP•11mo ago
@justin It says it's running, but when I look at the logs it is still downloading the image, and at a terribly slow speed
(image attachment)
justin
justin•11mo ago
Just as a side thing, try setting your max workers to three 🙂 you won't pay for additional workers unless they are actively being utilized. Second thing is I can't say anything about download speed, tbh. Sometimes I find that it is slow too, but I am not too sure what causes it. Other than one instance, it's usually been pretty fast for me, but I usually use Docker Hub. It also depends how big your model is. This doesn't seem too big of a model, though, so it should maybe take about 5-10 minutes
Ryan Witzman
Ryan WitzmanOP•11mo ago
It's 14GB
justin
justin•11mo ago
That's my experience, yeah
Ryan Witzman
Ryan WitzmanOP•11mo ago
I will set it to three, though
justin
justin•11mo ago
Yeah, sometimes I find that by setting it to three, you get a slightly better worker that downloads faster. I find it works a bit wonky with one
Ryan Witzman
Ryan WitzmanOP•11mo ago
Ah, that made it so much quicker. Thanks!