RunPod4w ago
Blake

How to cache model download from HuggingFace - Tips?

Using Serverless (48GB Pro) with Flashboot. I want to optimize for fast cold starts; is there a guide somewhere? It doesn't seem to be caching the download: it always re-downloads the model entirely (and slowly). Should I SSH into some persistent storage and download the model there, then reference that local path in the HF model load?
9 Replies
nerdylive
nerdylive4w ago
Flashboot isn't free storage like an SSD. Use network storage instead: it's mounted at /runpod-volume in serverless, or at /workspace in pods.
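One common way to use that mount (a sketch, assuming the volume really is mounted at /runpod-volume as above) is to point Hugging Face's cache at the network volume via environment variables before any HF library is imported:

```python
import os
from pathlib import Path

# Assumption: the network volume is mounted at /runpod-volume
# (the serverless default mentioned above); adjust if yours differs.
HF_CACHE = Path("/runpod-volume") / ".cache" / "huggingface"

# These must be set BEFORE importing transformers/huggingface_hub,
# since those libraries read the cache location at import time.
os.environ["HF_HOME"] = str(HF_CACHE)
os.environ["HF_HUB_CACHE"] = str(HF_CACHE / "hub")

# Downloads now land on the persistent volume, e.g.:
# from transformers import AutoModel
# model = AutoModel.from_pretrained("LanguageBind/Video-LLaVA-7B")
```

With this in place, subsequent workers attached to the same volume should find the weights already cached instead of re-downloading.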
Blake
BlakeOP4w ago
@nerdylive would you recommend doing this (pic)? (It seems all workers in my endpoint will pull from this same /runpod-volume.) Btw: perhaps /runpod-volume is only available/mounted when using a RunPod Docker base image? E.g. a plain Ubuntu image doesn't seem to have it mounted (pic).
Blake
BlakeOP4w ago
Also: it seems like when you change GPU type, the /runpod-volume is deleted or no longer accessible. Is this correct?
nerdylive
nerdylive4w ago
No, it's mounted when you run the worker on RunPod's servers or system. And no: if you attach network storage it'll be persistent, as long as you keep it and keep it attached to the endpoint you use.
Blake
BlakeOP4w ago
Okay, thanks. Do you recommend creating a new network volume and persisting HF weights in that? Perhaps that's more stable/clear for me to follow than using the default /runpod-volume (which I assume is attached by default?), which seems to be giving me unexpected behaviour.

I seem to be triggering new HF downloads even when this image has already run, downloaded, and persisted the weights to /runpod-volume/.cache/huggingface/hub/.. in previous runs:

Downloading shards: 0%| | 0/3 [00:00<?, ?it/s]Request 48c80db3-d744-4f39-8af2-929133a77895: HEAD https://huggingface.co/LanguageBind/Video-LLaVA-7B

If you happen to know / have a code example that shows a reliable way to persist HF weights in the most straightforward way, let me know!
nerdylive
nerdylive4w ago
Just write and read to that path. You can imagine it like a folder that is always there.
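A minimal way to sanity-check that the weights really landed on the volume (a sketch; `models--{org}--{name}` is the standard Hugging Face hub cache layout, and the default path below is the one from the messages above):

```python
from pathlib import Path

def model_is_cached(repo_id: str,
                    cache_dir: str = "/runpod-volume/.cache/huggingface/hub") -> bool:
    """Return True if the HF hub cache under cache_dir already holds repo_id."""
    # The hub cache stores each repo as models--{org}--{name}/snapshots/<revision>/...
    snapshots = Path(cache_dir) / ("models--" + repo_id.replace("/", "--")) / "snapshots"
    return snapshots.is_dir() and any(snapshots.iterdir())

# e.g. at worker startup, before loading:
# if not model_is_cached("LanguageBind/Video-LLaVA-7B"):
#     print("cache miss: the volume is not mounted or the path is wrong")
```

Logging this at startup makes it obvious whether a re-download is happening because the volume isn't mounted, or because the load call is using a different cache path.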
Blake
BlakeOP3w ago
When writing to /runpod-volume I'm still seeing the container do full model downloads after I kill the worker. So I:
- created a new network storage (/modelstorage) and am reading/writing to this
- attached this volume to my endpoint (didn't deploy the volume)
But when I kill the worker it still re-downloads from HF??
Blake
BlakeOP3w ago
What am I missing!? Any code examples of ensuring it loads from the network volume and NOT HF?
nerdylive
nerdylive3w ago
No, it acts as a drive; it doesn't "re-download" from the network volume, you just use the model from /runpod-volume. Maybe your path/method is wrong. You need to cache your model there somehow: snapshot the model, or set the model path there.
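Concretely, once the weights are on the volume, one way to make sure the worker never goes back to the Hub (a sketch; assumes a previous run already populated the cache under /runpod-volume) is to force offline mode:

```python
import os

# Assumption: a previous run already wrote the weights to the network
# volume. Offline mode stops transformers/huggingface_hub from issuing
# the HEAD request to huggingface.co seen in the logs above.
os.environ["HF_HUB_OFFLINE"] = "1"
os.environ["TRANSFORMERS_OFFLINE"] = "1"

# Then load strictly from the cached files, e.g.:
# from transformers import AutoModel
# model = AutoModel.from_pretrained(
#     "LanguageBind/Video-LLaVA-7B",
#     cache_dir="/runpod-volume/.cache/huggingface/hub",
#     local_files_only=True,  # fail fast if the cache is missing, instead of re-downloading
# )
```

With `local_files_only=True` a cache miss raises an error immediately, which makes a wrong mount or path obvious instead of silently triggering a slow re-download.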