•Created by bp on 12/12/2024 in #⚡|serverless
Using runpod serverless for HF 72b Qwen model --> seeking help
Hey all, I'm new to this and tried loading an HF Qwen 2.5 72B variant on RunPod serverless, and I'm having issues.
Requesting help from RunPod veterans, please!
Here's what I did:
Clicked into RunPod serverless
Pasted the HF link for the model: https://huggingface.co/EVA-UNIT-01/EVA-Qwen2.5-72B-v0.2
Chose A100 (80 GB) and 2 GPUs (choosing 1 GPU gave me an error message)
Added a MAX_MODEL_LENGTH setting of 20k tokens (I initially got an error message because I hadn't set this, and the model's 128k default context broke the deployment)
Clicked deploy
Clicked run ("hello world prompt")
It then started loading. It took about half an hour to download, went through all the checkpoints, and eventually just threw a bunch of error messages while the pod kept running. It ate up $10 of credits.
Log output was something like the attached.
It just kept running and eating credits, and wouldn't respond to any requests (they would always just sit in the queue), so I shut it down.
I tried Googling and searching YouTube for tutorials, but haven't found much.
Can anyone point me in the right direction to get this going, please?
Thanks!