RunPod
•Created by blabbercrab on 7/7/2024 in #⚡|serverless
Trying to load a huge model into serverless
https://huggingface.co/cognitivecomputations/dolphin-2.9.2-qwen2-72b
Anyone have any idea how to do this in vLLM?
I've deployed it on two 80 GB GPUs and have had no luck
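For reference, a minimal sketch of what loading this model across two GPUs with vLLM's Python API can look like. The setup here is an assumption, not what the poster ran: in bf16 the 72B weights alone are roughly 144 GB, so two 80 GB cards only fit with tensor parallelism and a reduced context length.

```python
# Hedged sketch: shard dolphin-2.9.2-qwen2-72b across two 80 GB GPUs.
# bf16 weights are ~144 GB, so KV-cache headroom is tight; capping
# max_model_len and raising gpu_memory_utilization helps it fit.
from vllm import LLM, SamplingParams

llm = LLM(
    model="cognitivecomputations/dolphin-2.9.2-qwen2-72b",
    tensor_parallel_size=2,       # split weights across both GPUs
    dtype="bfloat16",
    gpu_memory_utilization=0.95,  # leave a little slack per GPU
    max_model_len=4096,           # shrink the KV cache so the model fits
)

outputs = llm.generate(["Hello!"], SamplingParams(max_tokens=64))
print(outputs[0].outputs[0].text)
```

If this still OOMs, an AWQ- or GPTQ-quantized variant of the model cuts the weight footprint roughly in half, which is the usual fallback at this size.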
16 replies
RunPod
•Created by blabbercrab on 7/5/2024 in #⚡|serverless
Serverless is timing out before full load
I have a serverless endpoint that loads a bunch of LoRAs on top of SDXL, and the first load takes a long time (more than 500 seconds).
This used to work fine until I added even more LoRAs; now it times out, shows "removing container", and restarts over and over.
Any tips to fix this?
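One common pattern for this failure mode is to do all heavy loading once at module import, so it runs at cold start rather than per request, and to bake the LoRA files into the container image or a network volume instead of downloading them at startup; raising the endpoint's execution timeout in the console is the other lever. A hedged sketch, assuming a diffusers + runpod worker; the LoRA names and paths are placeholders:

```python
# Hedged sketch of a RunPod serverless worker for SDXL + LoRAs.
# All heavy loading happens at import time, once per worker,
# not inside the per-request handler.
import torch
import runpod
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
).to("cuda")

# Placeholder LoRA files; bake these into the image or a network
# volume so they are not re-downloaded on every cold start.
# set_adapters requires the PEFT backend (pip install peft).
pipe.load_lora_weights("/loras", weight_name="style_a.safetensors", adapter_name="style_a")
pipe.load_lora_weights("/loras", weight_name="style_b.safetensors", adapter_name="style_b")
pipe.set_adapters(["style_a", "style_b"], adapter_weights=[0.8, 0.5])

def handler(job):
    # Only cheap per-request work lives here.
    prompt = job["input"]["prompt"]
    image = pipe(prompt, num_inference_steps=25).images[0]
    image.save("/tmp/out.png")
    return {"image_path": "/tmp/out.png"}

runpod.serverless.start({"handler": handler})
```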
39 replies