blabbercrab
RunPod
Created by blabbercrab on 7/7/2024 in #⚡|serverless
Trying to load a huge model into serverless
https://huggingface.co/cognitivecomputations/dolphin-2.9.2-qwen2-72b Does anyone have any idea how to do this in vLLM? I've deployed it using two 80GB GPUs and have had no luck (see the sketch after this post).
16 replies
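A 72B model in bf16 needs roughly 144 GB for the weights alone, more than one 80GB card can hold, so vLLM has to shard it across both GPUs with tensor parallelism. A minimal sketch, assuming vLLM's offline Python API; the dtype, memory-utilization value, and sampling settings are assumptions, not the poster's config:

```python
from vllm import LLM, SamplingParams

# Shard the 72B model across both 80GB GPUs. In bf16 the weights alone
# are ~144 GB, so a single GPU can never hold them.
llm = LLM(
    model="cognitivecomputations/dolphin-2.9.2-qwen2-72b",
    tensor_parallel_size=2,       # split weights across the two GPUs
    dtype="bfloat16",
    gpu_memory_utilization=0.95,  # leave headroom for the CUDA context
)

params = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.generate(["Hello, who are you?"], params)
print(outputs[0].outputs[0].text)
```

Even with tensor_parallel_size=2, the KV cache competes with ~72 GB of weights per card, so a quantized variant (e.g. AWQ or GPTQ) is another option if memory still runs out.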
RunPod
Created by blabbercrab on 7/5/2024 in #⚡|serverless
Serverless is timing out before full load
I have a serverless endpoint that loads a bunch of LoRAs on top of SDXL, and the first load takes a long time (more than 500 seconds). This used to work well, but after I added even more LoRAs it now times out, logs "removing container", and restarts again and again. Any tips to fix this? (See the sketch after this post.)
39 replies
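A common mitigation is to do all the heavy loading once at module import (the cold start) from local disk or a network volume, so nothing is re-downloaded per request and the handler itself stays fast. A minimal sketch, assuming the RunPod Python SDK and diffusers; the LORA_DIR path, the LoRA filenames, and the job input fields are hypothetical:

```python
import os

import runpod
import torch
from diffusers import StableDiffusionXLPipeline

LORA_DIR = "/runpod-volume/loras"  # hypothetical network-volume path

# Load the base pipeline and every LoRA once, at worker start, not per job.
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
).to("cuda")

# Register each LoRA under an adapter name (requires peft installed)
# so individual requests can select one without reloading anything.
for fname in os.listdir(LORA_DIR):
    name = os.path.splitext(fname)[0]
    pipe.load_lora_weights(LORA_DIR, weight_name=fname, adapter_name=name)

def handler(job):
    inp = job["input"]
    pipe.set_adapters([inp["lora"]])      # pick one pre-loaded adapter
    image = pipe(inp["prompt"]).images[0]
    out = "/tmp/out.png"
    image.save(out)
    return {"image_path": out}

runpod.serverless.start({"handler": handler})
```

Baking the SDXL weights and LoRAs into the Docker image (or a network volume) instead of downloading them on boot is usually what brings a 500-second cold start down; if the load time is irreducible, raising the endpoint's idle and execution timeouts in the RunPod console is the other lever.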
RunPod
Created by blabbercrab on 5/10/2024 in #⛅|pods
No GPU in community pods
No description
5 replies