RunPod
Created by Xangelix on 4/11/2024 in #⚡|serverless
Questions on large LLM hosting
This didn't seem to lower it enough. Is this message a typo? Wouldn't I want to raise GPU_MEMORY_UTILIZATION if I'm getting OOM? https://discord.com/channels/912829806415085598/1211740161948524564/1212674202465869864
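For context: in vLLM, this setting is the fraction of total GPU memory the engine may claim for weights plus KV cache, so when the weights barely fit you would indeed raise it toward 1.0, not lower it. A minimal sketch of how it maps to vLLM's Python API; the model name is a placeholder:

```python
# Sketch of what GPU_MEMORY_UTILIZATION controls in vLLM. The value is
# the fraction of total GPU memory vLLM may claim for weights + KV cache;
# the default is 0.90, so raising it frees more headroom when the weights
# alone are near the card's capacity.
from vllm import LLM

llm = LLM(
    model="meta-llama/Llama-2-70b-hf",  # placeholder model id
    gpu_memory_utilization=0.95,        # raise toward 1.0 for large weights
)
```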
Could this be fixed with ENFORCE_EAGER=1?
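Assuming ENFORCE_EAGER=1 is forwarded to vLLM's enforce_eager flag, this is roughly what it toggles: eager mode skips CUDA graph capture, trading some decode speed for the extra GPU memory the captured graphs would otherwise hold.

```python
# Sketch, assuming ENFORCE_EAGER=1 maps to vLLM's enforce_eager flag.
# enforce_eager=True disables CUDA graph capture, which costs some decode
# throughput but avoids the extra memory the graphs would reserve.
from vllm import LLM

llm = LLM(
    model="meta-llama/Llama-2-70b-hf",  # placeholder model id
    enforce_eager=True,
)
```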
4. An extra one: how much VRAM does vLLM typically use outside of the weights? I'm testing a model now that only uses 38GB in weights, but I'm getting OOM on 48GB GPUs...
8 replies
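For a rough sense of where the extra VRAM goes: beyond the weights, vLLM preallocates KV cache up to the GPU_MEMORY_UTILIZATION budget and needs headroom for activation profiling and CUDA graphs. A back-of-envelope sketch; all model dimensions below are assumptions (Llama-2-70B-style GQA), and only the 38GB weights and 48GB card come from the question:

```python
# Back-of-envelope VRAM sketch under ASSUMED model dims
# (Llama-2-70B-style GQA: 80 layers, 8 KV heads, head_dim 128, fp16 cache).
GIB = 1024**3

num_layers, num_kv_heads, head_dim = 80, 8, 128
bytes_per_elem = 2  # fp16/bf16 KV cache
kv_bytes_per_token = 2 * num_layers * num_kv_heads * head_dim * bytes_per_elem
#                    ^ factor of 2 for K and V

total_vram  = 48 * GIB   # the 48GB card
utilization = 0.95       # GPU_MEMORY_UTILIZATION
weights     = 38 * GIB   # from the question
activations = 1 * GIB    # rough activation/profiling overhead (assumption)

budget  = total_vram * utilization
kv_room = budget - weights - activations
print(f"KV cache per token: {kv_bytes_per_token / 1024:.0f} KiB")
print(f"Room for KV cache:  {kv_room / GIB:.1f} GiB "
      f"(~{kv_room / kv_bytes_per_token:,.0f} tokens)")
```

With the default 0.90 utilization the budget drops to about 43 GiB, which helps explain how 38GB of weights plus KV cache, activation profiling, and CUDA graph overhead can OOM a 48GB card.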