RunPod
Created by Xangelix on 4/11/2024 in #⚡|serverless
Questions on large LLM hosting
This didn't seem to lower it enough. Is this message a typo? Wouldn't I want to raise GPU_MEMORY_UTILIZATION if I'm getting OOM? https://discord.com/channels/912829806415085598/1211740161948524564/1212674202465869864
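For context: in vLLM, this setting is the fraction of total GPU memory the engine may claim for weights plus KV cache, so when the weights barely fit you would indeed raise it toward 1.0, not lower it. A minimal sketch of how it maps to vLLM's Python API; the model name is a placeholder:

```python
# Sketch of what GPU_MEMORY_UTILIZATION controls in vLLM. The value is
# the fraction of total GPU memory vLLM may claim for weights + KV cache;
# the default is 0.90, so raising it frees more headroom when the weights
# alone are near the card's capacity.
from vllm import LLM

llm = LLM(
    model="meta-llama/Llama-2-70b-hf",  # placeholder model id
    gpu_memory_utilization=0.95,        # raise toward 1.0 for large weights
)
```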
Could this be fixed with ENFORCE_EAGER=1?
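Assuming ENFORCE_EAGER=1 is forwarded to vLLM's enforce_eager flag, this is roughly what it toggles: eager mode skips CUDA graph capture, trading some decode speed for the extra GPU memory the captured graphs would otherwise hold.

```python
# Sketch, assuming ENFORCE_EAGER=1 maps to vLLM's enforce_eager flag.
# enforce_eager=True disables CUDA graph capture, which costs some decode
# throughput but avoids the extra memory the graphs would reserve.
from vllm import LLM

llm = LLM(
    model="meta-llama/Llama-2-70b-hf",  # placeholder model id
    enforce_eager=True,
)
```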
4. An extra one: how much VRAM does vLLM typically use outside of the weights? I'm testing a model now that only uses 38GB in weights, but I'm getting OOM on 48GB GPUs...
8 replies
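For a rough sense of where the extra VRAM goes: beyond the weights, vLLM preallocates KV cache up to the GPU_MEMORY_UTILIZATION budget and needs headroom for activation profiling and CUDA graphs. A back-of-envelope sketch; all model dimensions below are assumptions (Llama-2-70B-style GQA), and only the 38GB weights and 48GB card come from the question:

```python
# Back-of-envelope VRAM sketch under ASSUMED model dims
# (Llama-2-70B-style GQA: 80 layers, 8 KV heads, head_dim 128, fp16 cache).
GIB = 1024**3

num_layers, num_kv_heads, head_dim = 80, 8, 128
bytes_per_elem = 2  # fp16/bf16 KV cache
kv_bytes_per_token = 2 * num_layers * num_kv_heads * head_dim * bytes_per_elem
#                    ^ factor of 2 for K and V

total_vram  = 48 * GIB   # the 48GB card
utilization = 0.95       # GPU_MEMORY_UTILIZATION
weights     = 38 * GIB   # from the question
activations = 1 * GIB    # rough activation/profiling overhead (assumption)

budget  = total_vram * utilization
kv_room = budget - weights - activations
print(f"KV cache per token: {kv_bytes_per_token / 1024:.0f} KiB")
print(f"Room for KV cache:  {kv_room / GIB:.1f} GiB "
      f"(~{kv_room / kv_bytes_per_token:,.0f} tokens)")
```

With the default 0.90 utilization the budget drops to about 43 GiB, which helps explain how 38GB of weights plus KV cache, activation profiling, and CUDA graph overhead can OOM a 48GB card.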