RunPod · 3mo ago
Thibaud

vLLM doesn't seem to use the GPU

I'm using vLLM, and on the graph, when I launch some requests, only CPU usage increases. If I open a terminal and run nvidia-smi, I don't see any process either. Settings line: --model NousResearch/Meta-Llama-3-8B-Instruct --max-model-len 8192 --port 8000 --dtype half --enable-chunked-prefill true --max-num-batched-tokens 6144 --gpu-memory-utilization 0.97
(screenshot of the pod usage graph)
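For reference, a minimal sketch of that settings line as a full launch command, assuming the pod uses vLLM's stock OpenAI-compatible entrypoint rather than a custom wrapper:

```bash
# Sketch: launching vLLM's OpenAI-compatible server with the flags above
# (assumes the stock entrypoint; a pod template may wrap this differently).
python -m vllm.entrypoints.openai.api_server \
  --model NousResearch/Meta-Llama-3-8B-Instruct \
  --max-model-len 8192 \
  --port 8000 \
  --dtype half \
  --enable-chunked-prefill true \
  --max-num-batched-tokens 6144 \
  --gpu-memory-utilization 0.97
```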
11 Replies
nerdylive · 3mo ago
Try another pod? Also, select the right CUDA version.
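A quick, hedged way to confirm the container can actually see the GPU before blaming vLLM (assumes PyTorch is installed in the pod's image, which vLLM requires anyway):

```bash
# Should list the GPU and driver version; if this fails, the pod itself is the problem.
nvidia-smi

# If this prints False, vLLM cannot place the model on the GPU.
python3 -c "import torch; print(torch.cuda.is_available(), torch.version.cuda)"
```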
Thibaud · 3mo ago
I tried on 4 different pods. As for the CUDA version, I don't know where I can set it.
nerdylive · 3mo ago
Btw, any logs? It's not something you set; it's more like a filter when you create a pod.
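One way to correlate a request with GPU activity, assuming the standard OpenAI-compatible route on the port from the settings line:

```bash
# Terminal 1: refresh nvidia-smi every second while the request runs.
watch -n 1 nvidia-smi

# Terminal 2: fire a small test request at the vLLM server.
curl http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "NousResearch/Meta-Llama-3-8B-Instruct", "prompt": "Hello", "max_tokens": 16}'
```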
Thibaud · 3mo ago
I'm using a Pod, not Serverless. I don't see where I can filter by CUDA version in Pods.
nerdylive · 3mo ago
(screenshot of the CUDA version filter on the pod creation page)
nerdylive · 3mo ago
Easy, just try 12.5.
Thibaud · 3mo ago
thanks!
nerdylive · 3mo ago
Try 12.4 if not; I just checked.
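Note that the "CUDA Version" nvidia-smi reports is the maximum the host driver supports, not necessarily what the container's toolkit uses; a sketch for checking both:

```bash
# Host driver and GPU name (the driver sets the highest CUDA version supported).
nvidia-smi --query-gpu=name,driver_version --format=csv

# Toolkit version inside the container (only if nvcc is installed in the image).
nvcc --version
```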
Thibaud · 3mo ago
I used an A40, so 12.4. I'll try with an RTX 6000 on 12.5 to check if I see a difference.
Thibaud · 3mo ago
I don't understand why I don't see any processes here.
(screenshot of nvidia-smi showing no processes)
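A likely explanation, offered as an assumption: containers that don't share the host PID namespace usually show an empty process list in nvidia-smi even while the GPU is busy; the memory and utilization counters still move. A quick check:

```bash
# The process list is often hidden inside containers (PID namespace isolation),
# but memory and utilization still reflect the running workload; -l 1 loops every second.
nvidia-smi --query-gpu=memory.used,memory.total,utilization.gpu --format=csv -l 1
```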
nerdylive · 3mo ago
Huh, does that mean it's in maintenance?