TheBloke/goliath-120b-GPTQ with RunPod Kobold AI United

Hi! I got goliath-120b-GPTQ running with 3 A40s, but text generation is extremely slow. What is the best GPU config and settings to run this model? Thank you in advance!
6 Replies
🆁🅰🅻🅻🅴
It only uses one GPU?
nerdylive, 3mo ago
H100, I guess. No, it uses multiple GPUs. What GPU are you using, btw?
🆁🅰🅻🅻🅴
3 x A40. Why does it only use 1 GPU for me? Is there some setting I missed?
nerdylive, 3mo ago
I don't know. Try checking the docs of your library, and the README if there is one. I mean, it's quite a huge model, so it's no wonder it's slow.
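For a rough sense of scale, here is a back-of-the-envelope sketch of how much VRAM the quantized weights alone might need. The numbers are assumptions, not taken from the thread: about 118 billion parameters for goliath-120b, 4 bits per weight for the GPTQ build, 48 GB per A40, and a guessed 90% usable fraction per card. The helper names `weight_gib` and `min_gpus` are made up for illustration. The estimate ignores the KV cache, activations, and quantization metadata, so real usage would be higher.

```python
import math

def weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GiB (weights only)."""
    return n_params * bits_per_weight / 8 / 2**30

def min_gpus(weights_gib: float, vram_gib: float, usable_fraction: float = 0.9) -> int:
    """Smallest number of cards whose usable VRAM covers the weights."""
    return math.ceil(weights_gib / (vram_gib * usable_fraction))

N_PARAMS = 118e9        # assumption: approximate parameter count
BITS_PER_WEIGHT = 4.0   # assumption: 4-bit GPTQ quantization
A40_VRAM_GIB = 48.0     # assumption: treating the A40's 48 GB as 48 GiB

w = weight_gib(N_PARAMS, BITS_PER_WEIGHT)
print(f"weights: ~{w:.1f} GiB, minimum A40s: {min_gpus(w, A40_VRAM_GIB)}")
```

Under these assumptions the weights alone would not fit on a single A40, so a model that loads successfully is probably already split across more than one card, which would be consistent with the reply above that it uses multiple GPUs.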