TheBloke/goliath-120b-GPTQ with RunPod Kobold AI United
Hi! I got goliath-120b-GPTQ running on 3x A40, but text generation is extremely slow. What is the best GPU config and settings to run this model?
Thank you in advance!
6 Replies
Does it only use one GPU?
An H100, I guess
No, it uses multiple GPUs
What GPUs are you using, btw?
3 x A40
Why does it only use 1 GPU for me? Is there some setting I missed?
I don't know, try checking the docs for your library
And the README, if they have one
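One quick way to tell whether the model really landed on a single GPU is to look at per-GPU memory while it is loaded. Below is a minimal sketch, assuming `nvidia-smi` is available on the pod; the helper names (`parse_gpu_memory`, `idle_gpus`, `report`) and the 1024 MiB "idle" threshold are made up for illustration and are not part of KoboldAI or RunPod.

```python
import subprocess

# nvidia-smi query that prints one CSV row per GPU: index, used MiB, total MiB.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_memory(csv_text):
    """Turn the CSV text into a list of (index, used_mib, total_mib) tuples."""
    rows = []
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue
        index, used, total = [field.strip() for field in line.split(",")]
        rows.append((int(index), int(used), int(total)))
    return rows


def idle_gpus(rows, threshold_mib=1024):
    """Return indices of GPUs holding less than threshold_mib (assumed cutoff)."""
    return [index for index, used, _total in rows if used < threshold_mib]


def report():
    """Run nvidia-smi and print which GPUs look unused (hypothetical helper)."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    rows = parse_gpu_memory(out)
    for index, used, total in rows:
        print(f"GPU {index}: {used} / {total} MiB used")
    print("GPUs that look idle:", idle_gpus(rows))
```

Call `report()` on the pod after the model has finished loading. If two of the three A40s show almost no memory in use, the loader probably was not told to split the model, and the multi-GPU setting for the backend is the first thing to look for in its docs.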
I mean, it's quite a huge model, no wonder it's slow
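For a rough sense of scale, here is a back-of-the-envelope sketch of the weight memory. It assumes about 120 billion parameters (taken from the model name), 4-bit GPTQ weights (other bit widths may exist for this repo), and a nominal 48 GB per A40 with only 90% of it usable; it ignores the KV cache and activations, so treat the numbers as a lower bound.

```python
import math


def weight_gib(n_params, bits_per_weight):
    """Approximate size of the quantized weights in GiB."""
    return n_params * bits_per_weight / 8 / 2**30


def min_gpus(weights_gib, per_gpu_gib=48, usable_fraction=0.9):
    """Fewest GPUs whose usable memory covers the weights (assumed headroom)."""
    return math.ceil(weights_gib / (per_gpu_gib * usable_fraction))


params = 120e9                    # assumption: ~120B parameters, from the name
weights = weight_gib(params, 4)   # assumption: 4-bit GPTQ quantization
print(f"weights: about {weights:.1f} GiB")              # roughly 56 GiB
print(f"A40s needed for weights alone: {min_gpus(weights)}")
```

So the weights alone should fit on 2 A40s, and 3 leaves room for context. If that holds, memory is not the bottleneck here. With a simple layer-by-layer split, the GPUs usually take turns rather than work in parallel, so adding cards gives more room but not necessarily more speed.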