RunPod • 2mo ago
Liringlas

Issue with KoboldCPP - official template

I tried two models (103B Midnight Miqu v1.0 and 123B Behemoth v1.1) in Q4 GGUF on a pod with the https://www.runpod.io/console/explore/2peen7lpau template. In both cases the models download successfully (2 files each), but when launching KoboldCPP I get the following error: "Something possibly went wrong, stalling for 3 minutes before exiting so you can check for errors." The full logs are included. The pod had 2x A40 48GB GPUs with the default 125GB temporary container disk, and the default environment variables except for the model address. The default KCPP args should allow 2 GPUs if I understand correctly: --usecublas mmq --gpulayers 999 --contextsize 4096 --multiuser 20 --flashattention --ignoremissing Thanks a lot!
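For reference, those default args correspond to a koboldcpp invocation along these lines (a sketch only; the model filename is a placeholder, not the actual download path):

```shell
# Sketch of a launch using the template's default KCPP args.
# "model.gguf" is a placeholder filename.
python koboldcpp.py \
  --usecublas mmq \
  --gpulayers 999 \
  --contextsize 4096 \
  --multiuser 20 \
  --flashattention \
  --ignoremissing \
  model.gguf
```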
11 Replies
Liringlas
Liringlas (OP) • 2mo ago
https://discord.com/channels/912829806415085598/1118945694863065230/1300828220907851806 There was a similar post but with no reply, except for the guide, which I followed in the first place 🙂
nerdylive
nerdylive • 2mo ago
@Henky!! relevant to your template?
Liringlas
LiringlasOPβ€’2mo ago
Thank you. I tried again following instructions I found on another Discord. The KCPP_MODEL env variable is written a bit differently: https://huggingface.co/bartowski/Behemoth-123B-v1.1-GGUF/resolve/main/Behemoth-123B-v1.1-Q4_K_M/Behemoth-123B-v1.1-Q4_K_M-00001-of-00002.gguf?download=true,https://huggingface.co/bartowski/Behemoth-123B-v1.1-GGUF/resolve/main/Behemoth-123B-v1.1-Q4_K_M/Behemoth-123B-v1.1-Q4_K_M-00002-of-00002.gguf?download=true — with no space after the comma and "?download=true" at the end of both links, which I did not use the first time. This time it worked 🙂 Not sure what the issue was the first time. Was it the formatting of the variable?
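In other words, multi-part models appear to go into a single comma-separated value with no spaces and `?download=true` on each link. A minimal sketch (the `KCPP_MODEL` variable name is taken from this thread, and the `.../` paths are placeholders):

```shell
# One value, comma-separated, no space after the comma,
# "?download=true" appended to every part URL.
# The ".../" segments are placeholders for the real repo paths.
export KCPP_MODEL="https://huggingface.co/.../model-00001-of-00002.gguf?download=true,https://huggingface.co/.../model-00002-of-00002.gguf?download=true"
```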
Henky!!
Henky!! • 2mo ago
Ah, since we assisted in our Discord, I can help. The issue is that people try to fit models that don't fit, or use context that doesn't fit. That model at Q4_K_S I have successfully tested on an A100, but people who try it on 2x48GB have been reporting it doesn't fit, especially if they use Q4_K_M. Although that specific upload may also be broken. This one launches successfully for me on 1x A100:
https://huggingface.co/bartowski/Behemoth-123B-v1-GGUF/resolve/main/Behemoth-123B-v1-Q4_K_S/Behemoth-123B-v1-Q4_K_S-00001-of-00002.gguf?download=true,https://huggingface.co/bartowski/Behemoth-123B-v1-GGUF/resolve/main/Behemoth-123B-v1-Q4_K_S/Behemoth-123B-v1-Q4_K_S-00002-of-00002.gguf?download=true
If you do go for a split-GPU setup, deleting the image gen model after the fact can help, since it adds a couple of gigabytes to the first GPU; RunPod does not allow deleting it before making the pod due to a RunPod bug. I updated the error message to give that hint in the future. @nerdylive Is there a way for me to know in hindsight how much RAM that instance had? I wonder if it's being task-killed. I can't reproduce it anymore, so I suspect it was regular RAM related; my latest change should make system RAM irrelevant.
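As a rough sanity check on the fit question above, here is a back-of-envelope estimate of the weight size alone. The bits-per-weight figures are my approximations for llama.cpp K-quants, not exact values for this upload, and KV cache plus per-GPU overhead come on top:

```python
# Back-of-envelope VRAM estimate for quantized GGUF weights.
# ASSUMPTION: bits-per-weight values are approximate averages for
# llama.cpp K-quants, not measured from this specific upload.
BPW = {"Q4_K_S": 4.58, "Q4_K_M": 4.85}

def weights_gib(params_billions: float, quant: str) -> float:
    """Approximate size of the quantized weights in GiB."""
    return params_billions * 1e9 * BPW[quant] / 8 / 2**30

for quant in BPW:
    print(f"123B @ {quant}: {weights_gib(123, quant):.1f} GiB")
```

On paper both quants fit in 2x48GB, but the uneven per-GPU split, the 4096-token KV cache, and a few GiB of framework overhead per card leave little headroom at Q4_K_M, which may explain the reports above.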
nerdylive
nerdylive • 2mo ago
I don't know, maybe try using that same specific GPU setup (2x A40) in whichever cloud or DC they are using.
Henky!!
Henky!! • 2mo ago
The odd part is that all of them were listed as 100GB RAM for me, so I'd expect that to fit even without the new optimization.
nerdylive
nerdylive • 2mo ago
Or maybe ask them to confirm what they rented.
Liringlas
Liringlas (OP) • 2mo ago
Thanks a lot for your help, it did work on the last try, where I used that same way of writing it 🙂 I think this might be where I made a mistake the first time 🙂 I may work in IT myself, but in the end, even for us, the issue is most of the time located between the chair and the keyboard 😄
nerdylive
nerdylive • 2mo ago
The air~
Henky!!
Henky!! • 2mo ago
Nice that you got it working. If you want to hang out with the other KoboldCPP users: https://koboldai.org/discord
Liringlas
Liringlas (OP) • 2mo ago
Thanks a lot for your kind help, both of you!