Issue with KoboldCPP - official template
I tried two models (103B Midnight Miqu v1.0 and 123B Behemoth v1.1) in Q4 GGUF on a pod with the https://www.runpod.io/console/explore/2peen7lpau template. In both cases the model downloads successfully (two files each).
When launching KoboldCPP, I get the following error:
Something possibly went wrong, stalling for 3 minutes before exiting so you can check for errors.
The full logs are included.
- The pod had 2x A40 48GB GPUs with the default 125GB temporary container disk, and the default environment variables except for the model address.
The default KCPP args should allow 2 GPUs, if I understand correctly: --usecublas mmq --gpulayers 999 --contextsize 4096 --multiuser 20 --flashattention --ignoremissing
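For reference, my understanding is the template ends up launching something roughly equivalent to this by hand (the model path here is just a placeholder, not the actual path on the pod):
```
python koboldcpp.py /workspace/model.gguf \
  --usecublas mmq --gpulayers 999 --contextsize 4096 \
  --multiuser 20 --flashattention --ignoremissing
```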
Thanks a lot!
https://discord.com/channels/912829806415085598/1118945694863065230/1300828220907851806 There was a similar post, but with no reply other than the guide, which I had already followed in the first place.
@Henky!! relevant to your template?
Thank you, I tried again following instructions I found on another Discord. The KCPP_MODEL env variable is written a bit differently: https://huggingface.co/bartowski/Behemoth-123B-v1.1-GGUF/resolve/main/Behemoth-123B-v1.1-Q4_K_M/Behemoth-123B-v1.1-Q4_K_M-00001-of-00002.gguf?download=true,https://huggingface.co/bartowski/Behemoth-123B-v1.1-GGUF/resolve/main/Behemoth-123B-v1.1-Q4_K_M/Behemoth-123B-v1.1-Q4_K_M-00002-of-00002.gguf?download=true with no space after the comma and "?download=true" at the end of both links, which I did not use the first time. This time it worked. Not sure what the issue was the first time. Is it the formatting of the variable?
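In other words, the whole value goes on one line, comma-separated with no spaces (env var name as used by the template, copied from what worked for me):
```
KCPP_MODEL=https://huggingface.co/bartowski/Behemoth-123B-v1.1-GGUF/resolve/main/Behemoth-123B-v1.1-Q4_K_M/Behemoth-123B-v1.1-Q4_K_M-00001-of-00002.gguf?download=true,https://huggingface.co/bartowski/Behemoth-123B-v1.1-GGUF/resolve/main/Behemoth-123B-v1.1-Q4_K_M/Behemoth-123B-v1.1-Q4_K_M-00002-of-00002.gguf?download=true
```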
Ah, it looks like we already assisted in our Discord, but I can help here too
The issue is that people try to fit models that don't fit
Or use context that doesn't fit
I have successfully tested the Q4_K_S of that model on an A100
But people who try it on 2x 48GB have been reporting it doesn't fit, especially if they use Q4_K_M
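As a rough sanity check (assuming Q4_K_M averages around 4.85 bits per weight, which is my estimate, not a figure from this thread): 123B x 4.85 / 8 is roughly 75 GB for the weights alone, before KV cache, per-GPU CUDA overhead, and the bundled image model, so an even split across 2x 48GB is tight.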
Although that specific upload may also be broken
This one launches successfully for me on 1x A100:
If you do go for a split-GPU setup, deleting the image gen model after the fact can help, since it adds a couple of gigabytes to the first GPU. RunPod does not allow deleting it before making the pod due to a RunPod bug.
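If you want to double-check how the memory actually ends up split across the two cards, plain nvidia-smi inside the pod works (standard tooling, nothing template-specific):
```
nvidia-smi --query-gpu=index,memory.used,memory.total --format=csv
```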
Updated the error message to give that hint in the future
@nerdylive Is there a way for me to know in hindsight how much RAM that instance had? I wonder if it's being task killed
I can't reproduce it anymore, so I suspect it was regular RAM related; my latest change should make system RAM irrelevant
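For anyone checking this later, a quick way to see whether the kernel OOM-killed the process while the pod is still up (standard Linux tooling, not part of the template, and dmesg may need sufficient container privileges):
```
free -h                                              # how much system RAM the instance actually has
dmesg -T | grep -iE "killed process|out of memory"   # any OOM-killer activity
```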
I don't know, maybe try using that same specific GPU setup (2x A40) in whichever cloud or DC they are using
The odd part is that all of them were listed as 100GB RAM for me, so I'd expect that to fit even without the new optimization
Or maybe ask them to confirm what they rented
Thanks a lot for your help, it did work on the last try, where I used that same way of writing it
I think this might be where I made a mistake the first time
I may work in IT myself, but in the end, even for us, the issue is most of the time located between the chair and the keyboard
The air~
Nice that you got it working. If you want to hang out with the other KoboldCPP users: https://koboldai.org/discord
Thanks a lot for your kind help, both of you!