Struggling with runpod unable to access htpp server or terminal error.

Hey guys! I have posted this same issue some time ago, but I cannot use any of the pods at all when I try to run Kobold AI with Fallen Llama or any other Model. The first time I used this I been able to access runpod normally, but after a few days I am dealing with the runpod issue ever since. I have this for my screenshots.
No description
No description
No description
42 Replies
Arsen32
Arsen32OP2w ago
Currently switching to a 2x RTX 4000 Ada to see if a lack of GPU's might be the issue. Edit: Wait nvm still same crap.
Arsen32
Arsen32OP2w ago
No description
Jason
Jason2w ago
It says oom Oom means you need better gpus with more ram Right @Henky!!
Henky!!
Henky!!2w ago
Out of memory indeed Without the rest of the log its hard to say
Jason
Jason2w ago
Which is fallen llama tho Oh it's a custom llama 3
Henky!!
Henky!!2w ago
Its a 70B
Jason
Jason2w ago
Ic
Henky!!
Henky!!2w ago
Which will fit on a single A40/A6000 if its Q4_K_S / Q4_K_M I suspect a to large quant was used
Henky!!
Henky!!2w ago
Verified the model can run
No description
Henky!!
Henky!!2w ago
Its apparently a Deepseek 70B Distill finetune Kobold auto picked it up as deepseek Link I used was https://huggingface.co/TheDrummer/Fallen-Llama-3.3-R1-70B-v1-GGUF/resolve/main/L33-Tiger-R1-70B-v1b-Q4_K_M.gguf?download=true , on a single A40 Its quite close so if you want more than 4K context you will have to go higher than the 1xA40
riverfog7
riverfog72w ago
With short context lengths yes
Arsen32
Arsen32OP2w ago
So my problem could be solved by potentially using max GPUs available and after that it would get the http running?
Henky!!
Henky!!2w ago
You need 48GB for Q4 at 4K on 70B's, you only went up to 32GB so it crashed
riverfog7
riverfog72w ago
try 2xA40 first
Arsen32
Arsen32OP2w ago
Hey river! I have finally had some time to get this done, and I switched to 2xA40. It did load everything, and the terminal started working, but the http service still didn't work. I ended up deleting the pod because I don't want to spend too much to maintain this. Here's the text logs.
riverfog7
riverfog72w ago
The log is cut tho only shows the installing part
Arsen32
Arsen32OP2w ago
Dang Can't access the pod to re-download now coz I got rid of it lol
riverfog7
riverfog72w ago
lol
Arsen32
Arsen32OP2w ago
Do you think the http service not ready issue I had earlier was mainly because of low ram or something? Happened for so many pods. I thought fixing the out of memory issue would deal with it but I got the same Http service is not ready issue
riverfog7
riverfog72w ago
what was your model context length like the one you set
Arsen32
Arsen32OP2w ago
I wasn't sure. Didn't pay attention tbh Just chose my template and the model and thought things would work out since they did before
Jason
Jason2w ago
Check your logs to figure out why the program isn't running, that's why port isn't ready
Henky!!
Henky!!2w ago
The log cuts off right before the context portion, either you didn't wait long enough or you put to much context and overflowed the GPU. How much context did you put? Or wait I can see it You put 65K context, which is probably to much for 2xA40 The single A40 could only handle 4K context at Q4_K_S And for most normal use cases your never going to hit 65K Interestingly 65536 does fit for me on the Q4_K_S link I posted Seems to fit in Q5_K_S as well so I think you actuallly shut it down right before it would have finished loading The main reason it took a while is that whatever pod you rented didn't have great download speeds, secure cloud downloads a lot faster The http only comes online once its 100% loaded
riverfog7
riverfog72w ago
how did u see it tho
Henky!!
Henky!!2w ago
Theres a line with all the parameters it got assigned
riverfog7
riverfog72w ago
oh you hav to download it lol im stupid
Henky!!
Henky!!2w ago
It was likely 1 minute away from working
riverfog7
riverfog72w ago
this
Henky!!
Henky!!2w ago
I get it though, download took over 10 minutes on what I assume was community cloud, probably only clicked the log after that seeing it had finished downloading not realizing it only just finished
Arsen32
Arsen32OP2w ago
The log said that everything was completely downloaded at the end, so I thought there was something wrong with the pod itself
Henky!!
Henky!!2w ago
Nope just didn't wait for it to load It shows cloudflare links when its done
Arsen32
Arsen32OP2w ago
I see. How should I know next time in the logs it actually finished loading everything? It was hard to tell
Henky!!
Henky!!2w ago
Your asking it to load 48GB of model so that always takes a bit of memory shuffling, its very obvious when its genuinely done
Arsen32
Arsen32OP2w ago
Do you still think A40 is good download speed wise? I am still new to all this stuff so dont know which ru pod has the best download speed. Can only afford the .44cents-.80cents per hour pods
Henky!!
Henky!!2w ago
No description
Henky!!
Henky!!2w ago
Absolutely, just not the community cloud stuff If it was secure cloud you had an unlucky pod since I got 600mb/s when I tested
Arsen32
Arsen32OP2w ago
It's so weird lol. The first time I used the runpod with a normal template in kobold ai it worked fine.
Henky!!
Henky!!2w ago
But just in case the runpod http proxy fails (Rarely happens but sometimes they have an outtage) we have those backup links I assume you used a model that fit the first time The last time you just shut it down right before it was done Was literally 1 minute away from working
Arsen32
Arsen32OP2w ago
Yeah, it was hard to tell if it was going to chug forever back then XD
Henky!!
Henky!!2w ago
It was copying to the first GPU when you closed it The loading is pretty quick, its just that it didn't seem that way because your download speeds didn't go above 80mb/s and even dropped to 30 at points The sweden and canada datacenters give me good speeds if you want to force it
Arsen32
Arsen32OP2w ago
Got it, thanks! I will try again later. Will let you know if I have some more trouble.
Henky!!
Henky!!2w ago
Hopefully I am online at the time since I built that template Otherwise https;//koboldai.org/discord has an active help channel Same link the template pointed you to when it crashed the first time

Did you find this page helpful?