Struggling with RunPod: unable to access HTTP server or terminal error.
Hey guys! I posted this same issue some time ago, but I can't use any of the pods at all when I try to run KoboldAI with Fallen Llama or any other model.
The first time I used this I was able to access RunPod normally, but after a few days the issue started and I've been dealing with it ever since.
Here are my screenshots.


42 Replies
Currently switching to 2x RTX 4000 Ada to see if a lack of GPUs might be the issue.
Edit: Wait, nvm, still the same crap.
It says OOM
OOM means you need better GPUs with more VRAM
Right @Henky!!
Out of memory indeed
Without the rest of the log it's hard to say
What is Fallen Llama, though?
Oh, it's a custom Llama 3
It's a 70B
Ic
Which will fit on a single A40/A6000 if it's Q4_K_S / Q4_K_M
I suspect too large a quant was used
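Rough math on why the quant matters; a sketch using typical llama.cpp bits-per-weight averages for the K-quants (assumed figures, exact sizes vary a little per model):
```python
# Back-of-envelope GGUF weight sizes for a 70B model.
# The bits-per-weight numbers are assumptions, not exact per-file values.
params = 70e9

for name, bpw in [("Q4_K_S", 4.5), ("Q4_K_M", 4.85), ("Q5_K_S", 5.5), ("Q8_0", 8.5)]:
    print(f"{name}: ~{params * bpw / 8 / 1e9:.0f} GB of weights")

# Q4_K_S: ~39 GB -> fits a 48GB A40/A6000 with room left for context
# Q4_K_M: ~42 GB -> still fits, but it gets tight
# Q8_0:   ~74 GB -> needs multiple GPUs before any context is allocated
```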
Verified the model can run

It's apparently a DeepSeek 70B Distill finetune
Kobold auto-detected it as DeepSeek
The link I used was https://huggingface.co/TheDrummer/Fallen-Llama-3.3-R1-70B-v1-GGUF/resolve/main/L33-Tiger-R1-70B-v1b-Q4_K_M.gguf?download=true , on a single A40
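Same file, if anyone prefers pulling it with the huggingface_hub client instead of the raw resolve URL (a sketch; the repo ID and filename are taken from the link above):
```python
from huggingface_hub import hf_hub_download

# Downloads the same Q4_K_M GGUF as the link above into the local HF cache.
path = hf_hub_download(
    repo_id="TheDrummer/Fallen-Llama-3.3-R1-70B-v1-GGUF",
    filename="L33-Tiger-R1-70B-v1b-Q4_K_M.gguf",
)
print(path)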
It's quite close, so if you want more than 4K context you'll have to go higher than 1xA40
With short context lengths, yes
So my problem could potentially be solved by using the max GPUs available, and after that the HTTP service would start running?
You need 48GB for Q4 at 4K context on 70Bs; you only went up to 32GB, so it crashed
Try 2xA40 first
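Rough total behind the 48GB figure, putting weights, cache, and some headroom together (the bits-per-weight, architecture, and overhead numbers are assumptions):
```python
# Why a 48GB card: Q4 weights + 4K-context KV cache + runtime overhead.
weights_gb = 70e9 * 4.85 / 8 / 1e9         # ~42 GB at Q4_K_M (assumed bpw)
kv_gb = 2 * 80 * 8 * 128 * 2 * 4096 / 1e9  # ~1.3 GB fp16 cache (assumed arch)
overhead_gb = 2                            # rough allowance for buffers/scratch
total = weights_gb + kv_gb + overhead_gb
print(f"~{total:.0f} GB")  # ~46 GB -> a 48GB A40 fits, 32GB does not
```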
Hey river! I have finally had some time to get this done, and I switched to 2xA40. It did load everything, and the terminal started working, but the HTTP service still didn't work. I ended up deleting the pod because I don't want to spend too much to maintain this.
Here are the text logs.
The log is cut off, though
It only shows the installing part
Dang
Can't access the pod to re-download now because I got rid of it lol
lol
Do you think the "HTTP service not ready" issue I had earlier was mainly because of low RAM or something? It happened with so many pods. I thought fixing the out-of-memory issue would deal with it, but I got the same "HTTP service is not ready" issue
What was your model context length?
Like the one you set
I wasn't sure. Didn't pay attention tbh
Just chose my template and the model and thought things would work out since they did before
Check your logs to figure out why the program isn't running; that's why the port isn't ready
The log cuts off right before the context portion; either you didn't wait long enough or you set too much context and overflowed the GPU. How much context did you set?
Or wait, I can see it
You set 65K context, which is probably too much for 2xA40
The single A40 could only handle 4K context at Q4_K_S
And for most normal use cases you're never going to hit 65K
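For a sense of scale, a KV cache sketch assuming Llama 3.3 70B's usual shape (80 layers, 8 KV heads with GQA, head dim 128, fp16 cache; those figures are assumptions, not read from the log):
```python
# KV cache per token = 2 (K and V) * layers * kv_heads * head_dim * bytes.
layers, kv_heads, head_dim, fp16_bytes = 80, 8, 128, 2
per_token = 2 * layers * kv_heads * head_dim * fp16_bytes  # 327,680 bytes, ~0.31 MB

for ctx in (4096, 65536):
    print(f"{ctx:>5} ctx: ~{per_token * ctx / 1e9:.1f} GB of KV cache")

# 4096 ctx:  ~1.3 GB  -> squeezes in beside ~39GB of Q4_K_S weights on one A40
# 65536 ctx: ~21.5 GB -> needs the second A40
```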
Interestingly, 65536 does fit for me on the Q4_K_S link I posted
It seems to fit in Q5_K_S as well, so I think you actually shut it down right before it would have finished loading
The main reason it took a while is that whatever pod you rented didn't have great download speeds; secure cloud downloads a lot faster
The HTTP service only comes online once it's 100% loaded
How did you see it, though?
There's a line with all the parameters it got assigned
Oh, you have to download it
lol
I'm stupid
It was likely 1 minute away from working
this
I get it though; the download took over 10 minutes on what I assume was community cloud. You probably only clicked the log after that, saw it had finished downloading, and didn't realize it had only just finished
The log said that everything was completely downloaded at the end, so I thought there was something wrong with the pod itself
Nope, you just didn't wait for it to load
It shows Cloudflare links when it's done
I see. Next time, how should I know from the logs that it actually finished loading everything? It was hard to tell
You're asking it to load 48GB of model, so that always takes a bit of memory shuffling; it's very obvious when it's genuinely done
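If eyeballing the log is annoying, you can also just poll the API until it answers. A sketch, assuming the standard KoboldCpp /api/v1/model endpoint; the URL is a placeholder for your own pod's proxy address:
```python
import time
import requests

# Placeholder; replace with your pod's actual RunPod proxy URL.
URL = "https://your-pod-id-5001.proxy.runpod.net"

while True:
    try:
        # KoboldCpp only answers this once the weights are fully loaded.
        r = requests.get(f"{URL}/api/v1/model", timeout=5)
        if r.ok:
            print("Ready:", r.json())
            break
    except requests.RequestException:
        pass  # still downloading/loading, or the proxy isn't up yet
    time.sleep(15)
```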
Do you still think the A40 is good download-speed-wise? I'm still new to all this stuff, so I don't know which RunPod has the best download speed. I can only afford the $0.44-$0.80 per hour pods

Absolutely, just not the community cloud stuff
If it was secure cloud, you had an unlucky pod, since I got 600MB/s when I tested
It's so weird lol. The first time I used RunPod with a normal template in KoboldAI, it worked fine.
But just in case the RunPod HTTP proxy fails (rarely happens, but sometimes they have an outage), we have those backup links
I assume you used a model that fit the first time
The last time you just shut it down right before it was done
Was literally 1 minute away from working
Yeah, it was hard to tell if it was going to chug forever back then XD
It was copying to the first GPU when you closed it
Loading is pretty quick; it just didn't seem that way because your download speeds didn't go above 80MB/s and even dropped to 30 at points
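Which lines up with the rough math, assuming the ~42GB Q4_K_M file:
```python
# Time to pull a ~42GB GGUF at the speeds seen vs. a good secure cloud pod.
size_mb = 42_000
for label, mb_s in [("dips", 30), ("observed", 80), ("secure cloud", 600)]:
    print(f"{label:>12}: {mb_s:>3} MB/s -> ~{size_mb / mb_s / 60:.0f} min")

# observed 80 MB/s -> ~9 min, consistent with the 10+ minute download
# 600 MB/s         -> ~1 min
```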
The Sweden and Canada datacenters give me good speeds, if you want to force one
Got it, thanks! I will try again later. Will let you know if I have some more trouble.
Hopefully I'm online at the time, since I built that template
Otherwise https://koboldai.org/discord has an active help channel
Same link the template pointed you to when it crashed the first time