Kobold.cpp - Remote tunnel loads before the model, causing confusion (possible off-product issue)

Here's the relevant piece of the log:
load_tensors: offloading 88 repeating layers to GPU
load_tensors: offloading output layer to GPU
load_tensors: offloaded 89/89 layers to GPU
load_tensors: CPU model buffer size = 315.00 MiB
load_tensors: CUDA0 model buffer size = 32487.19 MiB
load_tensors: CUDA1 model buffer size = 32487.19 MiB
load_tensors: CUDA2 model buffer size = 30636.42 MiB
load_all_data: no device found for buffer type CPU for async uploads
load_all_data: using async uploads for device CUDA0, buffer type CUDA0, backend CUDA0
Your remote Kobold API can be found at https://mock-up-cloudflare-link.trycloudflare.com/api
Your remote OpenAI Compatible API can be found at https://mock-up-cloudflare-link.trycloudflare.com/v1
======
Your remote tunnel is ready, please connect to https://mock-up-cloudflare-link.trycloudflare.com
.................................load_all_data: using async uploads for device CUDA1, buffer type CUDA1, backend CUDA1
load_tensors: offloading 88 repeating layers to GPU
load_tensors: offloading output layer to GPU
load_tensors: offloaded 89/89 layers to GPU
load_tensors: CPU model buffer size = 315.00 MiB
load_tensors: CUDA0 model buffer size = 32487.19 MiB
load_tensors: CUDA1 model buffer size = 32487.19 MiB
load_tensors: CUDA2 model buffer size = 30636.42 MiB
load_all_data: no device found for buffer type CPU for async uploads
load_all_data: using async uploads for device CUDA0, buffer type CUDA0, backend CUDA0
Your remote Kobold API can be found at https://mock-up-cloudflare-link.trycloudflare.com/api
Your remote OpenAI Compatible API can be found at https://mock-up-cloudflare-link.trycloudflare.com/v1
======
Your remote tunnel is ready, please connect to https://mock-up-cloudflare-link.trycloudflare.com
.................................load_all_data: using async uploads for device CUDA1, buffer type CUDA1, backend CUDA1
Not sure if it affects smaller text models; I tested official Kobold and it seems to have the same issue. Did they move the remote tunnel to async? The main problem is that the link shows
502 Bad Gateway
Unable to reach the origin service. The service may be down or it may not be responding to traffic from cloudflared
Everything functions properly once the model loads.
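As a workaround while the model is still loading, you can poll the tunnel URL until the backend actually answers, instead of trusting the link the moment it prints. This is a minimal sketch, not part of KoboldCpp; the function name and parameters are made up:

```python
import time
import urllib.error
import urllib.request

def wait_until_ready(url, timeout=600, interval=5):
    """Poll url until it answers with HTTP 200, or give up after timeout seconds.

    While the model is still loading, cloudflared returns 502 because the
    origin isn't listening yet; we simply retry until it comes up.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # origin not reachable yet (502 raises HTTPError, a URLError subclass)
        time.sleep(interval)
    return False
```

Point it at the tunnel URL from the log and it returns True once the model has finished loading and the server starts answering.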
nerdylive · 3w ago
Maybe it's more because of the app/library that serves your model. If you create Cloudflare tunnels in your pod, it should work great.
Henky!! · 2w ago
We are aware of this; it's because we changed how those tunnels work and accidentally have a duplicate output. It will be fixed in a future version. But moving the tunnels to async is exactly what it is: we added a feature that allows remote switching between model configs (not exposed yet on RunPod), and for that we need to keep the tunnels in their own process.
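The split described here also explains the ordering in the log: once the tunnel lives in its own process, it announces its URL as soon as cloudflared connects, independently of how long the tensor upload takes. A purely illustrative sketch (a thread stands in for the tunnel process, and both workers are made up, not KoboldCpp code):

```python
import queue
import threading
import time

events = queue.Queue()

def tunnel_worker():
    # Stand-in for the separate tunnel process: it only has to wait for
    # the cloudflared handshake, so it reports ready almost immediately.
    time.sleep(0.05)
    events.put("tunnel ready: https://example.trycloudflare.com")

def load_model():
    # Stand-in for the model load: uploading ~95 GiB of tensors to three
    # GPUs takes far longer than the tunnel handshake.
    time.sleep(0.5)
    events.put("model loaded")

t = threading.Thread(target=tunnel_worker, daemon=True)
t.start()          # tunnel starts concurrently...
load_model()       # ...while the main process loads the model
t.join()
```

Because the two run concurrently, the "tunnel ready" line lands in the output well before "model loaded", and any request made in that window hits a 502 from cloudflared.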
nerdylive · 2w ago
Oh, I didn't see it had Kobold in it lol, nc
Solution
Henky!! · 2w ago
Should be fixed
Sinlore Kain (OP) · 2w ago
Fixed, no longer reproducible, thank you.
Sinlore Kain (OP) · 2w ago
Knew it! Moving those to async, though, is really good.
Henky!! · 2w ago
It got reversed for people not using the new mode, since they don't need it.
Henky!! · 2w ago
If you want KoboldAI support, I also recommend https://koboldai.org/discord where we hang out; I see things there much sooner.
Henky!! · 2w ago
Or the #KoboldCpp channel (Chatbots - Instruct - Story Writing - Adventure Games - OpenAI/Ollama API - Fast Bootup!), since I do check that channel.
Sinlore Kain (OP) · 2w ago
Joined, ty.
