Kobold.cpp - Remote tunnel loads before the model, causing confusion (possible off-product issue)

Here's the relevant piece of the log:
load_tensors: offloading 88 repeating layers to GPU
load_tensors: offloading output layer to GPU
load_tensors: offloaded 89/89 layers to GPU
load_tensors: CPU model buffer size = 315.00 MiB
load_tensors: CUDA0 model buffer size = 32487.19 MiB
load_tensors: CUDA1 model buffer size = 32487.19 MiB
load_tensors: CUDA2 model buffer size = 30636.42 MiB
load_all_data: no device found for buffer type CPU for async uploads
load_all_data: using async uploads for device CUDA0, buffer type CUDA0, backend CUDA0
Your remote Kobold API can be found at https://mock-up-cloudflare-link.trycloudflare.com/api
Your remote OpenAI Compatible API can be found at https://mock-up-cloudflare-link.trycloudflare.com/v1
======
Your remote tunnel is ready, please connect to https://mock-up-cloudflare-link.trycloudflare.com
.................................load_all_data: using async uploads for device CUDA1, buffer type CUDA1, backend CUDA1
load_tensors: offloading 88 repeating layers to GPU
load_tensors: offloading output layer to GPU
load_tensors: offloaded 89/89 layers to GPU
load_tensors: CPU model buffer size = 315.00 MiB
load_tensors: CUDA0 model buffer size = 32487.19 MiB
load_tensors: CUDA1 model buffer size = 32487.19 MiB
load_tensors: CUDA2 model buffer size = 30636.42 MiB
load_all_data: no device found for buffer type CPU for async uploads
load_all_data: using async uploads for device CUDA0, buffer type CUDA0, backend CUDA0
Your remote Kobold API can be found at https://mock-up-cloudflare-link.trycloudflare.com/api
Your remote OpenAI Compatible API can be found at https://mock-up-cloudflare-link.trycloudflare.com/v1
======
Your remote tunnel is ready, please connect to https://mock-up-cloudflare-link.trycloudflare.com
.................................load_all_data: using async uploads for device CUDA1, buffer type CUDA1, backend CUDA1
Not sure if it affects smaller text models; I tested official Kobold and it seems to have the same issue. Did they move the remote tunnel to async? The main problem is that the link shows
502 Bad Gateway
Unable to reach the origin service. The service may be down or it may not be responding to traffic from cloudflared
Everything functions properly once the model loads.
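As a workaround while the model is still loading, you can poll the tunnel URL until the backend actually answers, instead of trusting the link the moment it prints. This is a minimal sketch, not part of KoboldCpp; the function name and parameters are made up:

```python
import time
import urllib.error
import urllib.request

def wait_until_ready(url, timeout=600, interval=5):
    """Poll url until it answers with HTTP 200, or give up after timeout seconds.

    While the model is still loading, cloudflared returns 502 because the
    origin isn't listening yet; we simply retry until it comes up.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # origin not reachable yet (502 raises HTTPError, a URLError subclass)
        time.sleep(interval)
    return False
```

Point it at the tunnel URL from the log and it returns True once the model has finished loading and the server starts answering.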
nerdylive · 3w ago
Maybe it's more because of the app/library that serves your model. If you create Cloudflare tunnels in your pod, it should work great.
Henky!! · 2w ago
We are aware of this; it's because we changed how those tunnels work and accidentally have a duplicate output. It will be fixed in a future version. But moving the tunnels to async is exactly what it is: we added a feature that allows remote switching between model configs (not exposed yet on RunPod), and for that we need to keep the tunnels in their own process.
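The split described here also explains the ordering in the log: once the tunnel lives in its own process, it announces its URL as soon as cloudflared connects, independently of how long the tensor upload takes. A purely illustrative sketch (a thread stands in for the tunnel process, and both workers are made up, not KoboldCpp code):

```python
import queue
import threading
import time

events = queue.Queue()

def tunnel_worker():
    # Stand-in for the separate tunnel process: it only has to wait for
    # the cloudflared handshake, so it reports ready almost immediately.
    time.sleep(0.05)
    events.put("tunnel ready: https://example.trycloudflare.com")

def load_model():
    # Stand-in for the model load: uploading ~95 GiB of tensors to three
    # GPUs takes far longer than the tunnel handshake.
    time.sleep(0.5)
    events.put("model loaded")

t = threading.Thread(target=tunnel_worker, daemon=True)
t.start()          # tunnel starts concurrently...
load_model()       # ...while the main process loads the model
t.join()
```

Because the two run concurrently, the "tunnel ready" line lands in the output well before "model loaded", and any request made in that window hits a 502 from cloudflared.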
nerdylive · 2w ago
Oh, I didn't see it had Kobold in it lol, nc
Solution
Henky!! · 2w ago
Should be fixed
Sinlore Kain (OP) · 2w ago
Fixed, no longer reproducible, thank you.
Sinlore Kain (OP) · 2w ago
Knew it! Moving those to async, though, is really good.
Henky!! · 2w ago
It got reversed for people not using the new mode, since they don't need it.
Henky!! · 2w ago
If you want KoboldAI support, I also recommend https://koboldai.org/discord where we hang out; I see things there much sooner.
Henky!! · 2w ago
Or the #KoboldCpp channel (Chatbots - Instruct - Story Writing - Adventure Games - OpenAI/Ollama API - Fast Bootup!), since I do check that channel.
Sinlore Kain (OP) · 2w ago
Joined, ty.
