RunPod
Created by Sinlore Kain on 2/8/2025 in #⛅|pods
Kobold.cpp - Remote tunnel loads before the model, causing confusion (possible off-product issue)
Here's the relevant piece of the log:
load_tensors: offloading 88 repeating layers to GPU
load_tensors: offloading output layer to GPU
load_tensors: offloaded 89/89 layers to GPU
load_tensors: CPU model buffer size = 315.00 MiB
load_tensors: CUDA0 model buffer size = 32487.19 MiB
load_tensors: CUDA1 model buffer size = 32487.19 MiB
load_tensors: CUDA2 model buffer size = 30636.42 MiB
load_all_data: no device found for buffer type CPU for async uploads
load_all_data: using async uploads for device CUDA0, buffer type CUDA0, backend CUDA0
Your remote Kobold API can be found at https://mock-up-cloudflare-link.trycloudflare.com/api
Your remote OpenAI Compatible API can be found at https://mock-up-cloudflare-link.trycloudflare.com/v1
======
Your remote tunnel is ready, please connect to https://mock-up-cloudflare-link.trycloudflare.com
.................................load_all_data: using async uploads for device CUDA1, buffer type CUDA1, backend CUDA1
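I'm not sure whether this affects smaller text models. I tested the official Kobold build and it seems to have the same issue. Did they move the remote tunnel startup to async? The main problem is that the link is printed before the model has finished loading, and opening it at that point shows: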
502 Bad Gateway
Unable to reach the origin service. The service may be down or it may not be responding to traffic from cloudflared
Everything functions properly once the model loads.
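In the meantime, a possible workaround is to poll the tunnel URL and only connect once it stops returning 502. This is just a minimal sketch, not anything Kobold or RunPod ships: `probe_status` and `wait_until_ready` are hypothetical helper names, and the idea that a 200 from a GET on the API means the model is loaded is an assumption based on the behavior described above.

```python
import time
import urllib.error
import urllib.request


def probe_status(url, timeout=10.0):
    """Return the HTTP status code for a GET on url, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # Non-2xx answers land here, e.g. 502 while the origin is not serving yet.
        return err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, and similar.
        return None


def wait_until_ready(url, probe=probe_status, interval=5.0,
                     max_wait=1800.0, sleep=time.sleep):
    """Poll url until probe returns 200.

    Returns True once the endpoint answers 200, or False if max_wait
    seconds of sleeping have elapsed without a 200 (assumed to mean
    the model is still loading or the tunnel is down).
    """
    waited = 0.0
    while True:
        if probe(url) == 200:
            return True
        if waited >= max_wait:
            return False
        sleep(interval)
        waited += interval
```

For example, `wait_until_ready("https://mock-up-cloudflare-link.trycloudflare.com/api/v1/model")` would block until the API answers. The `/api/v1/model` path is an assumption on my part; any endpoint that only returns 200 after the model has loaded would do.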