RunPod

We're a community of enthusiasts, engineers, and enterprises, all sharing insights on AI, Machine Learning and GPUs!

⛅|pods

There is no pod available

Hi! All GPU Pods, whether Secure or Community, are unavailable, no matter what filter you use. What's going on? Edit: Now it seems to be working, but the page is taking a long time to load. Is there any maintenance work going on?...
Solution:
There is no maintenance; otherwise everyone would be affected, not just you. It sounds like an issue with your internet connection.

wget not working inside the terminal for stable diffusion webUI

When I try to run the wget command to download models from Civitai, it throws an error about a username and password. I've watched many videos about it and seem to be doing everything right, but I still can't get it to work.

RTX 6000 Ada performance much worse than expected

Based on the NVIDIA specs, I would expect its performance to be on the order of 10-20% slower than the L40S. However, in my current training run (FP16 mixed precision), I am finding it closer to 2x slower or worse. That is pretty bad considering the price. Perhaps there is some other issue in how the pods or nodes are set up that could be worth looking into?

Slow model download speeds/bandwidth

Can anyone explain to me why the download speed from Hugging Face is so bad on RunPod? I consistently get 10-30 MB/s download speeds compared to 100+ MB/s on Vast.ai. I have often had to keep instances running for 1-5 hours just to download Llama 3 70B or LLaVA 34B. Quite frankly, this issue is so bad that it has pushed me to Vast.ai for most model training. The only issue I have seen with Vast is that I can't select 5 GPUs instead of 4 or 8, which is the required amount for our use case. Running batc...

Container Log From Saved Storage stuck on loading loop

I have saved Stable Diffusion storage on the EU-SE-1 server, but the pod never finishes loading, even after multiple attempts of waiting 15 minutes each with the A40/RTX6000/RTX5000. I have tried deploying via 'Deploy GPU Pod' and from 'Storage.' How can I check where the issue is in JupyterLab or via the terminal? Do I need to remove or add any arguments?

Server Volume Access

I'm using RunPod primarily to run the Stable Diffusion WebUI. I also set up a Server volume so that I could upload models and have them persist across any pod I create, but I can't find out how to access the volume storage rather than the temporary pod storage. How can I upload items to the volume so that they won't be deleted every time I terminate a pod?

Cloud Sync - "Something went wrong"

I have tried to set up Cloud Sync with both Google Cloud and Backblaze, and both have issues when I try to sync. I get the same "Something went wrong" message. Sometimes, if I keep entering my bucket data again and again, it eventually starts syncing, even though I get the error message every time. Other times, like now, it will not sync at all...

Feature Request: `runpodctl send` TO specific machine & folder (ala SCP)

This can be achieved today by running `runpodctl send foo`, then `ssh machine 'cd /workspace && runpodctl receive ...'`...

SSH connection issue

Is anyone else having problems connecting to pods with SSH via TCP at the moment? I'm getting "connection refused" every time. Connecting with ssh.runpod.io works, though. I never had this issue before and have already tried from different networks...

Better solution for 0 GPU stranded volumes

Since on-demand GPUs can get taken, it would be great to have some better escape valves for getting our data off the volume. Right now, the 0.5 vCPU, 512 MB RAM pod you provide keeps killing my upload task. I would happily pay for more resources to speed up getting my data out. It would also be nice to be able to attach a network volume to a pod after creation, or to have cross-region network volumes. A network volume that only works in the same region is of limited value, because a big reason for moving...

Kasmweb Runpod Desktop failing to connect

Hi there, I have tried setting up runpod/kasm-docker:cuda11 multiple times now, but I have not been able to connect to the pod in any attempt. When I click connect to HTTP service/terminal, a new browser tab opens, but the page fails to connect every time. Is this a known issue with the Kasmweb RunPod Desktops?...

pod terminate after command finishes

Hi folks, it seems that when RunPod notices that the entrypoint command for my pod has finished, it restarts the container and runs it again. Is that expected, and is there any way to turn that off and have the pod terminate instead of re-running?

waiting for logs....

Hi, I wanted to start an RTX A4000 pod with Stable Diffusion, but I only got "waiting for logs" for more than 5 minutes, so I stopped it after some time. Is there an overload, or should I look for the problem on my side? I'm new on runpod.io.

Kohya_SS - Clicked "Start Training" button... how can I tell that it's working?

I'm running Kohya_ss through Runpod (via Stable Diffusion Kohya_ss ComfyUI Ultimate template). When I click "Start Training" the GUI gives me no indication that anything is happening. Because of how long this process takes it's hard to know whether an error happened or not. Everything I read seems to suggest that I should be able to see the training happening via the Terminal – if nothing else to confirm that activity is taking place and things are working. ...

GPU pods taking long time to install python packages

Horrible download speeds. It is actually disrupting my productivity, as I have to wait 2-3 hours for a few libraries to download. Please look into this and kindly help.

GPU not being used

I'm using RunPod SD ComfyUI and ComfyUI - AI-Dock, but neither uses the GPU. Secure Cloud, 1 x RTX A4000. ID: h9zlhkckse9sx8. Region: RO.

4090 GPUs in EU-RO-1 not available or with full memory

When starting up 4090s in this region, they either get stuck in "waiting for logs" before I can access them, or their memory is already full. Please fix ASAP 🙏

While running a Python file in my pod, I encounter a ModuleNotFoundError for tkinter

I have installed tkinter using pip and through `apt-get install python3-tk`. I am still getting the error. I don't care about the GUI as much as I care that my application completes (the application results are independent of GUI tools). What can I do?
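
If the results really are independent of the GUI, one possible workaround is to treat tkinter as an optional import so the script can finish headless. This is a minimal sketch, not a confirmed fix for this pod: `load_optional_gui` and `run_app` are hypothetical names, and the computation inside `run_app` is only a stand-in for the real application logic.

```python
import importlib


def load_optional_gui(module_name="tkinter"):
    """Return the imported GUI module, or None if it cannot be imported."""
    try:
        return importlib.import_module(module_name)
    except ImportError:  # ModuleNotFoundError is a subclass of ImportError
        return None


def run_app(gui_module):
    """Hypothetical entry point: do the real work, use the GUI only if present."""
    result = sum(range(10))  # stand-in for the application's actual computation
    mode = "gui" if gui_module is not None else "headless"
    return result, mode


if __name__ == "__main__":
    tk = load_optional_gui()
    print(run_app(tk))
```

Two things may be worth checking alongside this. `python3-tk` installs tkinter for the system interpreter only, so a conda or virtualenv Python in the pod may still not see it. And even when the import succeeds, opening a window with `tkinter.Tk()` needs a display, which a plain pod terminal normally does not have.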

google colab image

I used the Colab image available at us-docker.pkg.dev/colab-images/public/runtime:latest. The image works and gives the following logs. I added port 9000 to the HTTP ports to expose in the pod settings, but after clicking Connect, the dialog shows that the HTTP service is not ready yet.

Switch off pod after 2 hours

Hello, I'm new to RunPod. It seems I didn't turn off my pod, and it used up all my credit. How can I protect myself against this?...
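
One way to guard against this might be a small timer started inside the pod that stops it after a fixed delay. The sketch below makes two assumptions that should be checked against the RunPod documentation before relying on it: that `runpodctl stop pod <id>` is available in the pod, and that the pod's ID is exposed in the `RUNPOD_POD_ID` environment variable. `build_stop_command` and `stop_after` are hypothetical helper names.

```python
import os
import subprocess
import time


def build_stop_command(pod_id):
    """Build the CLI call that stops a pod (assumes `runpodctl stop pod <id>`)."""
    return ["runpodctl", "stop", "pod", pod_id]


def stop_after(delay_seconds, pod_id, sleep=time.sleep, run=subprocess.run):
    """Wait for delay_seconds, then run the stop command and return it."""
    sleep(delay_seconds)
    command = build_stop_command(pod_id)
    run(command, check=False)
    return command


if __name__ == "__main__":
    # Assumption: RUNPOD_POD_ID is set inside the pod; do nothing if it is missing.
    pod_id = os.environ.get("RUNPOD_POD_ID")
    if pod_id:
        stop_after(2 * 60 * 60, pod_id)  # two hours
```

The `sleep` and `run` parameters exist only so the timer can be tried with a zero delay and a fake runner before trusting it with real credit. Stopping a pod is not the same as terminating it, so it is worth checking what a stopped pod still costs.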