RunPod

We're a community of enthusiasts, engineers, and enterprises, all sharing insights on AI, Machine Learning and GPUs!

wget doesn't work on Civitai models

I tried to use wget with the download link, but it says unauthorized access and nothing happens. It was a 5 GB SD1.5 model. How do I fix this?
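
The thread does not record an answer. One possible cause is that the download endpoint expects an API key. The following is a minimal sketch, assuming Civitai accepts a personal API key as a `token` query parameter on the download URL; the helper names, the example URL, and the key are all hypothetical, and the code has not been run against the live service.

```python
import shutil
import urllib.request
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_token(url: str, api_token: str) -> str:
    """Return `url` with a `token` query parameter set (assumed auth scheme)."""
    parts = urlsplit(url)
    # Keep any existing query parameters, but replace a stale token if present.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != "token"]
    query.append(("token", api_token))
    return urlunsplit(parts._replace(query=urlencode(query)))


def download(url: str, api_token: str, dest_path: str) -> None:
    """Stream the file to disk so a multi-GB model is never held in memory."""
    with urllib.request.urlopen(with_token(url, api_token)) as response:
        with open(dest_path, "wb") as out:
            shutil.copyfileobj(response, out)
```

The same authenticated URL could also be passed to wget in quotes, so that the shell does not interpret the `?` and `&` characters.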

0 x 4090

Hello all, I'm sure this has been asked somewhere but I can't seem to find anything on this. I had a pod with 1 x 4090 which is now at 0. Is there a chance that I will ever be able to deploy that pod with a GPU again, or to transfer the data on that pod to a new pod? I'm new to this so I'm not entirely sure how this system works.
Solution:
You can wait for the GPU to become available again, but that could take weeks or months, so better to create a new pod. RunPod allows you to start your pod with 0x GPU so that you can transfer your data to a new pod. You can also use network storage so that you don't have to worry about transferring your data in the case that someone else takes the GPU.

A New Gold Tutorial For RunPod & Linux Users : How To Use Storage Network Volume In RunPod & Latest

A New Gold Tutorial For RunPod & Linux Users : How To Use Storage Network Volume In RunPod & Latest Version Of Automatic1111 With All ControlNet Models, InstantID & More : https://www.youtube.com/watch?v=8Qf4x3-DFf4

How to scale pod GPU count properly?

Hello, we have some pods running with 2x4090. What is the best way to increase this to, e.g., 4x4090 while making sure that our existing allocated GPUs will not be taken by others, even though we are running on-demand?

distributed training

Is it possible to set up a Slurm cluster for distributed training on RunPod?

How can I bulk download all the images generated in my output folder?

How can I bulk download all the images generated in my output folder (Fooocus)? I'm in the JupyterLab file browser but can only download single files. It is not possible to download folders, or to zip all the files and then download them!...
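
No answer is recorded for this thread. One workaround that might help is to pack the folder into a single archive from a notebook cell and then download that one file through the file browser. This is a minimal sketch using the Python standard library; the function name `zip_folder` and the paths in the usage note are hypothetical and should be replaced with your actual output folder.

```python
import shutil
from pathlib import Path


def zip_folder(folder: str, archive_stem: str) -> Path:
    """Pack the contents of `folder` into `<archive_stem>.zip` and return its path."""
    source = Path(folder)
    if not source.is_dir():
        raise NotADirectoryError(folder)
    # make_archive appends ".zip" itself and returns the archive path as a string.
    return Path(shutil.make_archive(archive_stem, "zip", root_dir=source))
```

For example, `zip_folder("/workspace/Fooocus/outputs", "/workspace/fooocus_images")` would write `/workspace/fooocus_images.zip`, assuming that folder exists. Writing the archive outside the folder being zipped avoids the archive trying to include itself.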

Data loss on pod

I rented a pod with GPU type A5000, but suddenly the GPU type changed to RTX 3090 and all my data (150 GB) on the original pod was removed. The pod ID remains, but all my progress is gone.

Upload files to Network volume? Two days spent on this and can't make it happen

HOW do I get the local safetensors LLM files on my PC onto the network volume? Is the CLI the only way? I'm resorting to putting them on Hugging Face so I can access them that way via "theBloke LLM template". I'm not even sure if this will work, but I have spent hours on Google and YouTube trying to find out how. Sad.

Shell asks for a password when I try to ssh to a secure cloud pod (with correct public key set)

I have a correctly formatted public key set and SSH enabled. It still asks for a password when I SSH in.

runpodctl create pod for CPU only

Hello, I am trying to create a pod from the CLI with runpodctl create pod --gpuCount 0, but I get this error: Error: required flag(s) "gpuType" not set. If I add a random gpuType, I get this error: Error: There are no longer any instances available with the requested specifications. Has anyone managed to launch a CPU instance?...
Solution:
Not supported yet, but this is great feedback!

docker not found

Hello, I get an error from the container attempting to launch: /opt/nvidia/nvidia_entrypoint.sh: line 49: exec: docker: not found. It seems the system loading the container doesn't have Docker? Here is my template:...

How to mount network volume to the pod?

Hey all, I created a network volume and have a pod. How do I connect the network volume to the pod? There are so many unhelpful docs with broken links, no proper documentation about something that should be simple and straightforward....

Securing Gradio App on Runpod with IP Whitelist

Hello, I'm running a Gradio app configured with share=True on RunPod. I can access it, but I'd like to improve security by whitelisting specific IP addresses. I've tried adding echo "python gradio_demo/app.py : 127.0.0.1, <MY IP>" >> /etc/hosts.allow, but this hasn't worked. Could anyone give me some advice? Thank you so much...
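
No answer is recorded for this thread. A likely reason the attempt had no effect is that /etc/hosts.allow is only consulted by services built with TCP wrappers support, which a Python web app normally is not. The check itself can be done in application code; below is a minimal sketch of just the allowlist test, using the standard `ipaddress` module. The function name `is_allowed` is hypothetical, and how to hook it into the Gradio app (and how to obtain the real client IP behind a proxy or share link) is left open here.

```python
import ipaddress
from typing import Iterable


def is_allowed(client_ip: str, allowlist: Iterable[str]) -> bool:
    """Return True if `client_ip` matches any address or CIDR range in `allowlist`."""
    try:
        address = ipaddress.ip_address(client_ip)
    except ValueError:
        # Anything that is not a valid IP address is rejected outright.
        return False
    # strict=False lets entries such as "203.0.113.5/24" be read as a network.
    return any(address in ipaddress.ip_network(entry, strict=False)
               for entry in allowlist)
```

For example, `is_allowed("203.0.113.7", ["127.0.0.1", "203.0.113.0/24"])` accepts the address because it falls inside the listed range.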

Load a new network volume into a pod?

I am new to RunPod. Recently, I created a network volume and tried to load it into a GPU pod. When I used the web terminal to get in and install koboldcpp, I noticed that the root volume only has 10 GB, while another drive has 5 TB, but my network volume is just 50 GB. I am wondering whether I really loaded it into the pod correctly....
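
No answer is recorded for this thread. One way to inspect what a mount point reports is to query it directly. This is a minimal sketch using the standard library; the function name `describe_mount` is hypothetical, and `/workspace` in the usage note is an assumption about where the volume is mounted. The total shown for a network mount may describe the underlying shared filesystem instead of your own quota, which could be one explanation for a 5 TB figure, though that is a guess.

```python
import shutil


def describe_mount(path: str) -> str:
    """Summarise total, used and free space for the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    gib = 1024 ** 3  # bytes per GiB
    return (f"{path}: total {usage.total / gib:.1f} GiB, "
            f"used {usage.used / gib:.1f} GiB, "
            f"free {usage.free / gib:.1f} GiB")
```

Calling `print(describe_mount("/"))` and `print(describe_mount("/workspace"))` side by side shows whether the two paths sit on different filesystems.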

The Bloke LLM Template ExLlamaV2Cache_Q4 Error

Has anyone found a way around this? I used to run the pip install --upgrade exllamav2 command in the terminal, but now that doesn't work. It worked yesterday, so I'm guessing some things have changed. Judging by the issues tab on GitHub, the problem has been going on for two or so weeks: https://github.com/TheBlokeAI/dockerLLM/issues/17. Using pip install --upgrade --no-deps exllamav2 solves it for now, but that is only temporary. I'm wondering if anyone has an update...

Hello, I have a Docker image downloaded onto the pod. How do I use my custom image?

I don't think I can run Docker inside Docker, can I? I appreciate any guidance here!...

Machine does not support exposing a TCP port

My prod pod needs to expose public network ports, so we used the template to configure "Expose TCP ports" when creating the pod. This has been working fine, but recently I found a pod that does not support "expose port". Is this normal? One of the pod IDs is "bl1sbvy65vkwki"...

Cannot Install JAX

Hello, I am currently unable to properly install JAX on both the A100 SXM 80GB and the H100 80GB SXM5 in the Secure Cloud. When I run the command pip install --upgrade "jax[cuda12_pip]" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html I get the following error (partially shown) that there are dependency conflicts with torch:...