RunPod

We're a community of enthusiasts, engineers, and enterprises, all sharing insights on AI, Machine Learning and GPUs!

Chilistick

3/25/2024

wget doesn't work on Civitai models

I tried to use wget with the download link, but it says unauthorized access and nothing happens. It was a 5 GB SD1.5 model. How do I fix this?
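One likely cause is that Civitai rejects anonymous requests for some model downloads. The following is a minimal sketch, not a confirmed fix for this thread. It assumes you have created an API key in your Civitai account settings and that the download endpoint accepts it as a `token` query parameter. The helper names `with_token` and `download` are illustrative, and the URL in the usage comment is a placeholder.

```python
import shutil
import urllib.request
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_token(url: str, token: str) -> str:
    """Append the API key as a `token` query parameter (assumed auth scheme)."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("token", token))
    return urlunsplit(parts._replace(query=urlencode(query)))


def download(url: str, dest: str, token: str) -> None:
    """Stream the response to disk so a multi-GB model is never held in memory."""
    with urllib.request.urlopen(with_token(url, token)) as resp, open(dest, "wb") as out:
        shutil.copyfileobj(resp, out)


# Usage (placeholder values; copy the real download link from the model page):
# download("https://civitai.com/api/download/models/<MODEL_VERSION_ID>",
#          "model.safetensors", "<YOUR_API_KEY>")
```

The same idea should carry over to wget by quoting the URL with the token parameter appended, again assuming that parameter is accepted.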

MarkaRagnos

3/24/2024

0 x 4090

Hello all, I'm sure this has been asked somewhere, but I can't seem to find anything on it. I had a pod with 1 x 4090 which is now at 0. Is there a chance that I will ever be able to deploy that pod with a GPU again, or to transfer the data on that pod to a new pod? I'm new to this, so I'm not entirely sure how this system works.

Solution:

You can wait for the GPU to become available again, but that could take weeks or months, so better to create a new pod. RunPod allows you to start your pod with 0x GPU so that you can transfer your data to a new pod. You can also use network storage so that you don't have to worry about transferring your data in the case that someone else takes the GPU.

Furkan Gözükara SECourses

3/24/2024

A New Gold Tutorial For RunPod & Linux Users : How To Use Storage Network Volume In RunPod & Latest

A New Gold Tutorial For RunPod & Linux Users : How To Use Storage Network Volume In RunPod & Latest Version Of Automatic1111 With All ControlNet Models, InstantID & More : https://www.youtube.com/watch?v=8Qf4x3-DFf4

caseus

3/24/2024

Linux kernel version is 5.4.0

per accelerate: https://github.com/huggingface/accelerate/blob/85a75d4c3d0deffde2fc8b917d9b1ae1cb580eb2/src/accelerate/utils/other.py#L314C1-L331C1 ```...

Ercan

3/24/2024

How to scale pod GPU count properly?

Hello, we have some pods running with 2x4090. What is the best way to increase this to, e.g., 4x4090 while making sure that our existing allocated GPUs will not be taken by others, even though we are running on-demand?

MarkyMc

3/24/2024

distributed training

Is it possible to set up a Slurm cluster for distributed training on RunPod?

cinebam

3/24/2024

How can I bulk download all the images generated in my output folder

How can I bulk download all the images generated in my output folder (Fooocus)? I'm in the JupyterLab file browser but can only download single files. It is not possible to download folders, or to zip all the files and then download them!...
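One workaround is to pack the folder into a single archive from a notebook cell or terminal and then download that one file through the file browser. This is a minimal sketch using only the Python standard library. The helper name `zip_folder` and the paths in the usage comment are hypothetical, since the thread does not say where Fooocus writes its images on the pod.

```python
import shutil
from pathlib import Path


def zip_folder(folder: str, archive_stem: str) -> Path:
    """Pack everything under `folder` into `<archive_stem>.zip` and return its path."""
    src = Path(folder)
    if not src.is_dir():
        raise NotADirectoryError(f"{src} is not a directory")
    # make_archive appends the ".zip" suffix itself and returns the final filename.
    return Path(shutil.make_archive(archive_stem, "zip", root_dir=src))


# Usage (hypothetical paths; adjust to your pod's layout):
# zip_folder("/workspace/Fooocus/outputs", "/workspace/fooocus_outputs")
```

Writing the archive outside the folder being packed keeps the zip from trying to include itself.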

max zhang

3/23/2024

Data loss on pod

I rented a pod with GPU type A5000, but suddenly the GPU type changed to RTX 3090 and all my data (150 GB) on the original pod was removed. The pod ID remains the same, but all my progress is gone.

SkyWhal3 🐋

3/22/2024

Upload files to Network volume? Two days spent on this and can't make it happen

HOW do I get the local safetensors LLM files on my PC onto the network volume? Is the CLI the only way? I'm resorting to putting them on Hugging Face so I can access them that way via "theBloke LLM template". I'm not even sure if this will work, but I have spent hours on Google and YouTube trying to find out how. Sad.

ethan

3/21/2024

Shell asks for a password when I try to SSH to a secure cloud pod (with correct public key set)

I have a correctly formatted public key set and SSH enabled. It still asks for a password when I SSH in.

Lucy_igg

3/21/2024

runpodctl create pod for CPU only

Hello, I am trying to create a pod from the CLI with runpodctl create pod --gpuCount 0, but I get this error: Error: required flag(s) "gpuType" not set. If I add a random gpuType, I get this error: Error: There are no longer any instances available with the requested specifications. Has anyone managed to launch a CPU instance?...

Solution:

Not supported yet, but this is great feedback!

vinodt

3/21/2024

docker not found

Hello, I get an error from the container attempting to launch: /opt/nvidia/nvidia_entrypoint.sh: line 49: exec: docker: not found. It seems the system loading the container doesn't have docker? here is my template:...

snowmonkey

3/21/2024

How to mount network volume to the pod?

Hey all, I created a network volume and have a pod. How do I connect the network volume to the pod? There are so many unhelpful docs with broken links, no proper documentation about something that should be simple and straightforward....

PizzaDon111

3/21/2024

Securing Gradio App on Runpod with IP Whitelist

Hello, I'm running a Gradio app configured with share=True on RunPod. I can access it, but I'd like to improve security by whitelisting specific IP addresses. I've tried adding echo "python gradio_demo/app.py : 127.0.0.1, <MY IP>" >> /etc/hosts.allow, but this hasn't worked. Could anyone give me some advice? Thank you so much...
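For context, /etc/hosts.allow is read only by services built with TCP Wrappers support, so an entry naming a Python script is not expected to affect a Gradio server. One alternative is to do the check inside the application. This is a minimal sketch of the allowlist test itself, using the standard `ipaddress` module. The function name `is_allowed` and the example addresses are illustrative. How the app learns the real client address is an assumption left outside the sketch: with share=True, requests arrive through a tunnel, so the address the app sees may be the proxy's rather than the visitor's.

```python
import ipaddress
from typing import Iterable


def is_allowed(client_ip: str, allowlist: Iterable[str]) -> bool:
    """Return True if client_ip falls inside any allowlisted address or CIDR range."""
    try:
        addr = ipaddress.ip_address(client_ip)
    except ValueError:
        return False  # malformed address: reject rather than guess
    # A bare address such as "127.0.0.1" is treated as a single-host network.
    return any(addr in ipaddress.ip_network(entry, strict=False) for entry in allowlist)


# Hypothetical allowlist: loopback plus one documentation-range subnet.
ALLOWLIST = ["127.0.0.1", "203.0.113.0/24"]
```

Because the client address can be unreliable behind a tunnel, password protection on the app or an authenticating reverse proxy may be a more dependable option than IP filtering alone.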

Fenix

3/21/2024

Load a new network volume into a pod?

I am new to RunPod. Recently, I created a network volume and tried to load it in a GPU pod. When I used the web terminal to get in and install koboldcpp, I noticed that the root volume only has 10 GB, while another drive has 5 TB... but my network volume is just 50 GB. I am wondering whether I really loaded it into the pod correctly....

Thick Thighs

3/21/2024

The Bloke LLM Template ExLlamaV2Cache_Q4 Error

Has anyone found a way around this? I used to run pip install --upgrade exllamav2 in the terminal, but that no longer works. It worked yesterday, so I'm guessing something has changed. Judging by the issues tab on GitHub, the problem has been going on for two weeks or so: https://github.com/TheBlokeAI/dockerLLM/issues/17. Using pip install --upgrade --no-deps exllamav2 solves it for now, but that is only temporary. I'm wondering if anyone has an update...

vinodt

3/21/2024

Hello, I have a Docker image downloaded onto the pod. How do I use my custom image?

I don't think I can run Docker inside Docker, can I? I appreciate any guidance here!...

otakuhero

3/21/2024

Machine does not support exposing a TCP port

My production pod needs to expose public network ports, so we used the template to configure "Expose TCP ports" when creating the pod. This has been working fine, but recently I found a pod that does not support "expose port". Is this normal? One of the pod IDs is "bl1sbvy65vkwki"...

AndrewL

3/20/2024

Cannot Install JAX

Hello, I am currently unable to properly install JAX on both the A100 SXM 80GB and the H100 80GB SXM5 in the Secure Cloud. When I run the command pip install --upgrade "jax[cuda12_pip]" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html I get the following error (partially shown) that there are dependency conflicts with torch:...

Joshcandle

3/20/2024

GPU Name "NVIDIA RTX 4000 Ada Gene...", "GPU 0", Error: CUDA unknown error - this may be due to an

POD ID: 86gy23dlljmxvj

gpu_diagnostics.json
