RunPod

We're a community of enthusiasts, engineers, and enterprises, all sharing insights on AI, Machine Learning and GPUs!

Unable to upgrade Linux kernel version from 5.4.0 to 5.15.0 - RunPod A40 GPU

I'm trying to upgrade my Linux kernel from version 5.4.0 to 5.15.0. This is required for me to train deep learning models. Here's what I tried: 1. I tried to upgrade it manually with the apt command. However, I'm still getting the same kernel version...
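A quick way to see which kernel a pod is really running is to ask the OS directly. The sketch below assumes the pod is a container; containers share the host's kernel, so installing kernel packages with apt inside one would not change the reported version. The helper names (`parse_kernel`, `meets_minimum`, `running_kernel_ok`) are hypothetical, and the (5, 15) target is taken from the question.

```python
import platform
import re


def parse_kernel(release: str) -> tuple[int, int]:
    """Extract (major, minor) from a release string such as '5.4.0-150-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(release: str, minimum: tuple[int, int] = (5, 15)) -> bool:
    """Return True when the release is at least the required (major, minor)."""
    return parse_kernel(release) >= minimum


def running_kernel_ok(minimum: tuple[int, int] = (5, 15)) -> bool:
    """Check the kernel the process is running on (same value as `uname -r` on Linux)."""
    return meets_minimum(platform.release(), minimum)
```

If `running_kernel_ok()` still returns False after an apt upgrade, the kernel is most likely set by the host machine rather than by anything inside the pod.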

Set a Hostname

Is it possible to set a hostname for an instance (pod) so that I can use DNS entries on Cloudflare to access my pod?

JupyterLab not working

I'm having a big problem connecting to RunPod. I can't run JupyterLab on the connection. When I press that button, I get a Cloudflare error page. Even if I SSH into it and run the start.sh file, the problem doesn't get better. ...

Can't connect to pod

I'm trying to connect to my pod, but I see this message: ``` -- RUNPOD.IO -- Enjoy your Pod #61iyyaw2aqv3io ^_^ ...

Download bar does not show in some browsers

Why does the download bar not show in some browsers? I have no idea how long the download will take.

Problems building Docker image using /workspace as root

Hello. I am trying to build my Docker containers on RunPod. I am using a Network Storage on the volume /workspace. I installed Docker and set the root to /workspace/docker/ because it will take a lot of space. It was taking a long time, and we noticed the storage-driver was "vfs". So we tried to change it to "fuse-overlayfs", which seemed to work, but when pulling Redis files (one of my containers is for Redis), it gave this error: ``` failed to register layer: using mount program fuse-overlayfs: fuse: device not found, try 'modprobe fuse' first...
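The setup described in this post corresponds to two settings in Docker's daemon configuration file. Here is a minimal sketch of that file, assuming the standard `data-root` and `storage-driver` keys of `daemon.json` and the /workspace/docker/ path from the post. The `render_daemon_config` helper is a hypothetical name. It only renders the JSON and does not address the fuse error.

```python
import json


def render_daemon_config(data_root: str, storage_driver: str) -> str:
    """Render a daemon.json body with a custom data root and storage driver."""
    config = {
        "data-root": data_root,            # where images and layers are stored
        "storage-driver": storage_driver,  # e.g. "vfs" or "fuse-overlayfs"
    }
    return json.dumps(config, indent=2)


print(render_daemon_config("/workspace/docker/", "fuse-overlayfs"))
```

On Linux the file usually lives at /etc/docker/daemon.json, and the Docker daemon has to be restarted before a change takes effect.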

Fooocus loads the SDXL model too slowly

The time it takes to load the model has gone from 8 seconds to 30 seconds. Does this have anything to do with my using a Network Volume?

Fooocus generates images too slowly

The image inference speed of Fooocus dropped from 6 it/s to 2 it/s on the same RTX 4090. What did RunPod do?
Solution:
Replacing the pod solves the problem.

Can't run Fooocus on an open port

Hey, has anyone encountered the same problem?
Solution:
Listen on 0.0.0.0
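To illustrate what listening on 0.0.0.0 means, here is a minimal sketch using Python's standard `http.server` rather than Fooocus itself. A server bound to 127.0.0.1 only accepts connections from inside the pod, while one bound to 0.0.0.0 accepts them on every network interface, which an exposed port needs. `PingHandler`, `start_server`, and `ping` are hypothetical names for this example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class PingHandler(BaseHTTPRequestHandler):
    """Answers every GET request with a plain-text 'ok'."""

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def start_server(host: str = "0.0.0.0", port: int = 0) -> ThreadingHTTPServer:
    """Bind to `host` and serve in a background thread; port 0 picks a free port."""
    server = ThreadingHTTPServer((host, port), PingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def ping(port: int, host: str = "127.0.0.1") -> bytes:
    """Send one GET request to the server and return the response body."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", "/")
        return conn.getresponse().read()
    finally:
        conn.close()
```

Calling `start_server("0.0.0.0", 8000)` would serve on port 8000 on all interfaces. The port number here is only an example and would have to match a port exposed on the pod.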

NansException: A tensor with all NaNs was produced in Unet.

Hi, after I uploaded a model from Hugging Face, my SD can't generate anymore. It says "NansException: A tensor with all NaNs was produced in Unet. This could be either because there's not enough precision to represent the picture, or because your video card does not support half type. Try setting the "Upcast cross attention layer to float32" option in Settings > Stable Diffusion or using the --no-half commandline argument to fix this. Use --disable-nan-check commandline argument to disable this chec...

How to change disk size when deploying pods?

I found in the documentation that the default size is 20 GB. When I pip install something, that's too small for my project. Where can I change it?
Solution:
Before deploying the pod, click Edit Template.

Pod ssh configuration docs out of date?

I've been having issues setting up ssh for new pods using the current documentation. After some investigation, it looks like the ssh pub key environment variable has changed names from RUNPOD_SSH_PUBLIC_KEY to PUBLIC_KEY. I ran the following script as a startup command to discover: ```bash...
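A check along these lines can be sketched in Python. It assumes only the two variable names mentioned in the post and reports whichever one is set. `CANDIDATE_VARS` and `find_ssh_key_var` are hypothetical names, and this is not the poster's original script.

```python
import os

# Variable names taken from the post: the newer name first, then the documented one.
CANDIDATE_VARS = ("PUBLIC_KEY", "RUNPOD_SSH_PUBLIC_KEY")


def find_ssh_key_var(env=None):
    """Return (name, value) for the first non-empty candidate variable, or None."""
    env = os.environ if env is None else env
    for name in CANDIDATE_VARS:
        value = env.get(name, "").strip()
        if value:
            return name, value
    return None


found = find_ssh_key_var()
print("SSH key variable:", found[0] if found else "none set")
```

Passing a plain dict as `env` makes it possible to try the function without touching the real environment.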

How can I run a pod as an API endpoint? (not serverless)

I want to run a model as an API endpoint.
Solution:
Use Flask, FastAPI, etc., but a pod can't scale, so running an API on pods is not recommended. It's much better to use serverless.
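For a sense of what such an endpoint involves, here is a minimal sketch that swaps Flask/FastAPI for Python's standard `http.server` so it runs without extra packages. The `/predict` route, the `predict` function (a stand-in for a real model call), and `make_server` are all hypothetical, and the single-process design has the scaling limit the answer describes.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def predict(payload: dict) -> dict:
    """Hypothetical stand-in for a real model call."""
    text = str(payload.get("input", ""))
    return {"output": text.upper(), "length": len(text)}


class InferenceHandler(BaseHTTPRequestHandler):
    """Accepts POST /predict with a JSON object and returns JSON."""

    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404, "unknown route")
            return
        try:
            length = int(self.headers.get("Content-Length", 0))
            payload = json.loads(self.rfile.read(length) or b"{}")
        except ValueError:  # bad length, bad encoding, or invalid JSON
            self.send_error(400, "body must be JSON")
            return
        if not isinstance(payload, dict):
            self.send_error(400, "body must be a JSON object")
            return
        body = json.dumps(predict(payload)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(host: str = "0.0.0.0", port: int = 8000) -> ThreadingHTTPServer:
    """Create the server without starting it; call serve_forever() to run."""
    return ThreadingHTTPServer((host, port), InferenceHandler)
```

Running `make_server().serve_forever()` would start the endpoint on port 8000, which would also need to be exposed on the pod.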

Can't start pod using the CLI

I'm trying to start a pod using the CLI by doing:
runpodctl start pod <id>
but I'm getting the error: Error: Cannot resume a spot pod as an on demand pod....
Solution:
I found the problem: the GraphQL query in the runpodctl source code is missing the gpuCount param. I ran it manually with the extra param and it worked.

ComfyUI not using GPU, how to fix this?

We are running an instance of ComfyUI on a 4090, and it's using the CPU instead of the GPU. How can this be fixed?
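A first diagnostic step could look like the sketch below. It assumes an NVIDIA driver that ships `nvidia-smi`, plus PyTorch, which ComfyUI is built on. It reports whether the GPU is visible to the system and to PyTorch, and it does not change any ComfyUI setting. `gpu_report` is a hypothetical helper name.

```python
import shutil
import subprocess


def gpu_report() -> dict:
    """Collect basic GPU visibility facts; None means 'could not be determined'."""
    report = {"nvidia_smi": None, "torch_cuda": None}

    # `nvidia-smi -L` lists the GPUs the driver can see.
    smi = shutil.which("nvidia-smi")
    if smi is not None:
        try:
            result = subprocess.run(
                [smi, "-L"], capture_output=True, text=True, timeout=30
            )
            if result.returncode == 0:
                report["nvidia_smi"] = result.stdout.strip()
        except (OSError, subprocess.SubprocessError):
            pass

    # PyTorch may be missing or installed as a CPU-only build.
    try:
        import torch

        report["torch_cuda"] = bool(torch.cuda.is_available())
    except ImportError:
        pass

    return report


print(gpu_report())
```

If `nvidia_smi` lists the card but `torch_cuda` is False, a CPU-only PyTorch build in the ComfyUI environment is one possible cause to check.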

What are the rules that RunPod follows to cache DockerHub images?

I did some more testing today and it seems like RunPod sometimes holds a copy of my docker image, which leads to the fast load I'm looking for. Is there any predictability to this? Like, is it cached for a specified amount of time?

Can I cache my DockerHub image in RunPod storage?

My image is 5.8 GB in DockerHub. This includes a 2.5 GB folder of assets I need for my workload, and the rest is just packages. I timed it, and it took about 2 minutes to download into my RunPod container. Is there a way to cache my Docker image so that I can reduce this download time?
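The numbers in the question give a rough idea of how the two minutes split up. This back-of-the-envelope sketch assumes the pull time scales linearly with size and treats 1 GB as 1000 MB. Layer extraction and registry overhead are ignored, so the result is only an estimate. Both helper names are hypothetical.

```python
def throughput_mb_per_s(size_gb: float, seconds: float) -> float:
    """Average transfer rate in MB/s for a download of size_gb taking `seconds`."""
    return size_gb * 1000 / seconds


def seconds_for(size_gb: float, mb_per_s: float) -> float:
    """Time needed to transfer size_gb at a constant rate."""
    return size_gb * 1000 / mb_per_s


rate = throughput_mb_per_s(5.8, 120)   # whole image in about 2 minutes
assets_time = seconds_for(2.5, rate)   # share attributable to the assets folder
print(f"{rate:.1f} MB/s, assets ~{assets_time:.0f} s, rest ~{120 - assets_time:.0f} s")
```

Under these assumptions, a bit under half of the download time is spent on the assets folder.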

Is there a way to add storage volume to a pod after creation?

Hi, I have worked on this pod for a while, but I did not attach a volume to the pod to increase the storage space. Is there a way to add it now?
Solution:
No, and there is no way to remove one from a pod either.

How to remove network volume from pod

Hey, how can I remove the network volume from a pod? I can't delete it without removing it, but I don't see any options in the UI. How do I do it?
Solution:
You can't disconnect the network volume from a pod.