RunPod

We're a community of enthusiasts, engineers, and enterprises, all sharing insights on AI, Machine Learning and GPUs!

Running on local URL but can't access from outside

Hi guys, I have this service running in my pod, but I can't seem to access it the intended way. The service reports that it is ready: "Running on local URL: http://127.0.0.1:7860". I checked using curl from the pod on localhost and I get the interface....
Solution:
You don't need to edit the code.
export GRADIO_SERVER_NAME="0.0.0.0"
...
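
A minimal Python sketch of why the variable matters, for anyone hitting the same issue. It assumes Gradio falls back to `GRADIO_SERVER_NAME` when `launch()` is called without an explicit `server_name`. The helpers `resolve_bind_host` and `reachable_from_outside` are hypothetical names written for illustration and are not part of Gradio.

```python
import os


def resolve_bind_host(env=None):
    """Return the address the app should bind to.

    Mirrors the idea behind GRADIO_SERVER_NAME: fall back to loopback
    (reachable only from inside the pod) unless the variable is set.
    """
    env = os.environ if env is None else env
    return env.get("GRADIO_SERVER_NAME", "127.0.0.1")


def reachable_from_outside(host):
    """Loopback addresses only accept connections from the same machine."""
    return host not in ("127.0.0.1", "localhost", "::1")


if __name__ == "__main__":
    host = resolve_bind_host()
    print(f"binding to {host}; reachable from outside: {reachable_from_outside(host)}")
    # With Gradio installed, the in-code equivalent would be:
    #   demo.launch(server_name="0.0.0.0", server_port=7860)
```

Binding to 0.0.0.0 only makes the service listen on all interfaces; port 7860 presumably also has to be exposed on the pod for the proxy URL to reach it.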

How can I use ollama Docker image?

Hello. I've been trying to serve ollama on RunPod using the ollama Docker image (https://hub.docker.com/r/ollama/ollama) but haven't found a way to run it. I tried using the docker run ... command in the Container Start Command input, but I encountered an error: unknown command "docker" for "ollama". Does anyone know the correct method to use ollama on RunPod?...

ComfyUI won't run because of missing NVIDIA drivers

Yesterday my ComfyUI worked well; now when I try to launch it I get the error: RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx...

RunPod Library + API

So I am attempting to create an API to either start/stop an existing pod or create a pod and then start/stop, I currently have something somewhat working:
```python
@app.route("/start_model", methods=["POST"])
def start_model():...
```
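
One way to keep the start/stop logic testable is to separate it from the web route. The sketch below is a guess at the shape, not the poster's code: `set_pod_state` and `FakeClient` are hypothetical names. It assumes the `runpod` Python package exposes `resume_pod(pod_id, gpu_count)` and `stop_pod(pod_id)`; check those signatures against the installed version before relying on them.

```python
def set_pod_state(client, pod_id, action, gpu_count=1):
    """Start or stop an existing pod through an injected client object.

    `client` is anything with resume_pod/stop_pod methods. The assumption
    is that the `runpod` module itself can be passed in after setting
    `runpod.api_key`.
    """
    if action == "start":
        return client.resume_pod(pod_id, gpu_count)
    if action == "stop":
        return client.stop_pod(pod_id)
    raise ValueError(f"unsupported action: {action!r}")


class FakeClient:
    """Stand-in used here so the sketch runs without network access."""

    def __init__(self):
        self.calls = []

    def resume_pod(self, pod_id, gpu_count):
        self.calls.append(("resume", pod_id, gpu_count))
        return {"id": pod_id, "desiredStatus": "RUNNING"}

    def stop_pod(self, pod_id):
        self.calls.append(("stop", pod_id))
        return {"id": pod_id, "desiredStatus": "EXITED"}


if __name__ == "__main__":
    client = FakeClient()
    print(set_pod_state(client, "hypothetical-pod-id", "start"))
    print(set_pod_state(client, "hypothetical-pod-id", "stop"))
```

A Flask route such as `start_model` could then call `set_pod_state` and return the result as JSON, with the fake client swapped in for tests.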

Cuda Driver

I am getting the following error. I was setting up a new pod, since the one I set up last night was taken when I paused it. Now when I try to start the container I am getting the error displayed in the image.
Solution:
Thank you sir! That resolved it.

How do I start a pod with a private docker image (template) using GraphQL?

I am trying to start pods with the GraphQL API, running my private image. In the docs it says this: "If your container image is private, you can also specify Docker login credentials with a containerRegistryAuthId argument, which takes the ID (not the name) of the container registry credentials you saved in your RunPod user settings as a string." But I simply cannot find this ID anywhere in my settings. How do I do it?...
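
For context, a rough Python sketch of where `containerRegistryAuthId` would go once the ID is known. It does not answer where to find the ID. The mutation name `podFindAndDeployOnDemand` and the other input fields are written from memory of the RunPod GraphQL docs and should be verified there. `build_deploy_mutation` is a hypothetical helper, and every value in the usage section is a placeholder.

```python
import json


def build_deploy_mutation(name, image_name, gpu_type_id, registry_auth_id, gpu_count=1):
    """Build a GraphQL mutation string for deploying a pod from a private image.

    Field names other than containerRegistryAuthId are assumptions taken
    from the public docs; adjust them to match the current schema.
    """
    fields = {
        "name": name,
        "imageName": image_name,
        "gpuTypeId": gpu_type_id,
        "gpuCount": gpu_count,
        "containerRegistryAuthId": registry_auth_id,
    }
    # json.dumps yields double-quoted, escaped strings and bare integers,
    # which are also valid GraphQL literals for these simple values.
    args = ", ".join(f"{key}: {json.dumps(value)}" for key, value in fields.items())
    return "mutation { podFindAndDeployOnDemand(input: { " + args + " }) { id imageName } }"


def build_request_body(query):
    """Wrap the query in the JSON envelope a GraphQL HTTP endpoint expects."""
    return json.dumps({"query": query})


if __name__ == "__main__":
    query = build_deploy_mutation(
        name="my-private-pod",                             # placeholder
        image_name="registry.example.com/me/image:latest",  # placeholder
        gpu_type_id="NVIDIA RTX A5000",                     # placeholder
        registry_auth_id="HYPOTHETICAL_AUTH_ID",            # placeholder
    )
    print(build_request_body(query))
```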

I just re-initialized a suspended pod and now I don't have gpu drivers

RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx
This is a simple PyTorch pod with only ComfyUI installed. It was fine this morning and now when I run python3 main.py --listen 0.0.0.0 --port 8188...
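
A quick way to tell whether the container can see a GPU at all, independent of PyTorch, is to ask `nvidia-smi`. This is only a diagnostic sketch and does not explain why the driver went missing. `nvidia_driver_visible` is a hypothetical helper, and it assumes `nvidia-smi` is on the `PATH` whenever the driver is properly mounted into the container.

```python
import shutil
import subprocess


def nvidia_driver_visible(timeout=10):
    """Return True if `nvidia-smi -L` runs and lists at least one GPU."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        # The tool is normally injected together with the driver, so its
        # absence suggests the container was started without GPU access.
        return False
    try:
        result = subprocess.run(
            [exe, "-L"], capture_output=True, text=True, timeout=timeout
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0 and "GPU" in result.stdout


if __name__ == "__main__":
    if nvidia_driver_visible():
        print("GPU visible to the container; look at the PyTorch install next.")
    else:
        print("No GPU visible; the pod may have resumed without a GPU attached.")
```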

Assistance Requested for Pod Initialization Issue

Hello @SupportTeam and RunPod community, I'm reaching out regarding an issue with starting pods that seems to be a common problem for several users here. Like others, my pod is experiencing prolonged "image pull pending" and initiation loops, making it impossible to use the service effectively. I reached out to chat support and sadly the chat was abandoned by the support member. ...
Solution:
Official Ubuntu template or Docker image? You should use the RunPod H100 PyTorch image; it's the only one that has Torch custom-built for H100. Others will not work properly.

Overcharged for Pod.

I had a pod running at a fee of $0.34 per hour. I stopped the pod but was still charged the full fee instead of the reduced rate and it consumed all my balance. How can this be resolved and have my balance refunded?...

Missing port buttons and Unable to “start web terminal” on Ultimate Template

Hello, I've been trying to install @ashleyk Stable Diffusion Kohya_ss ComfyUI Ultimate 3.12.1 all afternoon with no success. Typically, I can be up and running on a new pod in ~10-20 minutes; today, it was taking much longer. The log would often hang for ~10-15 minutes during each step to Sync (kohya, comfyui, Automatic) Venv. Once the installation was complete, I clicked the connect button to start Auto1111, ComfyUI, Jupyter, etc., but the UI service buttons were missing. I tried installing this container on an A40, A6000, and A5000. I burned through $2 just on unsuccessful installation time. Today is the first time anything like this has occurred. I genuinely appreciate your work @ashleyk. I'd love to learn how to create or tweak the current container....

Any recent firewall changes?

Were there any recent firewall changes in the last few days? Seeing urllib3.exceptions.ProtocolError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response')) when interacting with HF hub. Replicated by other people as well, on different machines.

Becoming a host MI250

Hello, do you onboard hosts with AMD machines?...

GPU Pod was down all night

Hi, we just woke up to a production issue where all our APIs were down because our pods shut down and apparently restarted for some reason, and when we looked we saw a "maintenance scheduled" notice for next week. Can someone help us understand what the issue was, and why it went down by itself? Pod ID: clxu7lem3ph9xu...

H100 cluster group compilation error

I use RunPod Desktop on Secure H100 (both SXM5 and PCIe). CUDA Driver Version / Runtime Version: 12.1 / 11.8. CUDA Capability Major/Minor version number: 9.0. I am trying to use cluster groups but am having trouble compiling files....

Stuck in creating container

No matter how I set up the pod, it would get stuck at starting the container, dozens of times (around 20). Then I saw that it was choosing the "location", which I had left as "any" because I don't really care. Turns out it was always redirecting to one specific place and not starting. I tried manually choosing the location, and it turned out that no matter which of the available options I chose, only TX got as far as creating the container, and then it got stuck in "image pull pending". At least on TX it did not pretend to run and consume my credit without actually running. Still, it's not running at all....

Custom template bash: /start.sh: No such file or directory

I'm trying to run a custom Docker template, nvidia/cuda:12.0.1-devel-ubuntu20.04, with the container start command bash -c /start.sh, but it doesn't work: 2024-02-03T21:11:28.730786952Z bash: /start.sh: No such file or directory

Why are my model files only 135 bytes after a clone repository on Pytorch template?

Each safetensor file is only 135 bytes, despite me cloning the repository. All the other smaller files downloaded correctly though. I got around this by just uploading each safetensor file manually from my desktop (took like 30 mins tho), but I was just curious for the future. This is the prompt command I used:...
Solution:
Also: git lfs install...
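
Files that small after a clone are most likely Git LFS pointer files: short text stubs that stand in for the real weights until Git LFS fetches them. A small Python sketch for spotting them, where `is_lfs_pointer` and `find_lfs_pointers` are hypothetical helper names:

```python
from pathlib import Path

# Every Git LFS pointer file begins with this line (per the LFS pointer spec).
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file looks like a Git LFS pointer, not real data."""
    path = Path(path)
    # Pointer files are tiny text stubs; anything larger is real content.
    if path.stat().st_size > 1024:
        return False
    with path.open("rb") as fh:
        return fh.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX


def find_lfs_pointers(root, pattern="*.safetensors"):
    """List files under `root` matching `pattern` that are still pointers."""
    return sorted(
        p for p in Path(root).rglob(pattern) if p.is_file() and is_lfs_pointer(p)
    )


if __name__ == "__main__":
    for stub in find_lfs_pointers("."):
        print(f"still a pointer: {stub}")
```

If this lists any files, running `git lfs install` followed by `git lfs pull` inside the repository should replace the stubs with the real weights.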

I cannot connect to server using Web Terminal. It says 'Connection Closed'

When I start a new server the web terminal works. However, after the first reboot it no longer works and says 'Connection Closed'.

Proxy URL related info

I am getting the error "CNAME Cross-User Banned" when I try to register the RunPod proxy URL in a Cloudflare CNAME DNS record...
Solution:
Use cloudflared tunnels and your own domain.