RunPod

We're a community of enthusiasts, engineers, and enterprises, all sharing insights on AI, Machine Learning and GPUs!

⛅|pods

Securing a POD with an API key

Any good resources or tutorials for walking a somewhat-beginner through the process of securing an HTTPS API endpoint (port 11434) with an API key? I have a Pod running Ollama and serving API requests on port 11434; it is currently open, and anyone with the URL can use it. I haven't seen any malicious use, but I would like to secure it by requiring an API key to access the endpoint. Thanks,...
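
A minimal sketch of one way to do this, assuming a small Flask gateway in front of Ollama (the port number, header name, and environment variable below are illustrative, not RunPod specifics): expose the gateway's port instead of 11434 and have it check an X-API-Key header before forwarding requests to Ollama on localhost.

```python
# API-key gateway in front of Ollama (sketch, not production hardened).
# Assumptions: Flask and requests are installed, Ollama listens on
# 127.0.0.1:11434, and the gateway's port (8080 here) is the one exposed
# on the pod instead of 11434.
import os

import requests
from flask import Flask, Response, request

OLLAMA_URL = "http://127.0.0.1:11434"
API_KEY = os.environ.get("PROXY_API_KEY", "change-me")  # set a real secret

app = Flask(__name__)


@app.route("/", defaults={"path": ""}, methods=["GET", "POST", "DELETE"])
@app.route("/<path:path>", methods=["GET", "POST", "DELETE"])
def proxy(path):
    # Reject any request that does not carry the expected key.
    if request.headers.get("X-API-Key") != API_KEY:
        return Response("unauthorized", status=401)

    # Forward the body and query string to Ollama unchanged.
    upstream = requests.request(
        method=request.method,
        url=f"{OLLAMA_URL}/{path}",
        params=request.args,
        data=request.get_data(),
        headers={"Content-Type": request.headers.get("Content-Type", "application/json")},
        stream=True,
    )
    return Response(
        upstream.iter_content(chunk_size=8192),
        status=upstream.status_code,
        content_type=upstream.headers.get("Content-Type"),
    )


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```

Clients would then call the gateway's exposed port with the X-API-Key header set, while port 11434 itself stays closed to the outside.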

Pod eternal image fetching

yf6hnl4zdwmvem - the pod has been fetching an image for 15 minutes. US-TX-3 - the region of the problematic host. Logs are just like: ```...

mi300x are unavailable

I've been using multiple MI300Xs on a single host for a while, and all of a sudden the resources have become unavailable. Did something happen, and will more resources become available?

error pulling image: Error response from daemon

An A100 GPU in the IE region is giving these errors when pulling the PyTorch image ``` 2024-11-21T14:08:37Z error pulling image: Error response from daemon: Get "https://registry-1.docker.io/v2/": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers) 2024-11-21T14:08:39Z create container runpod/pytorch:2.2.0-py3.10-cuda12.1.1-devel-ubuntu22.04...

A100 GPU vram being used

I have a pod running, but one of my assigned GPUs has its VRAM taken up and I can't clear it, even after restarting the pod or calling torch.cuda.empty_cache().
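
A hedged diagnostic sketch, assuming torch and nvidia-smi are available in the pod (as in the runpod/pytorch images): if a GPU reports memory in use but none of the listed compute processes belong to your container, the allocation is outside your control and support would need to clear it on the host.

```python
# Check where the VRAM is going (sketch; assumes torch and nvidia-smi are
# available inside the pod, as they are in the runpod/pytorch images).
import subprocess

import torch

for i in range(torch.cuda.device_count()):
    free, total = torch.cuda.mem_get_info(i)
    print(f"GPU {i}: {free / 1e9:.1f} GB free of {total / 1e9:.1f} GB")

# Compute processes currently holding memory on the visible GPUs. If a GPU
# shows memory in use but none of these PIDs exist in this container, the
# allocation is held outside the container and only the host can release it.
print(subprocess.run(
    ["nvidia-smi", "--query-compute-apps=pid,used_memory", "--format=csv"],
    capture_output=True, text=True,
).stdout)
```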

My pod has suddenly disappeared but I'm still being charged for it

Because it's not on the webpage, I can't cancel it, but I can see that my credits are depleting. I can't seem to get any new pods either. What's going on? I was in the middle of downloading things via JupyterLab and then started getting strange connection problems. Before this point I also had an issue where I would open a pod with a spot connection and it would exit within minutes....
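
A possible workaround while support investigates, assuming the runpod Python SDK and an API key from the account settings: pods that no longer render in the console can still show up through the API, where they can be stopped or terminated. The helper names below are the SDK's documented pod functions; treat them as assumptions if your SDK version differs.

```python
# List every pod on the account via the API, then stop or terminate the one
# that no longer shows in the console. Sketch only: assumes the `runpod`
# Python SDK (pip install runpod) and its documented pod helpers.
import runpod

runpod.api_key = "YOUR_RUNPOD_API_KEY"  # from the account settings page

for pod in runpod.get_pods():
    print(pod.get("id"), pod.get("name"), pod.get("desiredStatus"))

# Once the phantom pod's ID is known:
# runpod.stop_pod("POD_ID")       # stop compute billing (storage may still bill)
# runpod.terminate_pod("POD_ID")  # remove the pod entirely
```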

Two pods disappeared from the console

Two pods disappeared from the console but are still being billed. I opened a support ticket.

Why do you limit upload speed?

Like others, I'm also getting very slow upload speeds. The maximum I can reach is exactly 1.0 MB/s, which makes me suspect there's a hard limit on the server side. Why? We work daily with big files, from models to huge embedding files. It is quite annoying to pay for a pod per hour and then have to spend a few hours just to upload data......

Automatically stop a Pod after some time while using Ollama

Hi everyone, as I wrote in the title, I would like my pod to "wake up" at 8 AM from Monday to Friday and stop when its Ollama endpoint has not been triggered for 30 minutes. Is something like this possible?...
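
Not built in as far as I know, but a hedged sketch of the idle-stop half, assuming the runpod Python SDK and Ollama's /api/ps endpoint (which lists the models currently loaded): stop the pod once nothing has been loaded for 30 minutes. The weekday 8 AM wake-up would be a scheduled call to the SDK's resume_pod helper from an always-on machine.

```python
# Idle watchdog for an Ollama pod (sketch). Assumptions: the `runpod` Python
# SDK, and that GET /api/ps on Ollama returns the models currently loaded.
# Run it inside the pod; the weekday 8AM resume would live elsewhere, e.g. a
# cron entry calling runpod.resume_pod(POD_ID, gpu_count=1).
import time

import requests
import runpod

runpod.api_key = "YOUR_RUNPOD_API_KEY"
POD_ID = "YOUR_POD_ID"
OLLAMA = "http://127.0.0.1:11434"
IDLE_LIMIT = 30 * 60  # seconds without a loaded model before stopping

last_active = time.time()
while True:
    try:
        loaded = requests.get(f"{OLLAMA}/api/ps", timeout=5).json().get("models", [])
        if loaded:
            last_active = time.time()
    except requests.RequestException:
        pass  # endpoint unreachable; treat as idle

    if time.time() - last_active > IDLE_LIMIT:
        runpod.stop_pod(POD_ID)  # the pod (and this script) stop here
        break

    time.sleep(60)
```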

Can't view my ComfyUI workflow even though I exposed ports

I exposed some ports but I get 'Not ready', and when I try to access them I get a Bad Gateway error. The only port that opens is 8888 (Jupyter). I'm using the RunPod PyTorch template on the pod...

Trouble comparing pods

Is there any way to compare the performance of different pods in terms of GB of RAM, GB of VRAM, and vCPUs?
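
Not directly in the console as far as I can tell, but a hedged sketch using the runpod Python SDK: GPU types carry their VRAM size in the API, while system RAM and vCPU counts depend on the host configuration chosen at deploy time, so the API comparison mainly covers VRAM. Field names below are assumptions to verify against a raw record.

```python
# Rough comparison of GPU types via the API (sketch; assumes the `runpod`
# Python SDK, and the field names are guesses to verify against a raw record).
import runpod

runpod.api_key = "YOUR_RUNPOD_API_KEY"

gpus = runpod.get_gpus()
print(gpus[0])  # inspect the raw fields once before trusting the loop below

for gpu in gpus:
    name = gpu.get("displayName", gpu.get("id"))
    print(f'{name:<30} VRAM: {gpu.get("memoryInGb", "?")} GB')
```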

Starting a pod with runpod/pytorch:2.4.0-py3.11-cuda12.4.1-devel-ubuntu22.04 shows CUDA version 12.6

I am confused about what determines the CUDA version of a pod I start. I would expect that when I start a Docker image with a CUDA version in its name, that CUDA version is bundled into the image and is the version I see when the pod starts, but this is not the case. How can I start a pod with a predictable CUDA version?
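
A likely source of the mismatch (hedged, since the pod's actual output isn't shown here): nvidia-smi reports the maximum CUDA version supported by the host's driver, while nvcc and PyTorch report the toolkit actually bundled in the image, and those can legitimately differ. A quick way to print all three inside the pod:

```python
# Print the three "CUDA versions" that can disagree on a pod (sketch):
# the toolkit baked into the image (nvcc), the version PyTorch was built
# against, and the maximum version supported by the host driver (nvidia-smi).
import subprocess

import torch

print("torch built with CUDA:", torch.version.cuda)
print(subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout)

smi = subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout
print(next(line for line in smi.splitlines() if "CUDA Version" in line))
```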

Broken pod

RunPod Pytorch 2.4.0
ID: qq5a8cbw7q0jms
2024-11-18T16:42:49Z create container runpod/pytorch:2.4.0-py3.11-cuda12.4.1-devel-ubuntu22.04
2024-11-18T16:42:49Z image pull: runpod/pytorch:2.4.0-py3.11-cuda12.4.1-devel-ubuntu22.04: pending
2024-11-18T16:42:57Z error creating container: container: create: container create: Error response from daemon: No such image: runpod/pytorch:2.4.0-py3.11-cuda12.4.1-devel-ubuntu22.04...

No GPU available

Hi, I have been without GPUs available for my pod with 3x 4090 since yesterday. This has happened to me before, but after a few minutes I was able to boot with a GPU. I use a virtual disk; I run SD on it, and the other machine, which uses Flux, I can still boot. Any solution, or do you know why this happens?...

I am not using the GPU, but someone else is occupying it. What is the solution?

ID: xx5vmcdbbkab3m, A100 x6, 1-week service. When I first initialized it, it showed that someone else was occupying my GPU. How should I handle this?...

"There are no longer any instances available with the requested specifications."

I've been getting this error a lot lately when trying to deploy a pod, even when the GPU is listed as available, and it persists after multiple page refreshes. A feature request: help with finding an available instance, either by including in the error which requested resource doesn't match the availability list (e.g. drive storage), or by allowing filtering on more parameters so I can see what's actually available and pick a machine that meets my needs.

Stuck on Pod Initialization

Hi everyone, I’m new to RunPod and facing an issue while setting up a GPU pod. Every time I try to launch one, it gets stuck during initialization and shows “Waiting for logs,” but no logs are generated. I’ve tried using different servers, CPUs, and GPUs, but the problem persists across all scenarios. I would greatly appreciate any guidance or suggestions to resolve this issue. Thank you!...