RunPod

We're a community of enthusiasts, engineers, and enterprises, all sharing insights on AI, Machine Learning and GPUs!

Join

⚡｜serverless

⛅｜pods

warhol

7/16/2024

How to Estimate the Survival Time of Spot Instances?

I need some advice on estimating the survival time of RunPod Spot instances. I've noticed that sometimes my Spot instances run for several hours without interruption, while other times they get terminated within minutes. This variability makes it challenging to choose between SPOT and ON-DEMAND.

frankvp11

7/16/2024

run a function in a pod

Suppose I had a function to do some computation, and I wanted to run that inside a pod. How would I go about doing that entirely from the Python SDK?

artha77

7/15/2024

Impossible to launch a CPU Pod via API

When I try to launch a CPU pod via the API with its ID, it just crashes. With the GraphQL API it says: Pod resumed: { errors: [ {...

jumblejumble

7/15/2024

pod network down

My pod's network went down a while ago and still isn't back - k3c9sctuperq0u is the ID. Obviously I can't get logs or anything. Is there any way I can see when it might be fixed?

Jas

7/15/2024

Pod crashing due to low regular RAM?

Hey, I am running ComfyUI and my pod keeps crashing at one point in the workflow. The VRAM is only at 70% utilization, but the GPU shows 100%. Does this mean that if I found a different pod with more regular RAM, I could keep going with the workflow?...

dms

7/13/2024

Where is the stop icon?

I would like to pause my pod, but I can only terminate it?

accessor

7/13/2024

Wasted all my credits trying to figure out how to actually initialize the GPUs in the pod instances

I tried everything I can think of. I installed all the NVIDIA drivers, everything I would normally do. I could not get any GPU to show up as a device. I tried multiple preconfigured pods that were listed as ready to go, but nothing seemed to work properly.

zkreutzjanz

7/12/2024

Multi Node training with torchrun/slurm

Has anyone here ever tried multi-node training on RunPod? I am thinking of setting this up, but if people have encountered prohibitive network speeds, I do not see a reason to.

Asad Cognify

7/11/2024

How to get Public IP and set symmetrical port mapping on Pod via Python SDK

I have created a pod with python in the following way ```python runpod.api_key = os.getenv("RUNPOD_API_KEY") bot_name = 'Testing Pod Public IP 1'...

Stephen

7/11/2024

🆘 We've encountered a serious issue with the machines running in our production environment

🆘 We've encountered a serious issue with the machines running in our production environment on RunPod: the GPU utilization fluctuates wildly, sometimes even dropping to zero, which significantly slows down task execution. Who should I contact?

Gustavo Monti

7/10/2024

REST API with Ollama

Hello everyone, I installed Ollama and am trying to make some requests to its API using my pod instance and port, but I'm getting no results or a 502. I'm using this tutorial: https://docs.runpod.io/tutorials/pods/run-ollama...

Myron

7/10/2024

Can't create pod via graphQL endpoint but works manually

I'm trying to create a new pod using a given template and network volume. I can do this using the website just fine; however, when I try to duplicate the exact same settings using the podRentInterruptable GraphQL mutation, I get a "There are no longer any instances available with the request specifications. Please try again later." error. Here is the mutation: ...

ktabrizi

7/9/2024

AMD pods don't properly support GPU memory allocation

Hello! I've been trying to build a ROCm/HIP-based package to run on RunPod's ROCm-templated pods (or in a custom-built container/template), and I ran into memory issues that I believe I've tracked down to how RunPod is starting up docker containers. In particular, pinned memory allocation fails with a misleading Error: Failed to allocate pinned memory: out of memory (2). Inspecting the GPU devices shows unusual permissions, e.g.: ``` ls -l /dev/dri/*...

🆁🅰🅻🅻🅴

7/9/2024

TheBloke/goliath-120b-GPTQ with RunPod Kobold AI United

Hi! I got goliath-120b-GPTQ running with 3 A40. But the text generation speed is extremely slow. What is the best option for GPU config and settings to run this model? Thank you in advance!...

PavelDonchenko

7/9/2024

Ollama stopped using GPU

I installed Ollama on a pod as usual, on a 3090, following this tutorial: https://docs.runpod.io/tutorials/pods/run-ollama#step-4-interact-with-ollama-via-http-api. But now everything runs very slowly, and GPU Memory Used is always at zero. What could be the reason?

tien

7/9/2024

Why are all folders and files in the workspace folder lost?

While I was working on the pod, the connection to the pod was lost, and all folders and files in the workspace folder were lost, although I didn't stop the pod.

Deeps__

7/9/2024

Unable to upgrade linux kernel version from 5.4.0 to 5.15.0 - RunPod A40 GPU

I'm trying to upgrade my Linux kernel from version 5.4.0 to 5.15.0. This is required for me to train deep learning models. Here's what I tried: 1. I tried to manually upgrade it with the apt command; however, I'm still getting the same kernel version...

robert

7/7/2024

Set a Hostname

Is it possible to set a hostname for an instance (pod) so that I can use DNS entries on Cloudflare to access my pod?

최상현

7/7/2024

JupyterLab not working

I'm having a big problem connecting to RunPod. I can't run JupyterLab over the connection. When I press that button, I get a Cloudflare error page. Even if I SSH into it and run the start.sh file, the problem doesn't get better. ...

legend

7/7/2024

Can't connect to pod

I'm trying to connect to my pod, I see this message ``` -- RUNPOD.IO -- Enjoy your Pod #61iyyaw2aqv3io ^_^ ...
