RounMicLess Posts - Answer Overflow

RounMicLess


RunPod

•Created by RounMicLess on 1/13/2024 in #⛅｜pods-clusters

Environment variable not accessible from true SSH?

I can see it when using the fake SSH but not when using the true SSH. I am not sure how to set this up.

8 replies

RunPod

•Created by RounMicLess on 1/4/2024 in #⛅｜pods-clusters

ssh2 with Node doesn't work correctly?

Hello, I am trying to connect to the GPU cloud using ssh2 via the [email protected] with an SSH key. It works using ssh.shell but not ssh.exec (it asks for a PTY, and when one is set, it doesn't send any command). I don't know what to do: I am facing this problem with RunPod, yet I can connect from my Linux terminal when I don't go through my script. Any idea?

6 replies

RunPod

•Created by RounMicLess on 12/28/2023 in #⛅｜pods-clusters

CUDA out of memory

Hello, I am using the RunPod PyTorch 2.1 template. I am trying to train a small model (phi), about 1.5 GB, and whatever I do, I keep getting a CUDA out of memory error caused by a process whose origin I can't identify. I am using a 3090 GPU, so I don't understand where the problem is.

8 replies

RunPod

•Created by RounMicLess on 12/28/2023 in #⚡｜serverless

General advice on pricing and the use of serverless

Hello, I am not sure exactly how it works, so I have a few questions. I want to use RunPod's serverless service. If I understood correctly, a worker waits for an API call and I pay for the time it needs to respond. On the first call (when the worker wakes up), do I pay more because of the delay time needed to set up the Docker image? And once that setup is done, does it stay done until the worker goes idle? If so, is the strategy to keep one active worker? Also, how should I handle the fact that I am using multiple big models? Is there a difference between putting the model in the Docker image and pulling it in the script with a helper function? Is it better to use a network volume? I have seen that there is a lag when reading data from a network volume. Finally, since a network volume can lose its GPUs, is there a fast way to transfer models from a network volume in one region to another? Thanks for your help.

3 replies
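
The billing model described in the question above can be sketched as simple arithmetic. This is only an illustration of the poster's reading (billed time = cold-start delay + execution time), not RunPod's confirmed pricing rules. The function name, the per-second price, and the durations are all hypothetical values chosen for the example.

```python
def estimate_request_cost(execution_s: float, cold_start_s: float, price_per_s: float) -> float:
    """Estimate the cost of one serverless request.

    Assumption (from the question, not from official docs): the worker is
    billed for the cold-start delay plus the execution time.
    """
    if min(execution_s, cold_start_s, price_per_s) < 0:
        raise ValueError("durations and price must be non-negative")
    return (cold_start_s + execution_s) * price_per_s


# Hypothetical numbers: 2 s of execution, 6 s cold start, 0.5 currency units per second.
cold_request = estimate_request_cost(execution_s=2.0, cold_start_s=6.0, price_per_s=0.5)
warm_request = estimate_request_cost(execution_s=2.0, cold_start_s=0.0, price_per_s=0.5)
print(f"cold: {cold_request}, warm: {warm_request}")
```

Under this assumption, the gap between the cold and warm request is the amount an always-active worker would save per call, which has to be weighed against the cost of keeping that worker running while idle.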
