RunPod•5mo ago
legend

Can't connect to pod

I'm trying to connect to my pod, I see this message
-- RUNPOD.IO --
Enjoy your Pod #61iyyaw2aqv3io ^_^

Error response from daemon: Container 3be7cf829496b24ddcf4e98eb16e2a1a24e629e05c1ceb388fa6fa4d4555b239 is not running
Connection to 100.65.21.86 closed.
Connection to ssh.runpod.io closed.
In my logs, I see a bunch of these
2024-07-07T11:50:09Z start container
2024-07-07T11:50:25Z start container
2024-07-07T11:50:41Z start container
2024-07-07T11:50:57Z start container
2024-07-07T11:51:13Z start container
2024-07-07T11:51:29Z start container
2024-07-07T11:51:45Z start container
2024-07-07T11:52:01Z start container
2024-07-07T11:52:17Z start container
2024-07-07T11:52:33Z start container
So I assume there's some issue with starting the container? I don't know what to do here.
29 Replies
legend
legendOP•5mo ago
Fwiw, in case this is an XY question: I want a RunPod pod that just has the latest CUDA image.
digigoblin
digigoblin•5mo ago
You need to keep your container alive with sleep infinity, otherwise it exits right away and gets into a restart loop.
legend
legendOP•5mo ago
Where do I set that?
legend
legendOP•5mo ago
Would it be here?
digigoblin
digigoblin•5mo ago
Yes, add this:
bash -c 'sleep infinity'
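A minimal sketch of why this helps, assuming GNU coreutils `timeout` is available: a container stops as soon as its main command exits, so a command that returns immediately leads to the repeated "start container" log lines, while `sleep infinity` never returns on its own. The variable names below are illustrative only.

```shell
# A command that finishes right away. If this were the container's main
# process, the container would stop as soon as it returned.
bash -c 'true'
short_status=$?
echo "short-lived command exited with status: $short_status"

# sleep infinity never returns by itself. timeout is used here only so the
# demo terminates; it reports status 124 when it had to stop the command.
timeout 1 bash -c 'sleep infinity'
long_status=$?
echo "sleep infinity was still running after 1s (timeout status: $long_status)"
```

The first command exits with status 0 immediately; the second is still running when `timeout` stops it, which is the behavior the pod needs from its start command.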
legend
legendOP•5mo ago
Thank you, I'll try it! Is this the right way to go about creating a development environment? I'm currently away from my main PC, so I don't have access to CUDA on my laptop. Aha, it works!
digigoblin
digigoblin•5mo ago
I prefer creating custom Docker images. Otherwise the Docker override command gets very messy and ugly to work with.
legend
legendOP•5mo ago
I want to see if I can get everything working with this, otherwise I'll have to go that route 😅
digigoblin
digigoblin•5mo ago
Any particular reason you want Rocky Linux and not Ubuntu Linux?
legend
legendOP•5mo ago
I just copied from nvidia's website
digigoblin
digigoblin•5mo ago
By the way, I don't recommend using CUDA 12.5, because not all machines support CUDA 12.5. They are all on 12.1 or higher, so it's best to use 12.1 as the CUDA version.
legend
legendOP•5mo ago
This setup seems to work with it. Unfortunately I can't use 12.1. Iirc I need to use 12.3. But maybe it works with an older version, I'll have to mess around and find out.
digigoblin
digigoblin•5mo ago
You were probably lucky and got a machine that has CUDA 12.5, but that's very new, so the chances of getting a host machine that supports 12.5 are very low.
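The rule of thumb here (the image's CUDA toolkit must not be newer than what the host driver supports) can be sketched as a version comparison. This is a hedged illustration: `cuda_image_supported` is a hypothetical helper, the version numbers are hard-coded rather than read from a real machine, and `sort -V` assumes GNU coreutils. On a real pod, `nvidia-smi` shows the CUDA version the driver supports in its header, and `nvcc --version` shows the toolkit version in the image.

```shell
# Hypothetical helper: succeeds if the image's toolkit version ($2) is not
# newer than the highest CUDA version the host driver supports ($1).
cuda_image_supported() {
  host="$1"
  image="$2"
  # sort -V orders version strings numerically; the first line is the lower one.
  lowest=$(printf '%s\n%s\n' "$host" "$image" | sort -V | head -n 1)
  [ "$lowest" = "$image" ]
}

# Example with assumed values: a host that supports 12.1 and a 12.5 image.
if cuda_image_supported "12.1" "12.5"; then
  echo "image should run on this host"
else
  echo "image is too new for this host"
fi
```

With the assumed values above, the check fails, which matches the advice to pick an image no newer than the lowest CUDA version in the fleet.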
legend
legendOP•5mo ago
Ahh, got it. Well, let's see if I can get stuff working here.
digigoblin
digigoblin•5mo ago
You also won't really know whether it's working or not until you try to use torch.
legend
legendOP•5mo ago
Then I'll try Ubuntu first, then downgrading CUDA. I'm not going to be using torch.
digigoblin
digigoblin•5mo ago
If you get an error saying something like "forward compatibility" blah blah, then you know it's because the host is not running CUDA 12.5. Oh, what are you using?
legend
legendOP•5mo ago
This is completely unrelated to Python. I'm working remotely right now, and I'm trying alternatives to SSH'ing into my computer at home. I work directly with CUDA. I'm just seeing the options I have. Otherwise I'm just going to buy some laptop with CUDA :lel:
digigoblin
digigoblin•5mo ago
Oh I see, yeah, sounds like investing in a CUDA machine may be the better option, but it's worth weighing up your options for sure.
legend
legendOP•5mo ago
Yeah, I do have a CUDA machine at home. But tbch, it's not a great setup (for using remotely). That wasn't the intent when setting it up. Only issue if I buy a CUDA laptop is that now I have to carry around 2 laptops. My current laptop is already chonky :KEKW:
digigoblin
digigoblin•5mo ago
Oh yeah, if you travel a lot, that's definitely not ideal.
legend
legendOP•5mo ago
Yep. Ok, so I'm trying to use Ubuntu:
2024-07-07T12:06:20Z create container nvcr.io/nvidia/12.5.0-devel-ubuntu22.04
2024-07-07T12:06:23Z error pulling image: Error response from daemon: unauthorized: authentication required
2024-07-07T12:06:23Z error creating container: unauthorized to use image nvcr.io/nvidia/12.5.0-devel-ubuntu22.04
digigoblin
digigoblin•5mo ago
Try this one instead from Docker hub: nvidia/cuda:12.5.0-devel-ubuntu22.04
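For anyone unsure what goes in the image field: an image reference has the shape [registry/]repository:tag, and when the first path component does not look like a host name (no dot, no colon, not "localhost"), Docker pulls from Docker Hub. A small sketch of that rule, where `image_registry` is a hypothetical helper written for illustration and not part of Docker or RunPod:

```shell
# Hypothetical helper: print the registry an image reference would be pulled from.
image_registry() {
  ref="$1"
  first="${ref%%/*}"   # text before the first slash
  case "$ref" in
    */*)
      case "$first" in
        # A first component with a dot or colon, or "localhost", names a registry host.
        *.*|*:*|localhost) echo "$first"; return ;;
      esac
      ;;
  esac
  # Otherwise the reference is resolved against Docker Hub.
  echo "docker.io"
}

image_registry "nvidia/cuda:12.5.0-devel-ubuntu22.04"
image_registry "nvcr.io/nvidia/12.5.0-devel-ubuntu22.04"
```

So there is no separate base URL to set: the string nvidia/cuda:12.5.0-devel-ubuntu22.04 is the whole value, and it resolves to Docker Hub because it carries no registry host.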
legend
legendOP•5mo ago
Ok, this is going to be a really dumb question: what is the base URL? I've never really worked with Docker / all this cloud stuff. I'm a "run it locally and don't worry about it" type of guy :pepe_laugh: Ok yeah, I'm completely confused by this. What URL do I actually set?
digigoblin
digigoblin•5mo ago
Base URL for what?
legend
legendOP•5mo ago
The image. I can't seem to get the right image. Hm, I've gotten it working now @digigoblin. But for some reason I can't connect through SSH.
digigoblin
digigoblin•5mo ago
You need to ensure the OpenSSH service is installed and started.
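On a bare Ubuntu-based image such as the one above, that could look roughly like the sketch below. It is untested against RunPod itself and makes several assumptions: the container runs as root, the image uses apt, and the caller passes in their own public key. `setup_ssh` is a hypothetical function name; it is only defined here and has to be called inside the pod.

```shell
# Hypothetical helper: install and start the OpenSSH server on an
# Ubuntu/Debian-based container. Assumes root and network access.
setup_ssh() {
  pubkey="$1"   # your SSH public key, e.g. the contents of id_ed25519.pub

  apt-get update
  DEBIAN_FRONTEND=noninteractive apt-get install -y openssh-server

  # sshd needs its runtime directory, and the key must be authorized.
  mkdir -p /run/sshd ~/.ssh
  chmod 700 ~/.ssh
  printf '%s\n' "$pubkey" >> ~/.ssh/authorized_keys
  chmod 600 ~/.ssh/authorized_keys

  # On Ubuntu the service is named "ssh".
  service ssh start
}
```

Since packages installed this way live in the container filesystem, baking these steps into a custom image is the tidier long-term option mentioned earlier in the thread.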
legend
legendOP•5mo ago
Not sure how, but now it's all working. Final question: if I stop the pod and start it later, will changes to the filesystem persist? Or packages I've installed? Never mind. Dang, can't use RunPod, I need SFTP.
digigoblin
digigoblin•5mo ago
You can use SFTP on RunPod. Just use the filter at the top of the page to ensure that your pod gets a public IP if you're using community cloud.
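Once the pod has a public IP and an SSH server listening, SFTP goes over that same direct SSH port. A hedged sketch of the client invocation follows; the IP address, port, key path and the root user are placeholders chosen for illustration, so substitute the values shown for your own pod. Note that `sftp` takes the port with a capital `-P`.

```shell
# Hypothetical helper: build the sftp command line for a pod.
# All values passed in below are placeholders, not real pod details.
sftp_command() {
  host="$1"
  port="$2"
  key="$3"
  printf 'sftp -P %s -i %s root@%s\n' "$port" "$key" "$host"
}

# Example with placeholder values (203.0.113.10 is a documentation-only address).
sftp_command "203.0.113.10" "22022" "~/.ssh/id_ed25519"
```

Printing the command first makes it easy to check the host and port before running it by hand.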