Can't connect to pod
I'm trying to connect to my pod, I see this message
In my logs, I see a bunch of these
So I assume there's some issue with starting the container? I don't know what to do here.
Fwiw: In case of XY question
I want to have a runpod that just has the latest cuda image
You need to keep your container alive with
sleep infinity
otherwise it will get into a restart loop.
Where do I set that?
Would it be here?
Yes, add this:
thank you, I'll try it!
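A minimal sketch of why this helps, assuming GNU coreutils (`sleep infinity` and `timeout` may not exist on other systems): a container stops when its main command exits, so if the image's default command returns right away (a shell with no terminal attached does), the pod keeps restarting it. `sleep infinity` never returns, which the snippet shows by letting `timeout` stop it:

```shell
# A container lives only as long as its main process.
# sleep infinity never exits on its own, so timeout has to kill it.
status=0
timeout 1 sleep infinity || status=$?

# timeout reports 124 when it had to stop the command.
echo "exit status: $status"
```

In the pod template this would typically go in the container start command override, for example `bash -c "sleep infinity"`; the exact field name depends on the UI.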
Is this the right way to go about creating a development environment?
I'm currently away from my main PC, so I don't have access to cuda on my laptop
aha, it works!
I prefer creating custom docker images
Otherwise the docker override command gets very messy and ugly to work with
I want to see if I can get everything working with this, otherwise I'll have to go that route 😅
Any particular reason you want Rocky Linux and not Ubuntu Linux?
I just copied from nvidia's website
By the way, I don't recommend using CUDA 12.5, since not all host machines support CUDA 12.5
They all support 12.1 or higher, so it's best to use 12.1 as the CUDA version
This setup seems to work with it
Unfortunately I can't use 12.1
Iirc I need to use 12.3
But maybe it works with an older version, I'll have to mess around and find out
You were probably lucky and got a machine that has CUDA 12.5, but that's very new, so the chances of getting a host machine that supports 12.5 are very low
Ahh, got it
Well, let's see if I can get stuff working here
You also won't really know whether it's working or not until you try to use torch
then I'll try ubuntu -> downgrading cuda
I'm not going to be using torch
If you get an error saying something like "forward compatibility" blah blah then you know it's because it's not running CUDA 12.5
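One way to check this up front, without torch, is to compare the CUDA version the host driver reports with the one the image was built for. This is only a sketch: `version_at_least` and `host_cuda_version` are hypothetical helpers, `required` is an assumed value, and `sort -V` needs GNU coreutils. It relies on the `CUDA Version:` field that `nvidia-smi` prints in its banner.

```shell
# version_at_least HAVE WANT: succeeds when HAVE >= WANT (hypothetical helper).
version_at_least() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Highest CUDA version the host driver supports, read from the nvidia-smi banner.
host_cuda_version() {
    nvidia-smi | sed -n 's/.*CUDA Version: \([0-9][0-9.]*\).*/\1/p'
}

required="12.5"   # assumption: the CUDA version of the image in use
if command -v nvidia-smi >/dev/null 2>&1; then
    host="$(host_cuda_version)"
    if version_at_least "$host" "$required"; then
        echo "host driver supports CUDA $host (need $required): OK"
    else
        echo "host driver supports only CUDA $host (need $required)"
    fi
else
    echo "nvidia-smi not found; cannot check the host driver"
fi
```

If the second message appears, the image is newer than the host driver, which is the situation the "forward compatibility" error points to.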
Oh, what are you using?
This is completely unrelated to python
I'm working remotely right now, and I'm trying alternatives to SSH'ing into my computer at home
I work directly with CUDA
I'm just seeing the options I have
Otherwise I'm just going to buy some laptop with cuda :lel:
Oh I see, yeah sounds like investing in a CUDA machine may be better option, but worth weighing up your options for sure.
Yeah, I do have a CUDA machine at home
But tbch, it's not a great setup (for using remotely)
That wasn't the intent when setting it up
Only issue if I buy a cuda laptop is that now I have to carry around 2 laptops
My current laptop is already chonky :KEKW:
oh yeah if you travel a lot, that's definitely not ideal
yep
Ok so I'm trying to use ubuntu
2024-07-07T12:06:20Z create container nvcr.io/nvidia/12.5.0-devel-ubuntu22.04
2024-07-07T12:06:23Z error pulling image: Error response from daemon: unauthorized: authentication required
2024-07-07T12:06:23Z error creating container: unauthorized to use image nvcr.io/nvidia/12.5.0-devel-ubuntu22.04
Try this one instead from Docker hub:
nvidia/cuda:12.5.0-devel-ubuntu22.04
ok this is going to be a really dumb question
what is the base url?
I've never really worked with docker / all this cloud stuff
I'm a "run it locally and dont worry about it" type of guy :pepe_laugh:
ok yeah im completely confused by this
what URL do I actually set?
base url for what?
The image
I can’t seem to get the right image
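On the "URL" question: the image field takes an image reference rather than a web URL. It is an optional registry host, then a repository path, then a tag after the colon; with no registry host, Docker pulls from Docker Hub. A small sketch that splits the Docker Hub name suggested earlier into its parts (the variable names are illustrative, and the split assumes the reference has no registry port in it):

```shell
# Image reference as it would be entered in the image field.
image="nvidia/cuda:12.5.0-devel-ubuntu22.04"

repository="${image%%:*}"   # everything before the colon
tag="${image##*:}"          # everything after the colon

echo "repository: $repository"   # repository: nvidia/cuda
echo "tag:        $tag"          # tag:        12.5.0-devel-ubuntu22.04
```

By comparison, the name in the failed pull has a registry host (`nvcr.io`) but no `cuda` repository and no `:` before the tag.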
Hm, I've gotten it working now @digigoblin
But for some reason I can't connect through ssh
You need to ensure openssh service is installed and started
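A rough sketch of what that involves on a plain Ubuntu-based image, run as root inside the pod. Both function names are hypothetical, the key string is a placeholder, and the package commands assume Debian/Ubuntu; other base images will differ.

```shell
# install_key HOME_DIR PUBLIC_KEY: hypothetical helper that authorizes one key.
install_key() {
    mkdir -p "$1/.ssh" && chmod 700 "$1/.ssh"
    printf '%s\n' "$2" >> "$1/.ssh/authorized_keys"
    chmod 600 "$1/.ssh/authorized_keys"
}

# start_sshd: install and launch the OpenSSH server (Debian/Ubuntu, as root).
start_sshd() {
    apt-get update && apt-get install -y openssh-server
    mkdir -p /run/sshd      # sshd on Debian/Ubuntu will not start without it
    /usr/sbin/sshd          # daemonizes and listens on port 22 by default
}

# Usage inside the pod (not run here):
#   install_key /root "ssh-ed25519 AAAA... you@laptop"
#   start_sshd
```

The pod also has to expose the SSH port for the connection to get through.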
not sure how but now it's all working
Final question - if I stop the pod and start it later, will changes to the filesystem persist?
or packages i've installed?
Nevermind
dang
can't use runpod
I need SFTP
You can use SFTP on RunPod
Just use the filter at the top of the page to ensure that your pod gets a public IP if you're using community cloud
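SFTP runs over the same SSH server, so once sshd is reachable on the pod's public IP and port, no extra service should be needed. A sketch with placeholder values: the address comes from the documentation range, the port and the `root` user are assumptions, and `sftp_command` is a hypothetical helper. Use the IP and port shown for your pod.

```shell
# sftp_command HOST PORT: print the sftp invocation for a pod (hypothetical helper).
sftp_command() {
    # sftp takes the port with a capital -P (ssh uses lowercase -p).
    printf 'sftp -P %s root@%s\n' "$2" "$1"
}

sftp_command 203.0.113.10 22022
```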