Why don't I have a stop option? Only the terminate option is available.
You don't get a stop button when you use a network volume because it's redundant. You can simply create a new pod with the network volume attached.
Do you mean create a new pod with a local disk volume attached?
I mean if you use network storage there is no point in stopping the pod, since you can reuse the storage on another machine
But I am trying to stop the pod itself, not the volume.
Just stop the pod
Terminate it
And remake it later with the network storage attached
ok
There's no way to do that for now in your case; it's only possible if you use the pod's own storage
Then I think I need to make a pod with a local disk volume, which I can stop at any time and restart to access the data on the local disk volume. Would that help me transfer 200GB of data faster from the AWS instance to RunPod?
Yes, but it's better to use a network volume
Because I saw you answer another post saying network volumes are slow
You can move to other machines if the whole host machine gets rented out, unlike with your local disk volume
Yes, maybe a bit slower. Well, you'd have to back it up to cloud storage if you want to be safe
That's another alternative
I mean if you deploy volume storage you are limited to a single host machine, and you can often end up with a 0 GPU error 🙂
Good one. I will create a local disk volume and create a GPU pod upon the disk volume.
Hmm
I'm confused
why
Nvm, go ahead, I don't need to understand what you mean if you're not asking a question haha
"Creating a GPU pod upon a disk volume" is just not specific enough
Sorry, I might be typing too fast. My requirement is typical: I need a pod running Ubuntu with a volume of around 300GB (local disk / network volume).
All good, was just curious
You'll have to experience the latency yourself and decide which is fine for you, btw
Previously I used a network volume, but when I scp'd 200GB of data from AWS to RunPod, it was too slow. So I am thinking the bottleneck might be the network volume. That is why I am considering creating a new pod with a local disk volume
Ohh
How much speed were you getting?
Only 5 MB/s
Which means it takes about 10 hours to copy the data from AWS to RunPod
200GB data from AWS to RunPod must be a nightmare in terms of egress costs
Well, they want data in, not out. Makes sense
Ohhh, I thought egress was free
Free 1GB 😂 idk how much the free tiers are
AWS egress is very expensive. RunPod does not charge for data transfer.
Fk aws
"Love aws"
OK, then I will just use the same network volume, git clone all the source code, and re-download the data from within RunPod.
Re-download the data? What for?
So the best practice on RunPod is to create a pod with a network volume attached each time you use it, and terminate it when you stop working, right?
If you want to backup your data somewhere, I recommend using something like Hugging Face Hub which is completely free
Yep or just leave it on if it's running something and you want it to keep running
I backup all my models etc to Hugging Face Hub and then sync them to my pods
Yeah, how much is the data limit on HF btw?
Probably only a limit for private repos. I don't think public repos have a limit; TheBloke has a massive amount of data.
I use ComfyUI, which needs a lot of checkpoint models to run. I have downloaded them to the AWS instance; that is why I'd like to copy the whole repo to RunPod. Now it seems I have to git clone the bare ComfyUI repo to the new pod and re-download all the needed models to the pod and the attached network volume.
I would use network storage for this
Options:
- use network storage
- bake models into docker image
- use volume storage
Sounds better. How do you do that, please? Like how do you bring the AWS instance data to Hugging Face Hub, and how do you sync from Hugging Face Hub to RunPod? Thank you so much
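Roughly like this — a minimal sketch using the huggingface_hub Python library; the repo name, folder paths, and login handling are placeholders for your own setup:
```python
from huggingface_hub import HfApi, snapshot_download

# --- On the AWS instance: push the models folder up to the Hub ---
api = HfApi()  # assumes you've run `huggingface-cli login` (or pass token=...)
api.create_repo("your-username/comfyui-models", repo_type="model", exist_ok=True)
api.upload_folder(
    folder_path="/data/comfyui/models",        # placeholder: local folder on AWS
    repo_id="your-username/comfyui-models",
    repo_type="model",
)

# --- On the RunPod pod: sync everything onto the network volume ---
snapshot_download(
    repo_id="your-username/comfyui-models",
    repo_type="model",
    local_dir="/workspace/ComfyUI/models",     # persistent storage on the pod
)
```
Run the upload part on the AWS box and the download part from the pod; keep the repo private if the models shouldn't be public.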
Still, I find it strange to use a network volume. If I terminate the pod every time, then all the Ubuntu configuration is gone. I need to set it up again each time I start working, which sounds counter-intuitive.
Create a custom docker image and template so that you don't need to reinstall Ubuntu packages every time.
It's still not so easy to use, because the config is always changing, so I'd need to update the docker image from time to time. I will create a pod with a local disk volume then. At least next time when I start the pod, all my data and the Ubuntu config will still be there.
Env variables
If conditions
Those can be easier to change
Thank you for the answer, but it still doesn't work in my case. When you install a lot of Python dependencies, or PyTorch, CUDA stuff, some of them end up under /home/etc/... which is not /workspace. If I stop the pod, all of them get lost, and I need to reinstall and redo the env setup each time I stop the pod and start it again.
No idea why the font changed to red
I am sure this is a basic and typical requirement. It must be me using RunPod in the wrong way.
You are using runpod in the wrong way. Don't put things that you want to persist on container disk. Container disk is temporary storage.
If you want your data to persist, then put it on the persistent storage in /workspace.
Yes, I understand. But if you install PyTorch, it goes to system folders like ~/etc/ or others, not /workspace.
Don't install PyTorch etc. into the OS; create a venv on /workspace, activate the venv, and install the stuff there.
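For example, something like this — a sketch using Python's stdlib venv module; the path and package list are just examples:
```python
import subprocess
import venv

ENV_DIR = "/workspace/venv"  # lives on the persistent volume, so it survives termination

# Create the virtual env (with pip) on /workspace
venv.create(ENV_DIR, with_pip=True)

# Install PyTorch & friends into that venv instead of the OS
subprocess.run([f"{ENV_DIR}/bin/pip", "install", "torch", "torchvision"], check=True)

# On the next pod, just `source /workspace/venv/bin/activate` and everything is still there.
```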
OK, so install everything in the virtual env, not with a system-wide installation. Right?
Got your idea. Thank you for the help
Grand
Last thing, back to the 200GB data transfer stuff: I will put the AWS data into S3 and use RunPod Cloud Sync. Does that sound reasonable to you or not?
I have to accept the egress cost incurred, as all the downloaded stuff is there on the AWS instance and needs to be brought out if I want to migrate to RunPod anyway.
Solution
I would use rclone rather than Cloud Sync. Cloud Sync is built on top of rclone anyway.
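For example — a sketch that shells out to rclone; the remote name `s3remote`, the bucket, and the destination path are placeholders, and the remote must already be set up with `rclone config`:
```python
import subprocess

# Copy from S3 straight onto the pod's persistent storage with rclone.
subprocess.run(
    [
        "rclone", "copy",
        "s3remote:my-bucket/comfyui-models",   # source: S3 remote + bucket path
        "/workspace/ComfyUI/models",           # destination on the network volume
        "--transfers", "16",                   # parallel transfers to speed things up
        "--progress",
    ],
    check=True,
)
```
The parallel transfers should help compared with a single scp stream.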
OK, thanks, great help. I will mark this as resolved. The warm support really keeps me with RunPod and keeps other options out. Thank you