Very low download speed. Will take days to download the model

300 KB/s IS NOT FAIR!!!
45 Replies
arhanovich
arhanovichOP10mo ago
This platform is not usable anymore with these low download speeds
Justin Merrell
Justin Merrell10mo ago
Is this secure cloud or community cloud? And what is the pod id?
xxeviouxx
xxeviouxx9mo ago
Yeah, I have seen this all day today and yesterday
Benjamin
Benjamin9mo ago
Same here. I always need to restart my downloads and uploads, as the speed always declines very quickly after a few minutes to 600-800 KB/s in Community Cloud, e.g. pod ut51facl5q9ned
H4RV3YD3NT
H4RV3YD3NT9mo ago
I've noticed this a lot recently as well. I have fiber internet, speedtest-cli will show a decent speed of a few hundred Mb or so on the pod, I'll get runpodctl up and running and... it's a crawl of maybe 8 Mb/s if I'm lucky. More insulting, often the first chunk I upload of my data is pretty quick, and then it dips. But for 300 GB of data, I can't take the cost of leaving the instance running and doing nothing like I tried last week, so no pod id to share unfortunately. I considered using a network volume but there are never, ever GPUs available in Secure Cloud so 🤷‍♂️
justin
justin9mo ago
@xxeviouxx / @bennyx0x0x / @H4RV3YD3NT / @arhanovich : https://discord.com/channels/912829806415085598/1207175605255147571 If you do happen to see it again: I'm not staff, but I was also curious about this issue, and after doing some research I wrote this little script to help do speed-test sanity checking and dump relevant pod information into a text file you can share 🙂 If it happens again, feel free to make a post and share the text file, as I think that would help the RunPod staff figure out whether it is on RunPod's end or maybe your downstream source, etc.
arhanovich
arhanovichOP9mo ago
The issue still continues. I will cancel my subscription if no staff is interested
Justin Merrell
Justin Merrell9mo ago
@arhanovich I did not see a response to this earlier.
arhanovich
arhanovichOP9mo ago
Secure Cloud
Justin Merrell
Justin Merrell9mo ago
And the pod id?
arhanovich
arhanovichOP9mo ago
I have deleted 3 of them and opened a new one with ID: e4r5seqn1xadhn
justin
justin9mo ago
(Just wondering) can you share more info about where you are downloading from too? 300 KB/s is also crazy slow, but it would be helpful to know what it is downloading from, like the exact command, etc.
arhanovich
arhanovichOP9mo ago
A 70B model from Hugging Face. I tried 4 pods today
justin
justin9mo ago
Cool ~ hopefully the staff can help you check that out. But also, just wondering, did you run the speed test I posted? Got any results to share?
arhanovich
arhanovichOP9mo ago
The max is 25 MB/s, which takes many hours to download that model
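As a back-of-envelope check (assuming roughly 140 GB of weights for a 70B model in fp16, i.e. about 2 bytes per parameter — an assumption, not a figure from the thread), download time scales inversely with sustained speed:

```python
def eta_hours(size_gb: float, speed_mb_s: float) -> float:
    """Hours to transfer size_gb gigabytes at a sustained speed_mb_s MB/s."""
    return size_gb * 1000 / speed_mb_s / 3600

# Assumption: a 70B-parameter model in fp16 is ~140 GB of weights.
print(round(eta_hours(140, 25), 1))   # -> 1.6   (hours at a steady 25 MB/s)
print(round(eta_hours(140, 0.3), 1))  # -> 129.6 (hours at 300 KB/s, ~5.4 days)
```

The catch, as the rest of the thread shows, is the word "sustained": a pod that bursts at 252 MB/s but settles at 8 MB/s behaves much closer to the slow case.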
Justin Merrell
Justin Merrell9mo ago
Try using this package as well https://github.com/huggingface/hf_transfer
justin
justin9mo ago
Just FYI - if you end up running this speed test, you'll probably also get a better indication from the text file it generates of whether the pod is slow or whether that is Hugging Face's limitation. It could be on Hugging Face's end; if you went through 4 pods, I imagine their side is just slow.
curl -s https://raw.githubusercontent.com/justinwlin/Runpod-Tips-and-Tricks/main/SpeedTest/speedtest.sh -o speedtest.sh && chmod +x speedtest.sh && ./speedtest.sh
https://huggingface.co/docs/huggingface_hub/en/guides/download
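Combining the two suggestions above, a minimal sketch of an hf_transfer-accelerated download (assuming pip install huggingface_hub hf_transfer; the repo id and target directory below are placeholders, not values from the thread):

```python
import os

def download_model(repo_id: str, local_dir: str) -> None:
    """Download a full model snapshot with hf_transfer acceleration enabled."""
    # The flag must be set before huggingface_hub is imported,
    # since the library reads it at import time.
    os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"
    from huggingface_hub import snapshot_download
    snapshot_download(repo_id=repo_id, local_dir=local_dir)
```

Usage would look like download_model("meta-llama/Llama-2-70b-hf", "/workspace/model") — substitute the actual 70B model and a directory on the volume you want it to persist on.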
arhanovich
arhanovichOP9mo ago
There are no longer any instances available with enough disk space (I only want 300 GB). RunPod's performance has decreased woefully. You are going to lose a lot of customers unless you fix it. I remember the good old days 😦
justin
justin9mo ago
You could probably launch a network volume with 300 GB, which is probably what you would want anyway if you're downloading a model that big and want to persist it. I've seen this before, and availability usually comes back soon. If you launch an instance with a network volume attached, you wouldn't need a single pod with that much storage, and it would probably be more flexible.
arhanovich
arhanovichOP9mo ago
Any good RunPod alternatives? It sucks
ashleyk
ashleyk9mo ago
@arhanovich what is the issue you are having? RunPod is awesome in my experience.
arhanovich
arhanovichOP9mo ago
Man, its download speed is very, very low, and you don't get storage easily. What do you think I will run if I don't get enough storage?
ashleyk
ashleyk9mo ago
What kind of storage are you referring to? Regular persistent storage or network volume storage?
arhanovich
arhanovichOP9mo ago
RunPod used to be very, very good, but it sucks now
ashleyk
ashleyk9mo ago
It's still as good as ever for me
arhanovich
arhanovichOP9mo ago
What do you run on RunPod? I have wasted 10 dollars today just to find a single A100 with a download speed of 30 MB/s
justin
justin9mo ago
Did you run the speed test? Honestly this seems more like a Hugging Face problem, unless the speed test shows it is a pod problem. What that means is that no matter what computer you use, you could be bottlenecked by HF — unless the speed test shows your download speeds just suck across the board. But yeah, vast.ai and tons of other alternatives exist, though so far I think RunPod is still the easiest. You could always go to AWS / GCP if you want a big cloud provider and figure out that ecosystem.
ashleyk
ashleyk9mo ago
vast.ai sucks, I loaded $10 and never used it because I hate it
justin
justin9mo ago
Also, sorry, just an FYI side note: if you are just downloading a model, they released CPU Pods, so you can download the models directly to a network volume and then mount it on the more expensive H100 GPUs. CPU Pods are a new thing, so you aren't just burning money while downloading. I guess this would limit you in terms of region, though.
ashleyk
ashleyk9mo ago
CPU pods are only available in RO, and there are no H100s available in RO
justin
justin9mo ago
Oh no D: OK, dang. I guess any cheaper GPU in the same region as the H100s would also suffice, because a model that big takes a while to download, and burning H100 time on network requests seems rough.
ashleyk
ashleyk9mo ago
Yeah I always use the cheapest available GPU to copy stuff onto my network storage
flash-singh
flash-singh9mo ago
A couple of things to always try when you get low speeds:
- run a speed test
- run a ping test (ping google.com) — packet loss can be a big issue
Otherwise we are here to help, but reporting low speeds downloading from Hugging Face is not enough info to go on. @nathaniel another good use case to add to runpodctl: easy speedtest runs
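The packet-loss figure mentioned above sits in the summary line of ping output; a small sketch for pulling it out programmatically (assuming the common Linux iputils output format — the sample line below is illustrative, not from the thread):

```python
import re

def packet_loss_pct(ping_output: str) -> float:
    """Extract the packet-loss percentage from a ping summary line."""
    match = re.search(r"([\d.]+)% packet loss", ping_output)
    if match is None:
        raise ValueError("no packet-loss summary found in ping output")
    return float(match.group(1))

# Illustrative summary line, as printed by ping -c 4 google.com:
summary = "4 packets transmitted, 3 received, 25% packet loss, time 3004ms"
print(packet_loss_pct(summary))  # -> 25.0
```

Anything consistently above a few percent will tank sustained download throughput, regardless of what the raw speed test reports.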
arhanovich
arhanovichOP9mo ago
I am thinking of using that option from the Hugging Face service. Does 13 USD per hour mean only the hours of inference, or does it bill until you end the deployment, like RunPod? Does anyone have any idea?
justin
justin9mo ago
Probably until you end it. I haven't heard of a service that just kills you after an hour. They want your money, probably haha
ashleyk
ashleyk9mo ago
Yeah, AWS also charges while it's running, until you stop it
arhanovich
arhanovichOP9mo ago
Very bad. I agree that RunPod has a cost advantage, but it's problematic when dealing with large models
justin
justin9mo ago
Yeah, but I haven't found a better alternative yet haha. If you do, let me know xD It seems like an inherently hard problem, which is why a lot of competitors are in the space trying to solve it right now. The network bandwidth alone to move ML models is on average stupidly large, plus there's the storage, and the training and deployment of them is really just new. I actually talked about this with my friends, but the reality is: where do I move to if not RunPod? Everyone else is harder to use or screws me over on pricing so far.
arhanovich
arhanovichOP9mo ago
I guess I am on a lucky pod lol. 252 MB/s dropped to 8 MB/s. WOW, PRETTY FAIR. I have started thinking this is done intentionally by RunPod to make you pay for more usage, because no way
justin
justin9mo ago
Run the speed test? Are you using the HF CLI tool?
curl -s https://raw.githubusercontent.com/justinwlin/Runpod-Tips-and-Tricks/main/SpeedTest/speedtest.sh -o speedtest.sh && chmod +x speedtest.sh && ./speedtest.sh
Again, if you run the speed test and it is consistently an issue, then it's a problem for RunPod. If not, it means Hugging Face themselves is bottlenecking you, which means you should use their CLI tool or something. Why don't you just share the command you're running, or tell us more info? Or try running the speed test to see if it's actually the pod. You complain it's RunPod, but you haven't tried any of the suggestions so far.
arhanovich
arhanovichOP9mo ago
Speed tests do not consider large files
justin
justin9mo ago
I test against 5 GB out of a 200 GB file on S3; if you want, you can change that to 20 GB, 50 GB, etc. I have a max timeout on it so we can get the average speed rather than sitting there waiting for the whole file to download. It also auto-cleans up and sends the data to null, so it isn't taking up your disk space. Just change the S3 download range if you want.

If, for example, the S3 download is fast, it means Hugging Face is limiting you, which means you should follow up using their CLI — which, are you? My speed test runs against a Civitai file download (2 GB), an HF LLM model (5 GB), and an S3 file download (5 GB out of a 200 GB file, which you can change), so it isn't just a normal speed test; I am testing larger files too.

That's why I'm telling you to run the speed test I gave you: if it's slow across the board, it's RunPod's fault; if not, you're limited by Hugging Face and you'll hit this no matter what. Also, are you using the HF CLI tool? Can you share your command? The speed test I gave you is to 1) run pings, 2) run checks against normal speed tests, and 3) sanity-check downloading large files from multiple sources. Honestly, you keep saying it's RunPod without even trying to prove it, when I literally gave you the script to prove it.
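The sampling idea described above — fetch only the first few GB of a huge file with a byte-range request, cap the time, discard the data, and average — can be sketched like this (the URL is a placeholder, and this is an illustration of the technique, not the actual script):

```python
def avg_mb_s(bytes_fetched: int, seconds: float) -> float:
    """Average throughput in MB/s over a timed transfer."""
    return bytes_fetched / 1e6 / seconds

def range_curl_cmd(url: str, sample_gb: int, timeout_s: int) -> str:
    """Build a curl command that samples the first sample_gb GB and discards it.

    -r requests a byte range, --max-time caps the wall clock,
    -o /dev/null discards the data, and -w prints the average bytes/sec.
    """
    end = sample_gb * 10**9 - 1
    return (
        f"curl -r 0-{end} --max-time {timeout_s} "
        f"-o /dev/null -w '%{{speed_download}}' {url}"
    )

# e.g. 5 GB fetched in 200 s averages 25 MB/s:
print(avg_mb_s(5_000_000_000, 200))  # -> 25.0
```

Sampling with a timeout is the key design choice: it yields a usable average in minutes, whereas waiting on the full 200 GB file would take hours at the speeds reported in this thread.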
ashleyk
ashleyk9mo ago
And it seems to be a Hugging Face issue more than a RunPod issue. I've seen a lot of people complaining about speed since Hugging Face was down the other day; not sure whether it's a coincidence or not.