network storage sooo slow
Hi, I'm new to RunPod. I'm running a 5x H100 pod in US-KS2 with network storage in the same region. Loading the model (70B Llama) from storage is going to take 25 minutes. Is this normal for RunPod? This normally takes seconds on other machines I've used.
7 Replies
woow 5x h100?
isn't that overkill?
anyway, on performance: in my experience i've been able to load llama2 70b in around 5 minutes at the longest
maybe you can report the pod if it's too slow
5 mins is far more reasonable than 25. It's for training, hence the H100s. What do you train 70B with? Not sure RunPod is going to fit the bill if the GPUs are sitting idle for 25 mins on every run. Just wondered if RunPod was having a bad day and I should give it a second chance.
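For a rough sense of what those two load times mean in storage terms, here is a back-of-the-envelope sketch. It assumes the checkpoint is stored in 16-bit precision (about 2 bytes per parameter, so roughly 140 GB for 70B); the thread doesn't say which format is actually on disk, so treat the numbers as ballpark only. `implied_read_rate_mb_s` is just an illustrative helper name.

```python
def implied_read_rate_mb_s(model_size_gb, load_minutes):
    """Average read rate (MB/s) needed to load model_size_gb in load_minutes."""
    return model_size_gb * 1000 / (load_minutes * 60)

# Assumption: 70B parameters at 2 bytes each (fp16/bf16) is about 140 GB.
SIZE_GB = 70 * 2

for minutes in (25, 5):
    rate = implied_read_rate_mb_s(SIZE_GB, minutes)
    print(f"{minutes} min -> {rate:.0f} MB/s")
```

Under that assumption, 25 minutes works out to roughly 93 MB/s on average and 5 minutes to roughly 467 MB/s.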
One thing to try is to use the regular storage, and see if it makes a difference.
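One way to make that comparison concrete is to time a plain sequential read of the same large file from each location. This is only a minimal sketch: `read_throughput_mb_s` is a made-up helper, and the result will look inflated if the file is already in the OS page cache, so it is best run on a file that hasn't been read since the pod started.

```python
import time

def read_throughput_mb_s(path, chunk_size=16 * 1024 * 1024):
    """Read the file sequentially and return the observed throughput in MB/s."""
    total_bytes = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty bytes means end of file
                break
            total_bytes += len(chunk)
    elapsed = time.perf_counter() - start
    # Guard against a zero elapsed time on very small files.
    return total_bytes / (1024 * 1024) / max(elapsed, 1e-9)
```

Calling it once on a model shard on the network volume and once on a copy of the same shard on the pod's local disk should show whether the storage itself is the bottleneck (the exact paths depend on how your pod is set up).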
I had an issue before with the map() function on network storage, and I was told here that it is slower. I reported it to official support. You can probably do the same; hopefully the issue gets some attention.
Network storage is slow... I bake the models into the container now and model loading is much, much faster
@briefPeach can you please let me know which datacenter you used the network storage in?
EU-IS I think
Is it faster in other areas?
Was there a point in time when you didn't have this problem in EU-IS?