The reported storage usage of my network volume is wrong.
Hi, I've run into a problem with my network storage.
I applied for 4TB of network storage and have used about 2.6TB so far. However, I noticed that on the pod management page, the displayed information for my pod indicates that the volume usage has reached 94%. When I attempted to write approximately 200 to 300GB more to reach 100%, I received a notification that I had reached my quota limit, which doesn't align with the 4TB of space I applied for.
Could you help me identify the issue? Thanks!
Solution:
Hi, your team has helped me with this problem. The cause was that I have a lot of small files.
22 Replies
Are you sure it's not compressed files that were then extracted?
Try checking them manually from the command line first while waiting for support.
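For example, something like this from a terminal in the pod (the /workspace path is just an example) shows the apparent size versus what's actually allocated, and how many files are in there:

```bash
# Apparent size (what the files add up to) vs. actual allocated space on disk
du -sh --apparent-size /workspace
du -sh /workspace

# Count how many files are in the folder
find /workspace -type f | wc -l
```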
Solution
Hi, your team has helped me with this problem. The cause was that I have a lot of small files.
ah yeah
it's not my team actually, I'm not from RunPod
@River Snow Similar issue for me. So, a lot of small files will cause a bug in the volume usage indicator?
I have 1,274,318 files in my dataset folder, but the total size is only 3.3 GB, smaller than my volume size.
Just as a side note, sometimes what also happens is:
if files were deleted through the JupyterLab notebook, they can still be hiding in the /.Trash-0 directory instead of actually being freed.
Network volume storage uses at least 64KB of space per file; if your file is smaller it still takes up that much, and if you have lots of small files it adds up. This is not a bug but intended behavior, due to how the distributed storage chunks data.
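As a rough sketch of that math, assuming the 64KB minimum above and a hypothetical /workspace/dataset path:

```bash
# Lower-bound estimate: number of files times the 64 KiB minimum allocation
FILES=$(find /workspace/dataset -type f | wc -l)
echo "$FILES files * 64 KiB >= $((FILES * 64 / 1024 / 1024)) GiB on the volume"
# e.g. 1274318 files * 64 KiB comes out to roughly 78 GiB,
# even though the apparent size of the data is only ~3.3 GB
```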
/.Trash-0 is for the container disk; it's /workspace/.Trash-0 for the volume disk. 😮
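If you want to check whether old deletions are still sitting in those trash folders, something like this should work (only empty them if you're sure nothing in there is needed):

```bash
# Check how much space the trash folders are holding on to
du -sh /.Trash-0 /workspace/.Trash-0 2>/dev/null

# Empty the network-volume trash to reclaim the space
rm -rf /workspace/.Trash-0/*
```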
Wow, that is wild, I learned a lot today - this is great to know.
@flash-singh Thanks, but I have a lot of small-file datasets to train on. Is there a way to handle that other than increasing the network storage?
Maybe you can move it to container storage?
And then zip it to move back to the network volume storage?
Thanks for the hint, but it still has to be extracted for training. And as my data grows, I have to increase the network storage limit several times, which impacts costs.
No way around it unless you zip it for storage on the network volume, like @justin suggested.
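A minimal sketch of that, with placeholder paths and archive name:

```bash
# Pack the dataset into one archive so the 64 KiB minimum applies once, not per file
tar -czf /workspace/dataset.tar.gz -C /workspace dataset

# Optionally remove the loose files afterwards to reclaim the allocated space
# rm -rf /workspace/dataset
```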
Noted, thanks for your help
But you have two storage locations? One in the container, one in network storage.
So you can extract to the container disk, outside your /workspace, and then it won't be 64KB per file? And any time you need to update it, keep the zipped copy on the network storage.
And you can just point your code to the files outside of /workspace.
I don't know, could be wrong, just my thoughts - sounds like a network storage issue.
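Something along these lines, assuming /data as an example container-disk location:

```bash
# Extract the archive from the network volume onto the container disk
mkdir -p /data
tar -xzf /workspace/dataset.tar.gz -C /data

# Training code then reads from the container-disk copy, e.g.:
# dataset_dir = "/data/dataset"
```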
Thanks, but the problem is that extraction takes many hours on RunPod. The files number in the millions. I have no idea why RunPod is slower at extracting than my local device (M1, 8GB RAM).
Oh interesting
Hm. Maybe you can tarball it instead of zipping?
That way you aren't running it through a compression/decompression algorithm.
Edit: probably won't work, after reading.
Ah, just read your post... yes, this is a lot of data.
I've tried it and it's the same. I think CPU processing on RunPod is slower because it is shared with other users, even in Secure Cloud (CMIIW) 🙏
So I only use the GPU on RunPod, because it is dedicated to my pod. Other than that, I do any CPU processing locally and upload the results to RunPod, because of the time difference.
Makes sense
A 0-GPU pod gives very little CPU, if that's what you're using; CPU-only pods are around the corner.
But I was using a 1x 3090 GPU.
that should give you many more cores
With zip or tar, make sure to use multiple cores instead of a single core.
It's much faster that way.
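For example, tar can hand compression off to pigz (parallel gzip); the install step assumes a Debian/Ubuntu-based image, and the paths are placeholders:

```bash
# Install pigz and let tar use it so compression runs on all CPU cores
apt-get install -y pigz
tar -I pigz -cf /workspace/dataset.tar.gz -C /workspace dataset

# Extraction can go through pigz too (gzip decompression itself is largely
# single-threaded, but reading/writing/checksumming get their own threads)
tar -I pigz -xf /workspace/dataset.tar.gz -C /data
```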
Agreed, it should be like that, just a simple unzip. I have no idea why it was slower