GENGHIS
RunPod
Created by GENGHIS on 6/10/2024 in #⛅|pods
Networking on my pod has been shit for the last 3 days. Please fix. US region, RTX 6000 Ada.
Going to try transferring my data to a new pod. Would be great if you could fix the networking. I keep losing connection.
9 replies
RunPod
Created by GENGHIS on 5/20/2024 in #⛅|pods
RTX 6000 Ada performance much worse than expected
From the NVIDIA specs, I would expect its performance to be on the order of 10-20% slower than the L40S. However, in my current training run (FP16 mixed precision), I am finding it closer to 2x slower or worse. Pretty bad considering the price. Perhaps there is some other issue in how the pods or nodes are set up that could be worth looking into?
4 replies
RunPod
Created by GENGHIS on 5/17/2024 in #⛅|pods
Better solution for 0 GPU stranded volumes
Since on-demand GPUs can get taken, it would be great to have some better escape valves for getting our data off the volume. Right now, the 0.5 vCPU, 512 MB RAM pod you give keeps killing my upload task. I would happily pay for more resources to speed up getting my data out. It would also be nice to be able to attach a network volume to a pod after creation, or to have cross-region network volumes. A network volume that only works in the same region is of limited value, because a big reason for moving data around is that there are no GPUs in the region!
21 replies
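One way to make a transfer survive being killed on a small pod is to copy in fixed-size chunks and resume from wherever the previous attempt stopped. The following is only a rough sketch of that idea for a plain file-to-file copy: `resumable_copy` is a hypothetical helper, not a RunPod or `runpodctl` feature, and it assumes that whatever is already in the destination is a valid prefix of the source (it does not verify checksums).

```python
import os


def resumable_copy(src, dst, chunk_size=1 << 20):
    """Copy src to dst in small chunks, resuming from dst's current size.

    Returns the number of bytes copied during this call.
    Assumption: any bytes already in dst are a correct prefix of src.
    """
    total = os.path.getsize(src)
    done = os.path.getsize(dst) if os.path.exists(dst) else 0
    if done > total:
        # Destination is longer than the source, so treat it as stale and restart.
        os.remove(dst)
        done = 0

    copied = 0
    with open(src, "rb") as fin, open(dst, "ab") as fout:
        fin.seek(done)  # skip what an earlier attempt already wrote
        while True:
            chunk = fin.read(chunk_size)  # only one chunk is held in memory
            if not chunk:
                break
            fout.write(chunk)
            copied += len(chunk)
    return copied
```

Because only one chunk is held in memory at a time, peak memory stays near `chunk_size` regardless of file size, and rerunning the same call after a kill continues from the bytes already written instead of starting over.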
RunPod
Created by GENGHIS on 2/21/2024 in #⛅|pods
`runpodctl stop pod $RUNPOD_POD_ID` failing with 401
I used to end my long-running jobs with this command. It has failed the last several times with a 401: `runpodctl stop pod $RUNPOD_POD_ID` Error: statuscode 401
1 reply
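A 401 is the HTTP "Unauthorized" status, so the failure points at the credentials the CLI is using rather than at the pod itself. The sketch below does not fix that; it only shows one way a job's final step could notice that the stop command failed instead of silently leaving the pod running. `stop_pod` is a hypothetical wrapper, and it assumes `runpodctl` exits with a non-zero status when it prints an error such as the one above.

```python
import os
import subprocess
import sys


def stop_pod(command=None):
    """Run the stop command; return True on success, False on any failure."""
    if command is None:
        # Same command as in the post, with the pod ID read from the environment.
        command = ["runpodctl", "stop", "pod", os.environ.get("RUNPOD_POD_ID", "")]
    try:
        result = subprocess.run(command, capture_output=True, text=True)
    except FileNotFoundError:
        print(f"command not found: {command[0]}", file=sys.stderr)
        return False
    if result.returncode != 0:
        # Assumption: errors such as a 401 surface as a non-zero exit code.
        print(
            f"stop failed (exit {result.returncode}): {result.stderr.strip()}",
            file=sys.stderr,
        )
        return False
    return True


if __name__ == "__main__":
    # Exit non-zero so a calling script or scheduler can alert on the failure.
    sys.exit(0 if stop_pod() else 1)
```

Returning a boolean rather than raising keeps the wrapper easy to call from the end of a training script, where the caller can decide whether to retry, alert, or fall back to another shutdown path.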