DreamGen
RunPod
•Created by DreamGen on 5/24/2024 in #⛅|pods-clusters
Network issue ETA?
Several of my pods got hit with:
This server has recently suffered a network outage and may have spotty network connectivity. We aim to restore connectivity soon, but you may have connection issues until it is resolved. You will not be charged during any network downtime.
including e.g. 82mr3meakiiytt
Do you have an ETA for the fix? They are still not back up.
5 replies
RunPod
•Created by DreamGen on 5/19/2024 in #⛅|pods-clusters
Feature Request: `runpodctl send` TO specific machine & folder (ala SCP)
This can be achieved today by running:
But it would be great to just be able to do:
17 replies
RunPod
•Created by DreamGen on 4/20/2024 in #⛅|pods-clusters
4xH100 pod is stuck -- can't restart or stop

6 replies
RunPod
•Created by DreamGen on 4/17/2024 in #⛅|pods-clusters
A6000 price change based on # GPUS?
Steps to reproduce:
1. Go to community cloud
2. Select A6000 (price 0.69/hr)
3. Change count to 2 (price 1.58/hr -- which is 0.79/hr per gpu!)
4. Change count back to 1 (price stays 0.79/hr)
So two questions:
1. Since when do you increase price when you rent 2 GPUs?
2. Why does the price stay 0.79/hr after reducing count from 2 to 1?
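The per-GPU arithmetic behind the steps above can be checked with a short sketch. The prices are the ones quoted in this post; the helper name `per_gpu_price` is made up for illustration and is not part of any RunPod tooling.

```python
from decimal import Decimal


def per_gpu_price(total_per_hour: Decimal, gpu_count: int) -> Decimal:
    """Hourly price per GPU for a pod billed at total_per_hour (hypothetical helper)."""
    return total_per_hour / gpu_count


# Figures as quoted in the post (USD per hour, community cloud A6000).
single_gpu = Decimal("0.69")   # listed price with count = 1
two_gpu_total = Decimal("1.58")  # listed price with count = 2

per_gpu = per_gpu_price(two_gpu_total, 2)
markup = per_gpu - single_gpu
markup_pct = (markup / single_gpu * 100).quantize(Decimal("0.1"))

print(f"per GPU with 2 GPUs: {per_gpu}/hr")
print(f"difference vs single GPU: {markup}/hr ({markup_pct}%)")
```

Under these numbers the two-GPU configuration works out to 0.79/hr per GPU, which is 0.10/hr more than the single-GPU listing.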
3 replies
RunPod
•Created by DreamGen on 3/16/2024 in #⛅|pods-clusters
UserWarning: CUDA initialization: Unexpected error from cudaGetDeviceCount(). Did you run some cuda
This is a recurring problem on RunPod.
This time with 3090 -- tried 3 different pods in CA region (can't use US region because it has maintenance soon...).
ID: wmwxn9onlckqus
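For anyone trying to reproduce this, a minimal diagnostic sketch might look like the following. It is only an assumption about what is useful to collect: it checks whether `nvidia-smi` is on the PATH and, if PyTorch happens to be installed, what `torch.cuda.is_available()` and `torch.cuda.device_count()` report. The function name `cuda_diagnostics` is hypothetical.

```python
import shutil
import subprocess


def cuda_diagnostics() -> dict:
    """Collect basic facts about the GPU stack; PyTorch is optional (hypothetical helper)."""
    report = {
        "nvidia_smi_found": False,
        "driver_output": None,
        "torch_installed": False,
        "torch_cuda_available": None,
        "torch_device_count": None,
    }

    # Ask the driver directly, independent of any Python framework.
    smi = shutil.which("nvidia-smi")
    if smi is not None:
        report["nvidia_smi_found"] = True
        try:
            result = subprocess.run(
                [smi, "--query-gpu=name,driver_version", "--format=csv,noheader"],
                capture_output=True,
                text=True,
                timeout=30,
            )
            report["driver_output"] = (result.stdout or result.stderr).strip()
        except (OSError, subprocess.TimeoutExpired) as exc:
            report["driver_output"] = f"nvidia-smi failed: {exc}"

    # Then ask PyTorch, if it is installed in this environment.
    try:
        import torch
    except ImportError:
        return report

    report["torch_installed"] = True
    report["torch_cuda_available"] = torch.cuda.is_available()
    report["torch_device_count"] = torch.cuda.device_count()
    return report
```

Running `print(cuda_diagnostics())` on an affected pod and on a healthy one, and comparing the two reports, is one way to tell whether the driver sees the GPU while PyTorch does not.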
5 replies
RunPod
•Created by DreamGen on 2/25/2024 in #⛅|pods-clusters
Broken CUDA / PyTorch on H100
Tried reinstalling PyTorch, did not help.
26 replies
RunPod
•Created by DreamGen on 2/15/2024 in #⛅|pods-clusters
Reserving pods on different machines
Hey there, 4 of my long-running pods have scheduled maintenance at the same time. I would like to spin up new pods before then to cover for that, but before starting them, how can I make sure the new pods won't land on the same machine and undergo the same maintenance?
3 replies
RunPod
•Created by DreamGen on 2/4/2024 in #⛅|pods-clusters
Any recent firewall changes?
Were there any firewall changes in the last few days? I'm seeing
urllib3.exceptions.ProtocolError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
when interacting with the HF hub. Other people have reproduced it as well, on different machines.
4 replies
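To measure how often the dropped connection described in this thread occurs, a rough probe along these lines could be run from an affected pod. This is a sketch under assumptions: it uses only the standard library rather than the client that produced the traceback, the URL `https://huggingface.co` is a guess at a representative endpoint, and `probe` and `fetch_hub` are hypothetical names.

```python
import http.client
import time
import urllib.request


def probe(fetch, attempts: int = 5, delay_s: float = 1.0) -> int:
    """Call fetch() repeatedly and return how many attempts ended in a dropped connection."""
    dropped = 0
    for i in range(attempts):
        try:
            fetch()
        except (http.client.RemoteDisconnected, ConnectionError):
            # RemoteDisconnected is the stdlib counterpart of the error in the
            # traceback; it is itself a subclass of ConnectionResetError.
            dropped += 1
        if i < attempts - 1:
            time.sleep(delay_s)
    return dropped


def fetch_hub(url: str = "https://huggingface.co", timeout_s: float = 10.0) -> int:
    """Fetch the assumed endpoint once and return the HTTP status code."""
    with urllib.request.urlopen(url, timeout=timeout_s) as resp:
        return resp.status
```

Calling `probe(fetch_hub, attempts=20)` from several machines would give a rough failure count per machine, which may help show whether the disconnects are tied to specific hosts.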