socket.gaierror: [Errno -3] Temporary failure in name resolution
Just had this issue when trying to upload trained checkpoints to Cloudflare R2.
It used to work, but all of a sudden this error showed up and caused the request to R2 to fail.
What could be the cause of this? I'm using the built-in RunPod Python helper function to upload to S3, and it used to work fine.
5 Replies
This error indicates that the DNS resolver is temporarily unable to resolve the hostname to an IP address. 😮 Bad name? Or maybe a network glitch.
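If it helps narrow things down, here is a minimal sketch of a DNS check with retries that you could run before the upload. It is not part of the RunPod helper; `wait_for_dns` and its parameters are hypothetical names made up for this example, and it only uses the standard library's `socket.getaddrinfo`. It retries only on `EAI_AGAIN` (the "Temporary failure in name resolution" code) and re-raises anything else right away.

```python
import socket
import time


def wait_for_dns(hostname, port=443, attempts=5, base_delay=1.0,
                 resolver=socket.getaddrinfo, sleep=time.sleep):
    """Resolve `hostname`, retrying on temporary DNS failures.

    Hypothetical helper for illustration only.
    Returns a sorted list of resolved IP addresses, or re-raises the
    last socket.gaierror once the attempts are used up.
    """
    for attempt in range(1, attempts + 1):
        try:
            infos = resolver(hostname, port, proto=socket.IPPROTO_TCP)
            # Each entry is (family, type, proto, canonname, sockaddr);
            # sockaddr[0] is the IP address.
            return sorted({info[4][0] for info in infos})
        except socket.gaierror as exc:
            # Only EAI_AGAIN is treated as temporary; other codes
            # (e.g. an unknown name) are re-raised immediately.
            if exc.errno != socket.EAI_AGAIN or attempt == attempts:
                raise
            # Exponential backoff: base_delay, 2*base_delay, 4*base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

You would call it with your bucket's endpoint hostname (the part after `https://` in the endpoint URL you pass to the upload helper), for example `wait_for_dns("<your-endpoint-hostname>")`. If it still raises after several attempts, the resolver on the machine is likely the problem rather than the file name or the upload code.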
I also thought the file name might be the problem, but it used to work without any changes, so I doubt it.
Maybe it's a network issue, but then that would mean the machine itself has problems.
I trained for a few hours, and now I can't get my checkpoints because of some network issue when uploading them to S3 🥲
Ah..that sucks, sorry to hear that.😂
Please try reporting this via the website,
with the error log and the endpoint ID.
Will do, thanks!
Just created a ticket. I hope someone can look at it ASAP, I'd appreciate it.