OSError: [Errno 5] Input/output error
The model training stopped in the middle of the night with an I/O error. It appears to be caused by a physical disk problem, and in my tests it occurs randomly. As a result my pod sat idle for at least 6 hours, and I paid for that time.
1. How can I stop it from happening again?
2. Can I claim a refund for those idle hours?
I can provide the log and my pod number.
12 Replies
Hi, I am facing the same issue on two different pods accessing the same network storage. Did you find a solution or the cause? Was it a physical disk issue? Thanks!
Maybe contact support; they can check whether there is an issue with it.
Yes, provide your pod ID here.
By the way, does your network storage have enough free space?
It was at 77% with around 70 GB free. I terminated both pods and extended the network storage by another 100 GB. So far the error has not happened again, but I don't understand why that helped (if it actually did; the pods are still running).
Have you reported this through the contact button on the website (support requests)?
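For anyone who wants to rule out free space before a long run, here is a minimal sketch using Python's standard `shutil.disk_usage`. The function name `has_enough_space`, the 20 GB threshold, and the `/workspace` mount path in the comment are assumptions for illustration, not something confirmed in this thread, and it is not established here that free space was the actual cause.

```python
import shutil


def has_enough_space(path: str, min_free_gb: float = 20.0) -> bool:
    """Return True if the filesystem holding `path` has at least `min_free_gb` GiB free.

    `min_free_gb` is an arbitrary illustrative threshold, not a recommended value.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
    free_gb = usage.free / 1024**3
    return free_gb >= min_free_gb


# Hypothetical usage before writing a checkpoint
# ("/workspace" is an assumed mount point for the network storage):
#
#     if not has_enough_space("/workspace", min_free_gb=20.0):
#         raise RuntimeError("Low free space on network storage; skipping checkpoint")
```

This only reports what the filesystem says is free. It would not detect a faulty disk or a flaky network mount, so it is a sanity check rather than a fix.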
Yes, I just did
Nice
Thank you so much!
Same thing repeatedly happens to me every few days.
Make a support ticket and provide your pod ID; support can check whether anything is wrong with the server.
Is there a general reason why this happens? I output all logs to an output.log file.
which can be a bad idea
?
Not sure what support said about this.
But I found this: https://stackoverflow.com/questions/52376942/python-ioerror-errno-5-input-output-error
Stack Overflow
Python IOError: [Errno 5] Input/output error?
I m running on a remote server a python script using nohup.
First I connected to the remote machine using a VPN and SSH
Second I run a python script using the following command: nohup python
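The linked question is about a script started with nohup over SSH. One commonly reported cause of Errno 5 in that setup is `print` writing to a terminal that no longer exists after the session ends. Whether that applies to the pods in this thread is not confirmed. If it does, a minimal sketch is to send training messages to a file through the standard `logging` module instead of stdout. The helper name `make_file_logger` and the logger name `train` are hypothetical.

```python
import logging


def make_file_logger(log_path: str, name: str = "train") -> logging.Logger:
    """Return a logger that writes only to `log_path`, never to the terminal."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # Do not pass records up to the root logger, which may have a console handler.
    logger.propagate = False
    if not logger.handlers:  # avoid adding a second handler on repeated calls
        handler = logging.FileHandler(log_path)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger


# Hypothetical usage in a training loop:
#
#     log = make_file_logger("output.log")
#     log.info("epoch %d finished", epoch)
```

If the log file itself sits on the storage that is failing, this will not help; in that case the underlying volume still needs to be checked by support.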