lewington
reproducible: pods crash 50% of the time
I am trying to build an API that lets people without big GPUs run Google's weather forecasting model, GraphCast.
I have the code
About 20% of the time this works perfectly well.
The other 80% we get in the pod logs, and it just keeps cycling like that
Notably, when I switch image_name to runpod/stack, it works 100% of the time. This is very confusing to me.