Machine Learning in bootloop with no additional errors
Since the 1.51 update, whenever I try to run the CLIP job, my machine-learning container crashes with no apparent error. I've set NODE_ENV to "development" and this is the entire output before a restart. There's no crash log or anything else to go on at this point. Logs are attached as a txt file because they are too long for Discord.
5 Replies
Can you post your docker-compose and env file?
Also, logs from the microservices and server containers would be helpful.
The log you uploaded showed that it was trying to download the machine learning model, which is expected behavior
Running on kubernetes. I saw that I was using the old command and arguments. It still crashes but with added
I'm not sure, but I would guess the machine learning container is running out of memory and needs more than what you have allocated. Can you try running it with a higher limit? Specifically, the CLIP model is about 800MB by itself, and it is new in 1.51.
That container is allowed up to 2000MB
Can you monitor the memory usage while it bootloops or try increasing it to see if that fixes the issue?
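Since this is on Kubernetes, one way to check the out-of-memory theory is to look at how the container last terminated. Below is a minimal Python sketch, assuming `kubectl` is on your PATH and configured for the cluster. It reads the pod's `status.containerStatuses[].lastState.terminated` field, where Kubernetes records a reason such as `OOMKilled` (usually with exit code 137) when a container is killed for exceeding its memory limit. The function names and the pod/namespace arguments are placeholders for illustration, not anything from your setup.

```python
import json
import subprocess


def last_terminations(pod: dict) -> list[dict]:
    """Summarise the last termination of each container in a pod object.

    `pod` is the parsed JSON from `kubectl get pod <name> -o json`.
    """
    results = []
    for status in pod.get("status", {}).get("containerStatuses", []):
        # lastState.terminated is only present if the container has exited before
        terminated = status.get("lastState", {}).get("terminated")
        results.append({
            "container": status.get("name"),
            "restarts": status.get("restartCount", 0),
            "reason": terminated.get("reason") if terminated else None,
            "exit_code": terminated.get("exitCode") if terminated else None,
        })
    return results


def fetch_pod(name: str, namespace: str) -> dict:
    """Fetch a pod object as a dict by shelling out to kubectl."""
    out = subprocess.run(
        ["kubectl", "get", "pod", name, "-n", namespace, "-o", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)
```

Calling `last_terminations(fetch_pod("<your-ml-pod>", "<your-namespace>"))` while the pod is bootlooping should show whether the reason is `OOMKilled`. If it is, raising the memory limit is the next thing to try. For live usage, `kubectl top pod` also works, but only if metrics-server is installed in the cluster.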