Machine Learning in bootloop with no additional errors

Since the 1.51 update, I tried to run the CLIP job and now my machine-learning container is crashing with no ostensible error. I've set NODE_ENV to "development" and this is the entire output before a restart. There's no crash log or anything else to go on at this point. Logs are in txt as they are too long for discord
5 Replies
Alex Tran
Alex Tran3y ago
Can you post your docker-compose and env file? Also log from microservices and server would be helpful The log you uploaded showed that it was trying to download the machine learning model, which is expected behavior
justjon
justjonOP3y ago
Running on kubernetes. I saw that I was using the old command and arguments. It still crashes but with added
[2023-03-22 01:57:41 +0000] [1] [WARNING] Worker with pid 37 was terminated due to signal 9
[2023-03-22 01:57:41 +0000] [1] [WARNING] Worker with pid 37 was terminated due to signal 9
jrasm91
jrasm913y ago
I'm not sure, but I would guess machine learning is out of memory and needs more than what you have allocated. Can you try running it with more? Specifically, the CLIP model is like 800MB by itself, which is new in 1.51.
justjon
justjonOP3y ago
That container is allowed up to 2000MB
jrasm91
jrasm913y ago
Can you monitor the memory usage while it bootloops or try increasing it to see if that fixes the issue?

Did you find this page helpful?