Immich•3y ago

Machine Learning in bootloop with no additional errors

Since the 1.51 update, I tried to run the CLIP job and now my machine-learning container is crashing with no ostensible error. I've set NODE_ENV to "development" and this is the entire output before a restart. There's no crash log or anything else to go on at this point. Logs are in txt as they are too long for discord

message.txt

5 Replies

Alex Tran•3y ago

Can you post your docker-compose and env file? Also log from microservices and server would be helpful The log you uploaded showed that it was trying to download the machine learning model, which is expected behavior

justjonOP•3y ago

Running on kubernetes. I saw that I was using the old command and arguments. It still crashes but with added

[2023-03-22 01:57:41 +0000] [1] [WARNING] Worker with pid 37 was terminated due to signal 9

[2023-03-22 01:57:41 +0000] [1] [WARNING] Worker with pid 37 was terminated due to signal 9

helmrelease.yaml

configmap.yaml

logs-from-immich-mic...

logs-from-immich-ser...

jrasm91•3y ago

I'm not sure, but I would guess machine learning is out of memory and needs more than what you have allocated. Can you try running it with more? Specifically, the CLIP model is like 800MB by itself, which is new in 1.51.

justjonOP•3y ago

That container is allowed up to 2000MB

jrasm91•3y ago

Can you monitor the memory usage while it bootloops or try increasing it to see if that fixes the issue?

Gaming

Programming

Machine Learning in bootloop with no additional errors

Did you find this page helpful?