Immich•2y ago

Typesense Keeps Restarting (CrashLoopBackoff)

I'm running the Immich stack in Kubernetes. The immich-typesense container keeps failing and restarting (see image for logs). The notable lines I can see are:

raft_server.h:62] Peer refresh failed, error: Doing another configuration change
node.cpp:811] [default_group:10.42.1.104:8107:8108 ] Refusing concurrent configuration changing

This is a brand new install. I recently used the CLI tool to import my images from backup and just connected the mobile app. What could cause this container to hit these failures and keep restarting?

Kubernetes Manifest: https://github.com/ModestTG/heliod-cluster/blob/main/kubernetes/apps/media/immich/app/typesense/helmrelease.yaml

On a separate but related note, the storage used by the typesense container keeps growing and growing. It is currently at 25 GB for ~80 GB of pics/videos. I'm not sure whether that is normal, but I wanted to check whether it is also a problem.

Any help would be much appreciated. My cluster is running Ubuntu 22.04 on 7th-gen Intel hardware.

10 Replies

bo0tzz•2y ago

Try deleting the typesense container and its volume, then bringing typesense back up, and finally restarting the server and microservices containers.

Eazy EOP•2y ago

Thanks. I think I found the issue for me. I got the HelmRelease I'm using from someone else, and I think its liveness and readiness probes were a little too aggressive, so the pod would restart when the probes did not complete in time. Bumping up the probes seems to have fixed the issue. I'm not sure whether I should reduce them again now that the big backlog has been cleared, but I'll experiment with that in the future. Thanks for your response! Do you know if it's common for the Typesense container to use a lot of storage space? I just want to understand if that's normal or not.
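
For anyone comparing probe settings, here is a rough sketch of how much time a liveness probe gives a container before a restart if the probe never succeeds. The field names mirror the standard Kubernetes probe fields and the defaults are the Kubernetes defaults; the "relaxed" values are purely hypothetical and are not taken from the HelmRelease above. The formula is an approximation that ignores kubelet jitter.

```python
from dataclasses import dataclass


@dataclass
class Probe:
    # Defaults match the documented Kubernetes probe defaults.
    initial_delay_seconds: int = 0
    period_seconds: int = 10
    timeout_seconds: int = 1
    failure_threshold: int = 3


def seconds_until_restart(probe: Probe) -> int:
    """Approximate time from container start until a restart is triggered
    when every liveness check fails or times out."""
    return (
        probe.initial_delay_seconds
        + (probe.failure_threshold - 1) * probe.period_seconds
        + probe.timeout_seconds
    )


default_budget = seconds_until_restart(Probe())
# Hypothetical relaxed settings, for illustration only.
relaxed_budget = seconds_until_restart(
    Probe(initial_delay_seconds=30, period_seconds=30,
          timeout_seconds=10, failure_threshold=6)
)
print(default_budget, relaxed_budget)
```

If Typesense is busy indexing a large backlog and answers health checks slowly, a small budget like the default one can be exhausted, which would match the restart loop described above.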

bo0tzz•2y ago

I don't think that's normal. I saw in the kah discord that it went back down after doing a trim, right? I can imagine it doing a lot of I/O and that causing it, but it shouldn't be using a lot of disk space at any one point.

Eazy EOP•2y ago

It went down, but only by a few GB. Right now Longhorn is reporting 28 GB. But a df -ah from inside the immich-typesense container shows:

/dev/longhorn/pvc-572a358c-2989-4161-83cf-a0af47d72ff5   50G  605M   49G   2% /config  #Columns go Size, Used, Free, Use%

So it seems like a potential Longhorn reporting problem? Not sure why that is.
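
To put a number on the mismatch, here is a small sketch that parses the df line above and compares the filesystem's used space with the 28 GB Longhorn figure. It assumes df's human-readable suffixes are powers of 1024 and that the Longhorn number is in GiB; both are assumptions. A gap this large could be blocks that were written and later freed by the filesystem but never released at the block level, which would fit with the size dropping after a trim, though that is a guess rather than a confirmed diagnosis.

```python
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}

DF_LINE = (
    "/dev/longhorn/pvc-572a358c-2989-4161-83cf-a0af47d72ff5   "
    "50G  605M   49G   2% /config  #Columns go Size, Used, Free, Use%"
)


def parse_size(text: str) -> int:
    """Convert a df -h style size such as '605M' into bytes."""
    text = text.strip()
    suffix = text[-1].upper()
    if suffix in UNITS:
        return int(float(text[:-1]) * UNITS[suffix])
    return int(float(text))


def parse_df_line(line: str) -> dict:
    """Split one df -h row into named fields, ignoring a trailing # comment."""
    fields = line.split("#", 1)[0].split()
    filesystem, size, used, avail, use_pct, mount = fields[:6]
    return {
        "filesystem": filesystem,
        "size": parse_size(size),
        "used": parse_size(used),
        "avail": parse_size(avail),
        "use_pct": int(use_pct.rstrip("%")),
        "mount": mount,
    }


row = parse_df_line(DF_LINE)
longhorn_reported = 28 * 1024**3  # assumed to be GiB
gap_bytes = longhorn_reported - row["used"]
print(f"filesystem used: {row['used'] / 1024**2:.0f} MiB")
print(f"unexplained by the filesystem: {gap_bytes / 1024**3:.1f} GiB")
```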

Eazy EOP•2y ago

Third row is my PV for Immich-typesense.

bo0tzz•2y ago

No idea what's going on there tbqh. I would just shrink the PVC down to like 2 Gi and call it a day :P

Alex Tran•2y ago

Can you try Typesense version 0.24.0 instead of 0.24.1 to see if it fixes your issue?

Eazy EOP•2y ago

I'll give that a try. To be clear I don't have any issues anymore. It seems that Longhorn is potentially not correctly showing the used space for a given PVC. Immich appears to be functioning properly and everything seems to be working as intended.

FancyGUI•2y ago

FYI, Longhorn is definitely showing the right usage for me. It's getting into 75 Gi right now and climbing. I'll try to delete and restart with the older version as well, but that's odd. My setup is pretty much the same as yours @Eazy E

minch•2y ago

Hi Alex, I have the same issue with both versions of typesense.

OK, I think I have found something. I deleted the typesense container and the tsdata volume before creating it again (using version 0.24.0). Then, in Immich, I ran the "Tag objects" and "Encode Clip" jobs again.

Now things seem to be working properly: no high I/O, and the tsdata volume size is contained.
