Typesense Keeps Restarting (CrashLoopBackOff)
I'm running the Immich stack in Kubernetes, and the immich-typesense container keeps failing and restarting (see the attached screenshot for the full logs). The notable lines are:
raft_server.h:62] Peer refresh failed, error: Doing another configuration change
node.cpp:811] [default_group:10.42.1.104:8107:8108 ] Refusing concurrent configuration changing
This is a brand new install; I recently used the CLI tool to import my images from a backup and have just connected the mobile app. What could cause this container to hit these errors and keep restarting?
Kubernetes Manifest: https://github.com/ModestTG/heliod-cluster/blob/main/kubernetes/apps/media/immich/app/typesense/helmrelease.yaml
On a separate but related note, I've noticed that the storage used by the typesense container keeps growing; it's currently at 25 GB for ~80 GB of photos/videos. I'm not sure whether that's normal or another symptom of the same problem.
Any help anyone could provide would be much appreciated. My cluster is running Ubuntu 22.04 on Intel 7th gen hardware.

10 Replies
Try deleting the typesense container and its volume, bringing typesense back up, and then restarting the server and microservices containers.
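In a Kubernetes setup like the one linked above, that roughly translates to scaling Typesense down, deleting its data PVC, bringing it back up, and restarting the other Immich workloads. A minimal sketch, assuming a `media` namespace and immich-* deployment names (the PVC name here is hypothetical; list your PVCs first):

```sh
# All resource names and the namespace are assumptions; adjust to your release.
kubectl -n media scale deployment immich-typesense --replicas=0

# Find and delete the Typesense data PVC (the name below is a placeholder).
kubectl -n media get pvc
kubectl -n media delete pvc immich-typesense-data

# If the chart templates the PVC, a Helm/Flux reconcile will recreate it;
# otherwise recreate it by hand before scaling back up.
kubectl -n media scale deployment immich-typesense --replicas=1

# Restart the server and microservices so they reconnect and re-index.
kubectl -n media rollout restart deployment immich-server immich-microservices
```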
Thanks. I think I found my issue. The HelmRelease I'm using came from someone else, and its liveness and readiness probes were a little too aggressive, so the pod was restarted whenever the probes didn't complete in time. Relaxing the probes seems to have fixed it. I'm not sure yet whether I should tighten them again now that the big import backlog has been cleared, but I'll experiment with that in the future. Thanks for your response!
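For anyone else hitting this, the knobs in question are the probe settings on the Typesense container in the HelmRelease values. A rough sketch of what relaxed probes can look like, assuming a bjw-s common-library style chart and a plain TCP check against Typesense's default port 8108 (the keys and numbers here are illustrative, not my exact config):

```yaml
# Hypothetical probe overrides; tune the numbers for your hardware and library size.
probes:
  liveness:
    enabled: true
    custom: true
    spec:
      tcpSocket:
        port: 8108
      initialDelaySeconds: 60   # give Typesense time to load its index after a restart
      periodSeconds: 10
      timeoutSeconds: 5
      failureThreshold: 10      # tolerate slow responses while a big backlog is indexing
  readiness:
    enabled: true
    custom: true
    spec:
      tcpSocket:
        port: 8108
      initialDelaySeconds: 30
      periodSeconds: 10
      timeoutSeconds: 5
      failureThreshold: 10
```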
Do you know if it's common for the Typesense container to use a lot of storage space? I just want to understand if that's normal or not.
I don't think that's normal. I saw in the kah Discord that it went back down after doing a trim, right? I can imagine it doing a lot of I/O, which could cause that, but it shouldn't be using a lot of disk space at any one point.
It went down, but only by a few GB. Right now Longhorn reports it at 28 GB, but a
df -ah
from inside the immich-typesense container shows a different picture (screenshot attached; the third row is my PV for immich-typesense).
So it seems like a potential Longhorn reporting problem? Not sure why that is.
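In case anyone wants to reproduce the comparison, this is roughly how the two numbers can be checked against each other. A minimal sketch, assuming the pod lives in a `media` namespace and Typesense stores its data under /data (both assumptions; check your manifest):

```sh
# Actual filesystem usage as seen from inside the Typesense container.
kubectl -n media exec deploy/immich-typesense -- df -h /data
kubectl -n media exec deploy/immich-typesense -- du -sh /data

# Capacity and status of the claim as Kubernetes/Longhorn see it.
kubectl -n media get pvc
```

One possible explanation: Longhorn's reported size is the block-level space allocated in the volume replicas, which doesn't shrink when files are deleted unless the filesystem is trimmed, so it can sit well above what df shows inside the pod. That would also line up with the number dropping after a trim, as mentioned above.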

No idea what's going on there
tbqh I would just shrink the pvc down to like 2 Gi and call it a day :P
Can you try typesense version 0.24.0 instead of 0.24.1 to see if that fixes your issue?
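If you want to test that on the Kubernetes side, pinning the tag in the HelmRelease values should be enough. A minimal sketch, assuming the chart exposes a standard image block (the key names may differ in your chart):

```yaml
# Hypothetical values snippet to pin the Typesense version for testing.
values:
  image:
    repository: docker.io/typesense/typesense
    tag: 0.24.0   # roll back from 0.24.1 to see if the behaviour changes
```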
I'll give that a try. To be clear, I don't have any issues anymore; it just seems that Longhorn may not be correctly showing the used space for a given PVC.
Immich appears to be functioning properly and everything seems to be working as intended.
FYI, Longhorn is definitely showing the right usage for me; it's at 75 Gi right now and climbing. I'll try deleting and restarting with the older version as well, but that's odd.
My setup is pretty much the same as yours @Eazy E
Hi Alex, I have the same issue with both versions of typesense
OK, I think I've found something. I deleted the typesense container and the tsdata volume before recreating them (using version 0.24.0), and then re-ran the "Tag Objects" and "Encode CLIP" jobs in Immich.
Now things seem to be working properly: no high I/O, and the tsdata volume size stays contained.
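For anyone on the plain docker-compose setup who wants to try the same thing, the steps look roughly like the sketch below, assuming the stock layout with a service named typesense and a named volume tsdata (the volume name prefix depends on your compose project name):

```sh
# Recreate Typesense and its data volume, then let Immich re-index.
docker compose stop typesense
docker compose rm -f typesense
docker volume ls                    # find the exact tsdata volume name
docker volume rm immich_tsdata      # hypothetical full name; adjust to what `docker volume ls` shows
docker compose up -d typesense
# Then re-run the "Tag Objects" and "Encode CLIP" jobs from the Immich admin UI.
```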