Large job queue doesn't seem to go down
See attached image - this has been sat like this for several hours and the count doesn't seem to be going down

I've done a compose down and up but it doesn't seem like it gets going
Make sure all the containers are running
The entire stack is running
Will double check the uptime
Huh
Looks like the microservices container is crash looping
hmm the plot thickens
i see this in the microservices logs
looks like there's an open issue for this too
GitHub
[BUG]: Corrupted Reverse Geocoding CSV File · Issue #1963 · immich...
tomayac/local-reverse-geocoder#63. Possible solutions: fix upstream, or detect the issue, delete the CSV file/directory, and try again
Deleting and recreating the container should fix that
You are indeed correct
I did a compose down and up then paused all jobs apart from the metadata job
gonna let just that one run for now
Ok another update
The extract metadata job runs for a long while because there's so many photos and videos - fair enough
but it eventually hits a point where it just stops and the microservices logs say it encountered an error
but if 1 file can't be processed it should just add it to a "bad" folder
and then move on
try my workaround. had it as RO for the CSV file for over a month and holding:
https://discord.com/channels/979116623879368755/1090804351213244566
which also says a lot - i hardly travel lol. so no new cities/geocodes
@ahbeng what's your workaround exactly?
He added a volume mount to the .reverse-geocoding folder in /usr/src/app with the read-only option.
eg
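(something like this, I'd guess - a sketch only, not his exact config; the host path is made up and the in-container folder name is taken from later in the thread, so double-check it against your own logs)
```bash
# sketch of a compose override that mounts the geocoder folder read-only;
# host path and override filename are hypothetical
cat > docker-compose.override.yml <<'EOF'
services:
  immich-microservices:
    volumes:
      - ./reverse-geocoding-dump:/usr/src/app/.reverse-geocoding-dump:ro
EOF
docker compose up -d immich-microservices
```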
Ooh OK I'll give that a shot and see how it goes
OK I'm just seeing some warnings in the logs now which look fine
Will leave this to run and see how it goes
Hmm, even after hours and a stack restart I'm still at 30k photos and videos to process
It's just not going through them
Did you look at the immich-microservices logs again?
sorry for the horribly slow responses
i did, this is what i'm seeing
Looks like that read only volume doesn't work. That looks like a related error
If it has that CSV error again, you can instead just run
docker compose up -d --force-recreate immich-microservices
i've just realised it was trying to mount a folder to
.reverse-geocoding-dump
i've composed down, deleted the folder, created a blank .reverse-geocoding-dump
and composed up
let's see how that goes
ok now seeing this
Looks similar.
i think it's working
container is doing something

I think reverse geocoding is failing though.

that should be fine though right?
it should skip over it?
If you don't want that you can just disable it
what does reverse geocoding do exactly?
is that just for location?
I think it's cool and adds the city name for photos with geo data
i don't think location data is in my pictures
Photos have GPS coords but no city or state names
and i've had so much trouble with this so i'd rather turn it off lol
Lol sounds good.
is there a command to disable it?
There is an env for it I think
You can always run it again in the future.
hmm so it's listed as a feature
checking docs for it
found it
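(for anyone else looking: the flag in the example .env at the time was something like the line below - the name may differ in newer releases, so check the docs)
```
DISABLE_REVERSE_GEOCODING=true
```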
ok i've disabled it, will leave this to run overnight and see how it goes
smh i should read the docs more lol
Lol. That would be another good thing to make configurable via the admin settings though.
yeah it would to be honest - if i saw it there i would have disabled it right away lol
Lol. There are several settings that can probably be migrated from env to admin settings. Maybe I'll tackle that in the future lol
that would be awesome
although i will argue that env vars are better in some ways
easier to replicate and script
Yeah, for infrastructure set up and stuff they are for sure, but there is a tradeoff between maintaining both vs only offering one or the other. In immich we've opted for preferring admin and NO env if it can be avoided. I don't think that many people are automatically configuring their instance.
i think you're right
So it is a good compromise
making this approachable to regular people is an excellent goal
Good usability with little maintenance overhead has worked well so far.
ok this is now running brilliantly
logs look good
extract metadata job looks spot on too
thanks again for the help
one final question - how often do the jobs on the jobs page run? once every 24 hours?
So the way it works is that on asset upload success it triggers the thumbnail job and exif extraction job, etc.
So normally you do not need to manually run them ever, they're run as needed for new assets.
ah ha, i thought so
In the case we change something or you update a setting, you may want to manually run them for missing or all, which you can do from that page
but in my case where i've done a bulk upload that went wobbly
probably best i run them all manually once
GitHub
Does the server run the jobs on a schedule? · immich-app immich · D...
Hello all, I searched, but couldn't find any information about whether the server jobs (e.g. generate thumbnails) have to be triggered manually or if the server runs them at some intervals? Tha...
i found this too
Yeah, and sometimes it's easier to turn off (pause) the machine learning jobs which can be quite cpu intensive while bulk uploading is happening
yeah they were sending my CPU usage to 800% lol
I think that answer is outdated, we've fixed some stuff so they really shouldn't need to be manually run
hammering it so hard that my zabbix alerted on my UPS runtime being 7 minutes lower than normal
The only scheduled job that runs on a nightly basis is the user delete one now
gotcha
Oh dang lol.
i capped the CPUs to 1 for each container
just to stop it running away with CPU
means everything will take way longer, but the server is on 24/7 so idc
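(I assume that's the compose-level cpus limit, roughly like this per service - a sketch only; on older compose file versions it may need to sit under deploy.resources.limits instead)
```bash
# hypothetical override capping the microservices container to one CPU
cat > docker-compose.override.yml <<'EOF'
services:
  immich-microservices:
    cpus: "1.0"
EOF
docker compose up -d immich-microservices
```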
I like the idea of disabling ML, bulk uploading everything, checking the exif, data, thumbnails, etc all look good, then if you're happy and don't need to fix stuff, run all the ML stuff after that
i think that's what i'll do
when you do a CLI bulk upload, does a job kick off for each image that's uploaded?
Yes
i imagine so since it's probably a similar ingest to what the app uploads do, right?
What happens is that you always find something that needs to be fixed and sometimes it is easier to just start over with clean data, so it's a bit useless to run the machine learning stuff until you're pretty convinced you won't be starting over
Yeah, queues are really light, they're just redis records
thumbnail generation is usually really fast too, same with exif extraction. It uses exiftool, which is a pretty awesome library
Yeah, it's the same API - upload asset gets saved to disk and then kicks off exif and thumbnail generation.
im hoping that once the processing on my images is done, i should be in a good working state
You can pause the queues from the admin console during upload and unpause them when the cli finishes. I think it's usually fine to leave exif and thumbnail running. You can refresh the webpage to see the assets as it's happening if you wanted to.
That's the dream
gotcha
if this turns out well, i'll extract and import my gf's photos
Nothing shows up in the timeline until it has a thumbnail.
that's another 100GB lol
ahh
That'd be great. I have a queue of family members I'm putting off doing lol
I have another item on my todo list which is to show a default loading thumbnail of the asset as soon as it's uploaded and then do a fancy flip-around animation when the thumbnail has been generated. Kind of like Plex if you've ever used that.
ooh that would be really cool
There are quite a few people who ask about not being able to see this or that image and it's often because the thumbnail is missing or because it was a RAW format and we couldn't make one, but the server just filters those out lol
Too many things to build and not enough time to build them!
gotta prioritise sadly, although i can say for sure that the fine details are noticed
Yeah, luckily lots of stuff is getting done, so hopefully it doesn't take too long to work through everything
well i appreciate the hard work - this project is just another reason why open source and self hosting is so cool and so important
and with that im gonna head to bed lol
laters
hey, me again, i left it running overnight but sadly im still seeing a ton of photos from the day i did the import (5th May)

that's the sidebar for example

and job status
so the metadata and thumbnail jobs finished fine
Yup, looks like machine learning is probably running very slowly
It is quite cpu intensive
it is, i capped the CPU usage on it
will that be why images aren't showing up with the correct date?
this is the microservices container config
That should not impact dates at all
yeah that's what i thought
i'll dig out a few images and check their metadata
Can you clear cache and try again?
Or try an incognito window?
cache as in browser cache?
it looks the same in the app and in chrome on my phone
just tried incognito - same thing
what's odd as well is the image has the date stamp in the filename lol

hmm ok the plot thickens
so i did a google takeout of just my google photos and asked it to do the largest it could
so in my case it was 2 tar.gzip archives, one was 50GB and the other was 25GB
i extracted both .tar files and then extracted everything from them
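(for reference, extraction is just the usual tar invocation - archive names here are made up)
```bash
# hypothetical archive names; -x extract, -z gunzip, -f file, -C target dir
mkdir -p Takeout
tar -xzf takeout-001.tgz -C Takeout
tar -xzf takeout-002.tgz -C Takeout
```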
so i have all the data from google along with the json files
this is where it gets interesting
i have an example file called
2022-03-29--23-03-42.png
this is just a screenshot i took
file name contains the complete date, so that's something
but the file on my server has the date created as the day it was extracted
so it thinks it was made on 05 May
but if i check the accompanying .json file
that has the correct metadata
so to circle around, google have fucked me and not properly embedded the date into the file
or
they took the day i uploaded it as the creation date and wrote that into the json next to the file
so to fix this i just need to do a big exif scan job that'll read the file and json and add the missing metadata in the image from the json
i suppose the only problem i'll have now is once i complete this scan job and then do a CLI re-import, it's gonna skip the files because they already exist
so looks like i need to do a fresh import lol
And if you update the file it'll be a different hash and will result in duplicates
makes sense
I think there is a thread about this topic that you should read through
i assume it'll have the exiftool command handy haha
speaking of which - do you know which container has exiftool in it?
GitHub
gphotos import + albums migration · immich-app immich · Discussion ...
gphotos import + albums With this guide, you STILL have to add images that are in multiple albums manually (except for the first instance of that image, so hopefully this isn't too many) can get al...
The server and microservices both do
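(so something like this should work - the container name is a guess, check docker ps for yours)
```bash
# exec into the microservices container and confirm exiftool is there
docker exec -it immich_microservices exiftool -ver
```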
ahh that thread looks like just what i need
will give it a shot - thank you
No problem
Def do that before running all the ML stuff lol
what's my best bet for cleaning out uploaded images? i can just start over if that's easiest
Easiest is just start over honestly
will do, i'll document what works for me in the end lol
ah, think i found something that may work
GitHub
google-photos-takeout-scripts/README.md at main · m1rkwood/google-p...
Useful scripts to get out of Google Photos. Contribute to m1rkwood/google-photos-takeout-scripts development by creating an account on GitHub.
mount the path to the original images from the google takeout to the microservices container
then run this
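(the command in that repo is roughly along these lines - treat this as a sketch and check the linked README for the exact version)
```bash
# sketch: copy photoTakenTime.timestamp from each takeout sidecar .json into
# the matching media file's date tags (assumes a same-named .json sits next
# to every file); -d %s reads the value as a unix timestamp
exiftool -r -d %s -tagsfromfile "%d/%F.json" \
  "-DateTimeOriginal<PhotoTakenTimeTimestamp" \
  "-FileModifyDate<PhotoTakenTimeTimestamp" \
  --ext json -overwrite_original /path/to/Takeout
```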
i have 52k files to run this on lol
it'll be a while
Nice

alright, it's something
Progress
now to wait

OK I left it running overnight and the upload and import is done
It's definitely better than it was, but there's still about 3000ish files with the wrong dates on
Which lines up with what exiftool says
Strange thing is most of these are screenshots in .png format and they all have a json file with them
So I don't see why it failed to write the metadata
God damn this object detection is really good
So of my 52k files, about 3k are wonky
Mix of some with the date of last night and some say 1st Jan 1970 lol
Looks like it's primarily videos from WhatsApp that have the wrong date on them
But the date is in the file name - must be a way I can process these
Hmm maybe I'll run this again and try this
Take original images and run exiftool to get metadata from the json
Any that failed, move them to another folder
Then try them again and see if it works
If not, try another command to extract the date from the file name and then import
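(for the filename route, exiftool can usually parse a date straight out of the name when you copy Filename into the date tags - something like the sketch below, paths made up)
```bash
# sketch: exiftool pulls the digits out of names like 2022-03-29--23-03-42.png
# when Filename is copied into a date tag
exiftool "-DateTimeOriginal<Filename" "-FileModifyDate<Filename" \
  -overwrite_original /path/to/wonky-files
```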
@jrasm91 do you know if the following would work
On immich server, cd to the library folder that says 1970 and then fix the metadata on the images in there
Then run the extract metadata job again in the UI
This will probably change the file hash so I assume postgres will think that file is gone
That would technically work but the hash in the database would be wrong compared to the file on disk. Cleaner would be to delete all the files and re-upload them after they are fixed
All the files with the wrong dates that is
That makes sense
OK good news is immich has the wonky files either in the 1970 folder or over the last few days
So I can pull them from there and I can get all of the json files easily
I'll give them another run with exiftool to see if they work with the json this time
Failing that I'll have to try date extraction from the file name
That should then leave me with randomly named files which should be fixable with a json import
The only oddity is why some of them are failing
Maybe I'll need to ffmpeg the dodgy files
It's amazing how you can fix a weird video or image with ffmpeg lol
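(a plain stream-copy remux is often enough to rebuild sane container metadata, e.g.)
```bash
# re-wrap the streams without re-encoding; file names are placeholders
ffmpeg -i dodgy.mp4 -c copy remuxed.mp4
```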
Yeah lol
Hmm so I had some time last night to try this
I moved all of the 1970 image files to a new folder and then deleted them all from the UI
Then I tried exiftool again with the json - says it succeeded on them all so I ran the import and they're still 1970
All of these files are short MP4 videos from WhatsApp
I tried the same again except using the date from the file name - still the same problem, videos are all showing up as 1970
Must be something going wrong with the metadata being written or read
I did a quick fix on my files with date from 1970 in the assets table, ran a SQL script to replace the fileCreatedAt with the fileModifiedAt date.
This since the fileModifiedAt had the correct date in it from when the photos were taken.
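(I'd guess that looks roughly like the below, run against the immich database - the container, db and user names are assumptions, and take a backup first)
```bash
# assumed container/db/user names - adjust to your stack; back up the DB first
docker exec -it immich_postgres psql -U postgres -d immich -c \
  "UPDATE assets SET \"fileCreatedAt\" = \"fileModifiedAt\" WHERE \"fileCreatedAt\" < '1971-01-01';"
```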
ooh that could work
i'll give that a shot
@Akimbo i've tried reading through this ticket but having trouble working out what worked and what didn't. Mine also got stuck and wouldn't budge. I just ran and that got the count moving down again, however, I would like to make sure it keeps working without having to use commands. Do you have any recommendations of changes I can make to stop the command being needed?
this seems to be a repeating error/message in the logs
Yeah you have the same problem I did
In your env file you need to disable reverse geocoding
Check the immich docs for the example env file - that has the line in there to disable it
That was only half of my problem sadly
Thanks, I've disabled it and it seems to be still going now. A bit slow but that's ok
@Allram how do i run that in pgsql?
i'm in the sql container and i have the immich db selected
but i get this
Not in front of my PC now, but i ran it through cloudbeaver as a SQL script
gotcha, i was having a skill issue so i'll try again
i fucked the command and it changed the upload date on all of my assets to today
only thing that's odd is there are these 1970 files, and then a load with the upload date, that i just can't seem to get to use the creation date specified in the json
I hope you had a backup
Try to re-run the metadata job, then it should pick it up?
i did, managed to recover
it's annoying, i've just got images with the wrong date from 2036 and 17th May 2023 and 1970
if i can fix those, that should be it
hmm ok im getting somewhere
this command with exiftool seems to import metadata correctly for my files that are wonky
this is being run from inside the microservices container
immich is reading the wrong metadata tag for this file for example

it's using "create date" which is very wrong
no idea how it's managed to become 2036
that's the full exiftool output of the file for reference
the video is an odd one - it's a video of my parent's dog that my mum took on her iphone and sent via whatsapp
maybe whatsapp borked the timestamp somehow
OK, so to recap I've done the following
- Created Immich from the compose stack
- There are no images or videos uploaded
- I don't intend to upload anything into Immich
- My plan is to keep using Google Photos and just export and import into Immich when storage gets low
- Then I did a Google Photos export of everything
- I have extracted the entire export into 1 folder called "Takeout" which has about 50k assets
- I initially just imported all 50k assets into Immich, but about 30k of them had the wrong date on them
- So I ran a different version of the exiftool command which fixed about 27k of them
- This just leaves the problematic 3k files
- I know these are problematic because they have a date of 1970, 2036 or 17th May (export is from 5th May so definitely not possible)
- Now I'm running another version of the exiftool command to correct the date on these final few images
- Once this is done I'll delete the problematic assets from the web UI and then do a CLI import
Once all this is done, I should effectively have a 1-1 mirror of my Google Photos but in Immich
alright, something went wrong so i nuked my install and imported... again lol
gonna leave this to run overnight and see what my timeline looks like in the morning
ok progress i think - no more 2036 files and only two 1970 files this time
but still about 1000 photos from 5th may
gonna exiftool them and see if there's another value i can tweak
i really need to write some bash to find the json for the file now it's been moved
at least then i could just metadata rescan
skill issue
the journey continues...
had more of a google and found this
running now, it's still got some stuff in the error folder
guess there's not much i can do about that aside from set a fixed date for them
@jrasm91 apologies if this has already been asked, but are there plans to have Immich auto-read json files from a Google Takeout? Or maybe import them along with the image as a sidecar file (like the XML)
Not officially no. I need to do my own import and i may try to add support for it in some fashion
i imagine it'd help a lot of migrants like me hahaha
christ, i found my problem
the 1970 files and the files with the upload date as the file date
damn file names are too long so the file name and json don't match up
that's why the exif import is failing
lol this is just comical
woo i have managed to bash my way through this problem
just ran this on some test files and it worked great
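(something along these lines, anyway - a sketch rather than the exact script; the paths are placeholders, and the important bit is the prefix match, since google truncates long json names)
```bash
#!/usr/bin/env bash
# sketch only: pair each problem file with a takeout .json whose (possibly
# truncated) name is a prefix of the media filename, then copy the taken
# time into the file's date tags
shopt -s nullglob
for media in ./problem-files/*; do
  [ -f "$media" ] || continue
  base=$(basename "$media")
  for json in ./takeout-json/*.json; do
    stem=$(basename "$json" .json)
    # Google truncates long names, so accept a prefix match in either direction
    if [[ "$base" == "$stem"* || "$stem" == "$base"* ]]; then
      exiftool -d %s -tagsfromfile "$json" \
        "-DateTimeOriginal<PhotoTakenTimeTimestamp" \
        "-FileModifyDate<PhotoTakenTimeTimestamp" \
        -overwrite_original "$media"
      break
    fi
  done
done
```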
running it now on my 1000ish problematic files
awesome, this has cracked it
i still have about 86 files that have the wrong date
but they are very wonky
some are edited, so fair enough they'll be bad
the others appear to have corrupt metadata
i reckon i could run ffmpeg over them to fix them, then try the json import again
alright, got something to do that
lets see how many this can fix
alright, some processed and some failed which is expected
45 files left
i'm calling that a win
Lol nice
OK, about to test all of this, but this guide should work
Yep, seems to work fine
Give it a whirl if you're having similar issues
Small update to the script
Switched out "find" for "fd" as it's quicker
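(the equivalent calls are roughly the two below - note that fd skips hidden and gitignored files by default)
```bash
# the same json listing expressed both ways
find . -type f -name '*.json'
fd --type f --extension json
```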