C# · 17mo ago
Alex Frost

❔ Adding OCR text to PDF file

I have PDF files with scanned pages. I want to keep the source image at its original size, run OCR on it, and add the OCR text to the file. What I have for now: I use PdfSplitter to convert the pages to images, upscale them, OCR the images with Tesseract, and construct a new PDF file. Because of the upscaling, the resulting file is significantly larger, and if OCR has to be run multiple times because of undesired results, the file keeps growing.
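One possible direction, sketched under assumptions rather than tested against this pipeline: upscale only a temporary copy for OCR, keep the original image in the PDF, and map each recognized word box from the upscaled image back onto the page before writing the text layer. The snippet below shows just that coordinate mapping. `PixelBox`, `PdfRect`, and `OcrLayerMapping.ToPdfPoints` are hypothetical names made up for illustration; they are not part of PdfSplitter or Tesseract. It assumes the OCR engine reports boxes in pixels with a top-left origin and that the image fills the whole page.

```csharp
using System;

// Hypothetical helper types for illustration (C# 10+ record structs).
// PixelBox: a word box in OCR-image pixels, origin at the top-left corner.
public readonly record struct PixelBox(double X, double Y, double Width, double Height);

// PdfRect: the same box in PDF points, origin at the bottom-left corner
// (the default PDF user space).
public readonly record struct PdfRect(double X, double Y, double Width, double Height);

public static class OcrLayerMapping
{
    // Maps a box found on the (possibly upscaled) OCR image onto the page.
    // Assumes the scanned image covers the full page, so the scale is simply
    // page size divided by OCR image size. The upscale factor never has to
    // be known explicitly, and the original image is left untouched.
    public static PdfRect ToPdfPoints(
        PixelBox box,
        double ocrImageWidthPx,
        double ocrImageHeightPx,
        double pageWidthPt,
        double pageHeightPt)
    {
        if (ocrImageWidthPx <= 0 || ocrImageHeightPx <= 0)
            throw new ArgumentOutOfRangeException(
                nameof(ocrImageWidthPx), "OCR image dimensions must be positive.");

        double scaleX = pageWidthPt / ocrImageWidthPx;
        double scaleY = pageHeightPt / ocrImageHeightPx;

        // Flip the vertical axis: the bottom edge of the box in image space
        // becomes the Y coordinate of the rectangle in PDF space.
        double bottomPt = pageHeightPt - (box.Y + box.Height) * scaleY;

        return new PdfRect(box.X * scaleX, bottomPt, box.Width * scaleX, box.Height * scaleY);
    }
}
```

For example, with a 2000×3000 px OCR image on a 500×750 pt page, a box at (400, 200) sized 800×100 px maps to a rectangle at (100, 675) sized 200×25 pt. Since only text positions are derived from the upscaled copy, re-running OCR would replace the text layer instead of adding a larger image each time.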
1 Reply
Accord · 17mo ago
Looks like nothing has happened here. I will mark this as stale and this post will be archived until there is new activity.