Optical Character Recognition software recommendations?

Tesseract OCR

The original engine was developed at HP starting in the mid-1980s, and it has proven to be one of the best OCR programs I've used. It has recently undergone many updates to the engine and has become one of the most comprehensive OCR tools available. In my experience it outscores most other OCR tools, with text-match accuracy somewhere in the high 90s percent, and it can easily convert standard document typefaces to text.

The following is an example:

tesseract ScannedDocument.png out

This will produce a file called out.txt containing the recognized text.
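
If you have a whole folder of scans, the same command can be wrapped in a small loop. This is only a rough sketch: the function names ocr_base and ocr_all are made up for illustration, and it assumes the scans are PNG files and that tesseract is on your PATH.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; these names are not part of tesseract.

# Strip the file extension to get the output base name,
# e.g. "ScannedDocument.png" -> "ScannedDocument".
ocr_base() {
    printf '%s\n' "${1%.*}"
}

# Run tesseract on every PNG in a directory (default: current directory).
# tesseract appends ".txt" to the output base name itself.
ocr_all() {
    dir="${1:-.}"
    for img in "$dir"/*.png; do
        [ -e "$img" ] || continue   # skip the literal pattern when nothing matches
        tesseract "$img" "$(ocr_base "$img")"
    done
}
```

Calling ocr_all ~/scans (an example path) should leave one .txt file next to each .png in that folder.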


Another project that should be able to do this is gscan2pdf:

sudo apt-get install gscan2pdf

This project can also use Tesseract, as well as other open source OCR tools.


I don't know of any OCR software for Ubuntu, but for Windows there is one that has the features you need: ABBYY FineReader. It is not free, though.