Mailinglist Archive: opensuse (3280 mails)

Re: [opensuse] PDF OCR
  • From: Kai Ponte <kai@xxxxxxxxxxxxxxxx>
  • Date: Wed, 12 Dec 2007 13:46:27 -0800
  • Message-id: <200712121346.27472.kai@xxxxxxxxxxxxxxxx>
On Wednesday 12 December 2007 10:52, Ken Schneider wrote:
> Roger Oberholtzer pecked at the keyboard and wrote:
>> Hello
>>
>> We have a network printer that will scan docs and send them as pdf docs
>> to an e-mail address in the company. Is there any software with OpenSUSE
>> 10.3 that can do OCR from a PDF doc? I am guessing that the doc contains
>> tiff images of the scanned documents. Any and all pointers are welcome.
>
> Have you tried pdftotext ?


I will happily recommend Tesseract.

http://code.google.com/p/tesseract-ocr/

Here's a how-to on getting from PDF to text, though I haven't managed to
convert PDF to TIFF myself yet...

http://www.groklaw.net/articlebasic.php?story=20061210115516438
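
For the PDF to TIFF step, one option that might work is to let Ghostscript
render each page to a TIFF and then run Tesseract on every page. The sketch
below is only an illustration of that idea, not the method from the how-to
above. It assumes the "gs" and "tesseract" commands are installed and on the
PATH, and that the Tesseract build can read the TIFFs Ghostscript writes. The
function names, the "page-%03d.tif" naming and the 300 dpi default are my own
choices, not anything from the articles.

```python
import subprocess
from pathlib import Path


def ghostscript_command(pdf_path, out_dir, dpi=300):
    """Build a Ghostscript command that renders each PDF page to a grayscale TIFF."""
    # %03d is expanded by Ghostscript to the page number (page-001.tif, ...).
    pattern = str(Path(out_dir) / "page-%03d.tif")
    return [
        "gs", "-q", "-dNOPAUSE", "-dBATCH",
        "-sDEVICE=tiffgray",          # 8-bit grayscale TIFF output device
        f"-r{dpi}",                   # render resolution in dpi (assumed default)
        f"-sOutputFile={pattern}",
        str(pdf_path),
    ]


def tesseract_command(tiff_path):
    """Build a Tesseract command; Tesseract appends .txt to the output base."""
    tiff_path = Path(tiff_path)
    return ["tesseract", str(tiff_path), str(tiff_path.with_suffix(""))]


def ocr_pdf(pdf_path, out_dir):
    """Render the PDF to TIFF pages, OCR each page, and return the joined text."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    subprocess.run(ghostscript_command(pdf_path, out_dir), check=True)
    pages = []
    for tiff in sorted(out_dir.glob("page-*.tif")):
        subprocess.run(tesseract_command(tiff), check=True)
        pages.append(tiff.with_suffix(".txt").read_text(errors="replace"))
    return "\n".join(pages)
```

Something like ocr_pdf("scan.pdf", "ocr-out") would then leave the TIFFs and
the per-page .txt files in ocr-out and return the combined text. The
resolution probably needs tuning for the scans the printer produces.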

And a few more articles...

http://www.linuxjournal.com/article/9676

http://www.howtoforge.com/ocr_with_tesseract_on_ubuntu704

