Re: [opensuse] Re: multi-page continuous scanner anyone?
  • From: "David C. Rankin" <drankinatty@xxxxxxxxxxxxxxxxxx>
  • Date: Sat, 27 Jun 2009 22:32:03 -0500
  • Message-id: <200906272232.04265.drankinatty@xxxxxxxxxxxxxxxxxx>
On Saturday 20 June 2009 03:28:07 am Carlos E. R. wrote:
On Thursday, 2009-06-18 at 15:51 -0700, Randall R Schulz wrote:
On Thursday June 18 2009, Carlos E. R. wrote:
On Thursday, 2009-06-18 at 16:36 -0400, Greg Freemyer wrote:
Looks like it only goes up to 600x600 dpi optical, though.

For document archive 600x600 is overkill.

Typically 200x200 is used and 300x300 is used for high quality.
Assuming you're coming from normal paper docs.
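
To put rough numbers on those resolutions, here is a minimal sketch of the arithmetic. It assumes a US Letter page (8.5 x 11 in) scanned as 1-bit black and white, and it ignores compression and row padding. The page size, bit depth and the function name `scan_size` are illustrative assumptions, not something stated in this thread.

```python
def scan_size(width_in, height_in, dpi, bits_per_pixel=1):
    """Return (width_px, height_px, raw_bytes) for an uncompressed scan."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    # Raw size before any compression; row padding is ignored.
    raw_bytes = width_px * height_px * bits_per_pixel // 8
    return width_px, height_px, raw_bytes


if __name__ == "__main__":
    # Assumed page: US Letter, 8.5 x 11 inches, 1 bit per pixel.
    for dpi in (200, 300, 600):
        w, h, size = scan_size(8.5, 11, dpi)
        print(f"{dpi} dpi: {w} x {h} px, {size} bytes uncompressed")
```

Doubling the resolution quadruples the pixel count, so a 600 dpi page holds four times the raw data of a 300 dpi page and nine times that of a 200 dpi page.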

If I were scanning my magazine collections, with photos, I would use
600dpi minimum, so that I could print a page later as good as the

I agree, and 600 dpi won't get you a particularly faithful reproduction.
Phototypesetting equipment realizes 2400 DPI, typically.

600 dpi happens to be my printer resolution, so going further would be
pointless ;-)

Which makes me wonder if it could be possible to scan a page with
different resolutions for text and images, automatically.

Maybe in the future.

Or at least store it differently. Perhaps DjVu... but the available
open tools for creating DjVu files are far from optimal.

I'm a little curious what Google and ACM (to name only two) use to
digitize print collections. The results render well and, what's much
more impressive, are OCR-ed quite well, too. ACM's entire digital
library (most of which predates digital originals) is searchable even
when the original had to be scanned and OCR-ed.

Yep. Good OCR for me is almost impossible to achieve, but these big chaps
seem to have it solved.

DjVu format, by the way, can store B/W for text, color for photos, and
text for the OCR, all in the same file and for each page. In theory, at
least: with the open tools we have that's almost impossible to get. The
better tools are not open.

It is a very good format for scanned material, but it doesn't seem to
catch on :-?

Just to add to the OCR discussion, I have had good luck with tesseract. I use
it as part of our hylafax/avantfax fax server that automatically does OCR on
incoming faxes at our office....
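
For anyone who wants to try it, here is a minimal sketch of running the tesseract command-line tool on one scanned image from Python. It is not the hylafax/avantfax integration itself; the helper names `build_tesseract_command` and `ocr_image` and the file name in the usage note are hypothetical, and it assumes tesseract is installed and on the PATH with the English language data.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base, lang="eng"):
    """Return the argument list for the tesseract command-line tool."""
    # tesseract adds the ".txt" extension to the output base on its own.
    return ["tesseract", str(image_path), str(output_base), "-l", lang]


def ocr_image(image_path, lang="eng"):
    """Run tesseract on one image and return the recognized text."""
    image_path = Path(image_path)
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract is not installed or not on PATH")
    # Write the text next to the image: fax.tif -> fax.txt
    output_base = image_path.parent / image_path.stem
    command = build_tesseract_command(image_path, output_base, lang)
    subprocess.run(command, check=True)
    return Path(str(output_base) + ".txt").read_text(encoding="utf-8")
```

Calling `ocr_image("incoming_fax.tif")` (a made-up file name) would leave `incoming_fax.txt` beside the image and return its contents. How well it recognizes the text depends heavily on the scan quality.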

David C. Rankin, J.D.,P.E.
Rankin Law Firm, PLLC
510 Ochiltree Street
Nacogdoches, Texas 75961
Telephone: (936) 715-9333
Facsimile: (936) 715-9339