Re: [opensuse] Re: multi-page continuous scanner anyone?
  • From: "David C. Rankin" <drankinatty@xxxxxxxxxxxxxxxxxx>
  • Date: Mon, 29 Jun 2009 00:19:43 -0500
  • Message-id: <200906290019.43700.drankinatty@xxxxxxxxxxxxxxxxxx>
On Sunday 28 June 2009 09:26:16 am Adam Tauno Williams wrote:
On Sat, 2009-06-27 at 22:32 -0500, David C. Rankin wrote:
On Saturday 20 June 2009 03:28:07 am Carlos E. R. wrote:
On Thursday, 2009-06-18 at 15:51 -0700, Randall R Schulz wrote:
On Thursday June 18 2009, Carlos E. R. wrote:
On Thursday, 2009-06-18 at 16:36 -0400, Greg Freemyer wrote:
Looks like it only goes up to 600x600 dpi optical, though.

For document archiving, 600x600 is overkill.

Typically 200x200 is used and 300x300 is used for high quality.
Assuming you're coming from normal paper docs.

If I were scanning my magazine collections, with photos, I would use
600dpi minimum, so that I could print a page later as good as the

I agree, and 600 dpi won't get you a particularly faithful
reproduction. Phototypesetting equipment realizes 2400 DPI, typically.

600 dpi happens to be my printer resolution, so going further would be
pointless ;-)
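
To put the resolutions mentioned above (200, 300 and 600 dpi) in perspective, here is a rough sketch of the pixel dimensions and uncompressed size of one scanned page. The page size (US Letter, 8.5 x 11 in) and 24-bit color depth are assumptions for illustration, not figures from this thread; the function name `scan_size` is hypothetical.

```python
def scan_size(dpi, width_in=8.5, height_in=11.0, bits_per_pixel=24):
    """Return (width_px, height_px, uncompressed_bytes) for one scanned page.

    Assumed defaults: US Letter paper and 24-bit color.
    """
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    # Total bits divided by 8 gives the raw, uncompressed byte count.
    size_bytes = width_px * height_px * bits_per_pixel // 8
    return width_px, height_px, size_bytes


for dpi in (200, 300, 600):
    w, h, size = scan_size(dpi)
    print(f"{dpi} dpi: {w} x {h} px, about {size / 1e6:.1f} MB uncompressed")
```

Doubling the resolution quadruples the pixel count, so under these assumptions a 600 dpi page holds nine times the raw data of a 200 dpi page.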

Which makes me wonder if it could be possible to scan a page with
different resolutions for text and images, automatically.
Maybe in the future.
Or at least store it differently. Perhaps DjVu... but the available
open tools for creating djvu files are far from optimal.

I'm a little curious what Google and ACM (to name only two) use to
digitize print collections. The results render well and, what's much
more impressive, are OCR-ed quite well, too. ACM's entire digital
library (most of which predates digital originals) is searchable even
when the original had to be scanned and OCR-ed.

Yep. Good OCR for me is almost impossible to achieve, but these big
chaps seem to have it solved.
DjVu format, by the way, can store B/W for text, color for photos, and
text for the OCR, all in the same file and for each page. In theory, at
least: with the open tools we have that's almost impossible to get. The
better tools are not open.
It is a very good format for scanned material, but it doesn't seem to
catch on :-?

Just to add to the OCR discussion, I have had good luck with tesseract. I
use it as part of our hylafax/avantfax fax server that automatically does
OCR on incoming faxes at our office....
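
As a rough idea of what automatic OCR on a received fax image could look like, here is a minimal sketch that shells out to the tesseract command-line tool. It is not the Avantfax or Hylafax setup described above; the helper names `build_tesseract_command` and `ocr_image`, the default language, and the choice to place the text file next to the image are all assumptions made for illustration.

```python
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base, lang="eng"):
    """Build the argument list for: tesseract <image> <outputbase> -l <lang>.

    tesseract appends ".txt" to <outputbase> on its own.
    """
    return ["tesseract", str(image_path), str(output_base), "-l", lang]


def ocr_image(image_path, lang="eng"):
    """Run tesseract on one image and return the recognized text.

    Assumes the tesseract binary is on PATH and can read the image format.
    """
    image_path = Path(image_path)
    # Hypothetical layout: write the text next to the image, same base name.
    output_base = image_path.with_suffix("")
    subprocess.run(
        build_tesseract_command(image_path, output_base, lang), check=True
    )
    return Path(str(output_base) + ".txt").read_text(encoding="utf-8")
```

A call such as `ocr_image("fax.tif")` would then leave `fax.txt` beside the image and return its contents; a real fax server would call something like this from its receive hook.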

How about posting your Hylafax faxrcvd script so others can use it as a
template? Or a link if you used some site/howto for setting it up.


The package I used was Avantfax. I set up a page that is a short howto:

David C. Rankin, J.D.,P.E.
Rankin Law Firm, PLLC
510 Ochiltree Street
Nacogdoches, Texas 75961
Telephone: (936) 715-9333
Facsimile: (936) 715-9339