On Wednesday 20 November 2002 04:51 am, John Pettigrew wrote:
In a previous message, Carlos E. R. wrote:
And now, gocr in linux (I have only removed some empty lines to save space):
+++ (PICTURE) d_T j Uly, _' (lr Illy _l C_t-d Ilnl Vtr- sary c_nlumn, l urEURed comput-
[snip]
Hmmm. I've not tried OCR in Linux, but from my experience with OCR programs on other platforms (no, not Windows :-) that looks like it's caused by the input bitmap being wrong in some way. Does gocr require a specific bit depth, or a particular resolution/font size? If it was a greyscale image, was the contrast between the letters and the background high enough? If 1-bit, was there any background noise?
The thing I've found is that the bitmap you feed the OCR program needs to be as high quality as possible, and to match the resolution/font size the program expects. I never auto-OCR, because I often get better results by checking the bitmap before feeding it to the OCR engine, and it saves wasted time when there's something wrong.
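The pre-check and binarisation step described above can be sketched in a few lines of Python. This is only an illustrative sketch, not anything gocr itself does: the function name `binarize_pgm` and the threshold value of 128 are my own assumptions. It turns an ASCII greyscale PGM (P2) into the kind of clean 1-bit ASCII PBM (P1) an OCR engine prefers:

```python
# Minimal sketch: threshold a greyscale ASCII PGM (P2) into a 1-bit
# ASCII PBM (P1) before handing it to an OCR engine such as gocr.
# The function name and threshold value are illustrative assumptions.

def binarize_pgm(pgm_text: str, threshold: int = 128) -> str:
    """Convert an ASCII PGM (P2) image to an ASCII PBM (P1).

    Pixels darker than `threshold` become 1 (black), the rest 0
    (white), matching PBM's convention that 1 means black.
    """
    # Tokenise, dropping PGM '#' comments.
    tokens = [t for line in pgm_text.splitlines()
              for t in line.split('#', 1)[0].split()]
    if tokens[0] != 'P2':
        raise ValueError('expected an ASCII PGM (P2) image')
    width, height = int(tokens[1]), int(tokens[2])
    # tokens[3] is the maxval; the pixel values follow it.
    pixels = [int(t) for t in tokens[4:4 + width * height]]
    bits = ['1' if p < threshold else '0' for p in pixels]
    rows = [' '.join(bits[r * width:(r + 1) * width])
            for r in range(height)]
    return 'P1\n{} {}\n{}\n'.format(width, height, '\n'.join(rows))
```

One could then write the P1 output to a file and inspect it by eye before running it through the OCR engine, which is exactly the manual check being advocated above.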
John
--------------------------

Another good point: when using OCR, scan in binary (1-bit) mode, not greyscale. I have used OCR in Kooka with very good results, scanning in binary mode; the higher the resolution, the better the character recognition seems to be. I know xsane uses gocr, and I did some scans there last evening. The page looked good, although I am not experienced with xsane or gocr, so I'm not sure I was even doing OCR scans at the time. Kooka's OCR has worked well for me, with good results. It may also be using gocr, but I couldn't find anything that indicated that.

Patrick
--- KMail v1.4.3 --- SuSE Linux Pro v8.1 --- Registered Linux User #225206
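For what it's worth, the contrast question raised earlier in the thread can be turned into a quick automated check before committing to a binary scan. A minimal sketch in Python; the function name `contrast_ok` and the 0.5 spread threshold are my own assumptions, not values taken from gocr or any scanner tool:

```python
# Rough pre-OCR contrast check, along the lines asked about above:
# was the contrast between the letters and the background high enough?
# The 0.5 spread threshold is an illustrative assumption.

def contrast_ok(pixels, maxval=255, min_spread=0.5):
    """Return True if the spread between the darkest and lightest
    pixel values covers at least `min_spread` of the full range."""
    spread = (max(pixels) - min(pixels)) / maxval
    return spread >= min_spread
```

If this check fails on the greyscale scan, rescanning with better lighting or adjusting the scanner's contrast setting is likely to help more than fiddling with the OCR engine afterwards.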