[Bug 481286] New: Tesseract OCR engine from BS::home:/StefanBruens:/OCR has empty ./tessdata files
https://bugzilla.novell.com/show_bug.cgi?id=481286

           Summary: Tesseract OCR engine from BS::home:/StefanBruens:/OCR has
                    empty ./tessdata files
    Classification: openSUSE
           Product: openSUSE 10.3
           Version: Final
          Platform: i586
        OS/Version: openSUSE 10.3
            Status: NEW
          Severity: Normal
          Priority: P5 - None
         Component: Other
        AssignedTo: bnc-team-screening@forge.provo.novell.com
        ReportedBy: drankinatty@suddenlinkmail.com
         QAContact: qa@suse.de
          Found By: ---

User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.0.6) Gecko/2009012700 SUSE/3.0.6-0.1 Firefox/3.0.6 FirePHP/0.2.4

Stefan,

Thanks for building tesseract 2.03 for 10.3. However, there is a small problem with the /usr/share/tessdata files provided with the rpm. They are all empty:

-rw-r--r-- 1 root root 0 2008-05-03 20:05 deu.DangAmbigs
-rw-r--r-- 1 root root 0 2008-05-03 20:05 deu.freq-dawg
-rw-r--r-- 1 root root 0 2008-05-03 20:05 deu.inttemp
-rw-r--r-- 1 root root 0 2008-05-03 20:05 deu.normproto
-rw-r--r-- 1 root root 0 2008-05-03 20:05 deu.pffmtable
-rw-r--r-- 1 root root 0 2008-05-03 20:05 deu.unicharset
-rw-r--r-- 1 root root 0 2008-05-03 20:05 deu.user-words
-rw-r--r-- 1 root root 0 2008-05-03 20:05 deu.word-dawg
<snip>

I don't know why. I simply downloaded the source files and copied the English modules over, and everything else works fine:

-rw-r--r-- 1 root root    392 2009-02-14 00:42 eng.DangAmbigs
-rw-r--r-- 1 root root    672 2009-02-14 00:42 eng.freq-dawg
-rw-r--r-- 1 root root 862544 2009-02-14 00:42 eng.inttemp
-rw-r--r-- 1 root root  39862 2009-02-14 00:42 eng.normproto
-rw-r--r-- 1 root root    590 2009-02-14 00:42 eng.pffmtable
-rw-r--r-- 1 root root    480 2009-02-14 00:42 eng.unicharset
-rw-r--r-- 1 root root   7289 2009-02-14 00:42 eng.user-words
-rw-r--r-- 1 root root 809728 2009-02-14 00:42 eng.word-dawg

The source files I used were from:

http://tesseract-ocr.googlecode.com/files/tesseract-2.00.eng.tar.gz

I bet you just need to place the /tessdata language files in the /tesseract directory before creating the source file to build the rpm, and that should take care of it.

Thanks again.

Reproducible: Always

Steps to Reproduce:
1. Install http://download.opensuse.org/repositories/home:/StefanBruens:/OCR/openSUSE_1...
2. Run tesseract
3.

Actual Results:
Error: unable to load eng.* language files

Expected Results:
Perform OCR on the supplied TIFF

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
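For anyone who wants to confirm the symptom described above before reinstalling the language data, the check can be sketched as a short script. This is only a minimal sketch, not part of the package: the helper name find_empty_files is hypothetical, and the directory /usr/share/tessdata is taken from the report and may differ on other installations.

```python
import os

# Path taken from the bug report; adjust if tessdata lives elsewhere.
TESSDATA_DIR = "/usr/share/tessdata"


def find_empty_files(directory):
    """Return the sorted names of regular files in `directory` that are 0 bytes.

    Hypothetical helper for illustration; it does not descend into
    subdirectories.
    """
    empty = []
    for entry in os.scandir(directory):
        if entry.is_file() and entry.stat().st_size == 0:
            empty.append(entry.name)
    return sorted(empty)


if __name__ == "__main__":
    if os.path.isdir(TESSDATA_DIR):
        for name in find_empty_files(TESSDATA_DIR):
            print("empty:", name)
    else:
        print("directory not found:", TESSDATA_DIR)
```

Any file name it prints is a zero-byte language file of the kind shown in the listing above.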
https://bugzilla.novell.com/show_bug.cgi?id=481286

User chrubis@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=481286#c1

Cyril Hrubis <chrubis@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|bnc-team-screening@forge.pr |lurch@gmx.li
                   |ovo.novell.com              |

--- Comment #1 from Cyril Hrubis <chrubis@novell.com> 2009-03-17 08:04:25 MST ---
Okay tesseract is not maintained by suse, so moving to Stefan Bruens bugzilla account.