On 2014-12-01 11:14, Vojtěch Zeisek wrote:
Hi, is there any working tool that can add a text layer to a scanned PDF? I tried YAGF (a front end for cuneiform and/or tesseract), but it seems to have only an option to save the text as a separate TXT file. Cuneiform doesn't have this possibility either, and I wasn't able to get tesseract to work (the OCRmyPDF script kept complaining about a missing tesseract even though it was installed). Scantailor seems to lack this functionality. Ocrad wouldn't start (and produced no error message), and gocr can't work with PDF... An old demo version of Vuescan that I have requires libgtk-X11, which is unavailable.
You could set up a virtualized guest with an older openSUSE that has the required libraries.
And it is not the cheapest software... Tragedy. Any other suggestions? ;-)
If you ask for ideas... ;-)

Personally, I consider PDF a very bad format for scanned documents; I prefer DjVu, which is designed for that very purpose. It is, however, not popular.

There is open-source software to create the files, and a text layer can be added, although I've never tried it. The available open-source software is, let's say, fully functional but clumsy. There is proprietary software that is, they claim, much easier to use. However, the open-source tools can be easily scripted... Some sample packages:

djvusmooth - Graphical Text Editor for DjVu
pdf2djvu - PDF to DjVu Converter
djvu2pdf - Converting Djvu Files to PDF Files
djvulibre-doc - Documentation for DjVu (djvulibre)
djvulibre-djview4 - Portable DjVu Qt4 Based Viewer and Browser Plugin
djvutxt - Extract the hidden text from DjVu documents.
djvused - Multi-purpose DjVu document editor.
djvulibre - An Open Source Implementation of DjVu

DjVu is a Web-centric format and software platform for distributing documents and images. DjVuLibre is an open source (GPL) implementation of DjVu, including viewers, browser plug-ins, decoders, simple encoders, and utilities.

DjVu can advantageously replace PDF, PS, TIFF, JPEG, and GIF for distributing scanned documents, digital documents, or high-resolution pictures. DjVu content downloads faster, displays and renders faster, looks nicer on a screen, and consumes less client resources than competing formats. DjVu images display instantly and can be smoothly zoomed and panned with no lengthy rerendering. DjVu is used by hundreds of academic, commercial, governmental, and noncommercial Web sites around the world.

DjVuDocument

DjVuDocument is a compression technique specifically designed for color digital document images containing both pictures and text, such as a page of a magazine. DjVuDocument represents images into separately compressed layers. The foreground layer is usually compressed with DjVu Bitonal and contains the text and drawings.
The background layer is usually compressed with DjVuPhoto and contains the background texture and the pictures at lower resolution.

-- 
Cheers / Saludos,

Carlos E. R.
(from 13.1 x86_64 "Bottle" at Telcontar)
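
P.S. A minimal sketch of the kind of scripting mentioned above, in Python, assuming pdf2djvu and djvutxt are installed and on PATH. The function names (missing_tools, pdf2djvu_command, djvutxt_command, convert_and_extract) are made up for illustration. Note that, as far as I know, pdf2djvu does not run OCR itself: it only carries over text that the PDF already contains, so for a pure scan the extracted text file will probably be empty until a text layer is added by some other tool. The PATH check at the start is also a quick way to see whether a tool that "is installed" is really visible to scripts.

```python
import shutil
import subprocess
import sys
from pathlib import Path


def missing_tools(names):
    """Return the tools from `names` that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


def pdf2djvu_command(pdf_path, djvu_path):
    """Build the pdf2djvu call; -o names the output DjVu file."""
    return ["pdf2djvu", "-o", str(djvu_path), str(pdf_path)]


def djvutxt_command(djvu_path, txt_path):
    """Build the djvutxt call: input DjVu file, then output text file."""
    return ["djvutxt", str(djvu_path), str(txt_path)]


def convert_and_extract(pdf_path):
    """Convert one PDF to DjVu, then dump whatever hidden text it has."""
    pdf_path = Path(pdf_path)
    djvu_path = pdf_path.with_suffix(".djvu")
    txt_path = pdf_path.with_suffix(".txt")

    # Fail early with a clear message instead of a cryptic error later.
    missing = missing_tools(["pdf2djvu", "djvutxt"])
    if missing:
        raise RuntimeError("not found on PATH: " + ", ".join(missing))

    # check=True raises CalledProcessError if a tool exits with an error.
    subprocess.run(pdf2djvu_command(pdf_path, djvu_path), check=True)
    subprocess.run(djvutxt_command(djvu_path, txt_path), check=True)
    return djvu_path, txt_path


if __name__ == "__main__":
    # Each command-line argument is treated as one PDF file to convert.
    for name in sys.argv[1:]:
        print(convert_and_extract(name))
```

Saved as, say, pdf_to_djvu.py (a hypothetical name), it would be called as "python3 pdf_to_djvu.py scan.pdf" and should leave scan.djvu and scan.txt next to the input file.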