On 2014-08-28 05:39, Anton Aylward wrote:
On 08/27/2014 11:09 PM, John Andersen wrote:
But not all documents have text inside. Some are just images of text, especially if you scanned them into PDFs.
What you're saying is that the PDF is not text but an image; it just happens to be an image of text. We can read it just as we can read the text in a photograph of text. But it's not text in the sense of a string of ASCII.
So yes, I can use the 'select' tool and, instead of saving as text, save as a bitmap. And pasting THAT into vim gives me garbage.
But pasting into LibreOffice (and some other powerful editors) would paste the image. And pasting into editors that cannot handle images simply fails. A helpful editor might pop up a message saying "no, you cannot paste images". I don't know what vim does, though.
Which gets back to an interesting question.
If the PDF is an image anyway, then how can the KDE3 tool read it as text? Is there some image-to-text conversion going on? What if it's in a strange 'artistic' font that humans have no problems with...?
No, it means that the PDF is text. Some PDFs may display a text like: «Hello world, I'm here!» and when you select and paste it you get instead: «world, Hello I'm here!» That happens because words can be positioned one by one in a PDF, and some files do exactly that. When copying and pasting, the result may be disastrous. Some PDF viewers cope better than others in this situation.
Another issue is strange fonts, yes. A PDF file can include its own font definitions. Pasting such text copies the underlying ASCII (UTF?) character codes, not the glyphs the PDF displays for them. That would be a dirty trick indeed.
-- 
Cheers / Saludos,
Carlos E. R. (from 13.1 x86_64 "Bottle" at Telcontar)
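To make the two failure modes above concrete, here is a minimal sketch in Python. It is a toy model, not how any real PDF viewer works: the PlacedWord type, the coordinates, and the glyph_for_code table are all made up for illustration. The first part shows why copying words in stored order can scramble a sentence; the second shows how an embedded font can display one thing while the copied codes say another.

```python
from typing import NamedTuple


class PlacedWord(NamedTuple):
    """Hypothetical model of a word drawn at an explicit page position."""
    text: str
    x: float  # horizontal position, increasing to the right
    y: float  # vertical position, increasing upward


# Order in which the (made-up) file happens to draw the words.
stored_order = [
    PlacedWord("world,", 60.0, 700.0),
    PlacedWord("Hello", 10.0, 700.0),
    PlacedWord("I'm", 110.0, 700.0),
    PlacedWord("here!", 140.0, 700.0),
]


def naive_copy(words):
    """Join words in the order the file stores them."""
    return " ".join(w.text for w in words)


def positional_copy(words):
    """Join words top to bottom, then left to right."""
    ordered = sorted(words, key=lambda w: (-w.y, w.x))
    return " ".join(w.text for w in ordered)


# Made-up embedded font: the stored code "A" is drawn as the glyph "H", etc.
glyph_for_code = {"A": "H", "B": "i"}
stored_codes = "AB"
displayed = "".join(glyph_for_code[c] for c in stored_codes)
copied = stored_codes  # copying yields the codes, not the glyphs

print(naive_copy(stored_order))       # world, Hello I'm here!
print(positional_copy(stored_order))  # Hello world, I'm here!
print(displayed, "vs", copied)        # Hi vs AB
```

Sorting by position is only one possible heuristic; real pages with columns, rotated text, or slightly uneven baselines need more care, which may be why viewers differ so much in how well they cope.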