Mailinglist Archive: opensuse (4344 mails)

Re: PDF to TXT (ascii) ... helppppppp
  • From: Masaru Nomiya <nomiyac360@xxxxxxxxxxxxxx>
  • Date: Mon, 29 Aug 2005 11:19:19 +0900
  • Message-id: <87y86lfpso.wl%nomiyac360@xxxxxxxxxxxxxx>

>>>>> In the Message: [suse-linux-e ML: No.245666]
>>>>> with the date of Sun, 28 Aug 2005 13:27:42 -0400
>>>>> [Muara] == Maura Edelweiss Monville <memonvil@xxxxxxxxxxxxxxxx> has written:

Muara> -enc <string> : output text encoding name

Muara> WHat are the possible <string> choices ????

<string> is the name of a character set (charset).
For Japanese, for example, there are three charsets: EUC-JP, Shift_JIS, and ISO-2022-JP.

Muara> What would be a sensible choice to get just a plain readable English
Muara> text ?

I downloaded an English PDF file and just executed

# pdftotext profile.pdf

then I got a plain text file.
I also did

# pdftotext -enc UTF-8 profile.pdf

This gave me the same result.
What's the matter, I wonder?
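
One likely reason the two runs match, assuming this pdftotext writes Latin1 when -enc is not given (the documented xpdf default; other builds may differ): plain English text uses only ASCII characters, and those are stored as the same bytes in Latin1 and in UTF-8. The sketch below illustrates this with iconv instead of pdftotext. The temporary file names are made up for the example.

```shell
# Sketch: pure-ASCII text is byte-for-byte identical in Latin1 and UTF-8.
# All file names below are hypothetical and live in a throwaway directory.
tmp=$(mktemp -d)

# A line of plain English text (ASCII characters only).
printf 'Plain English text\n' > "$tmp/ascii.txt"

# Read the bytes as Latin1 (ISO-8859-1) and re-encode them as UTF-8.
iconv -f ISO-8859-1 -t UTF-8 "$tmp/ascii.txt" > "$tmp/ascii.utf8.txt"

# cmp -s is silent and only reports through its exit status.
if cmp -s "$tmp/ascii.txt" "$tmp/ascii.utf8.txt"; then
    ascii_result=identical
else
    ascii_result=different
fi
echo "ASCII only: $ascii_result"

# A single non-ASCII character changes the picture:
# \351 is the octal escape for byte 0xE9, e-acute in Latin1.
printf 'caf\351\n' > "$tmp/latin1.txt"
iconv -f ISO-8859-1 -t UTF-8 "$tmp/latin1.txt" > "$tmp/latin1.utf8.txt"

if cmp -s "$tmp/latin1.txt" "$tmp/latin1.utf8.txt"; then
    accented_result=identical
else
    accented_result=different
fi
echo "With e-acute: $accented_result"

rm -rf "$tmp"
```

If the PDF contains accented letters, typographic quotes, or other non-ASCII characters, the choice of -enc would be expected to change the output bytes, so comparing the two text files with cmp is a quick way to check.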

Could you show the output of the following commands?

# pdfinfo foo.pdf


# pdffonts foo.pdf


Masaru Nomiya mail-to: nomiyac360@xxxxxxxxxxxxxx

"No Windows, no gains!" ... "Why, I am wrong?"

-- Bill --
