Mailinglist Archive: opensuse (4633 mails)

Re: [opensuse] doc to txt conversion
  • From: Leendert Meyer <leen.meyer@xxxxxxx>
  • Date: Sat, 23 Dec 2006 21:43:06 +0100
  • Message-id: <200612232143.07106.leen.meyer@xxxxxxx>
On Saturday 23 December 2006 21:22, Felix Miata wrote:
> On 2006/12/23 21:13 (GMT+0100) Leendert Meyer apparently typed:
> > On Saturday 23 December 2006 21:00, Felix Miata wrote:
> >> Which app do we have to strip the glop from a M$ .doc file and
> >> output just the content to a plain text file?
> >
> > Did you try opening the .doc with OO, and "Save As", choosing
> > "Text" or "Text Encoded"?
>
> From a 790k .doc file text gets me a 34 byte file and text encoded
> gets me a 62 byte file.

Whoa! A compression ratio of well over 10k:1. But I'll assume you
take that as a loss ratio and declare it a failure.

How about an indirect conversion, like to .html? It would not be that
difficult to strip the tags. Maybe only special formatting (tables,
lists) would need some care... I hope there's no absolute
positioning.
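
For what it's worth, the tag-stripping step could look something like the rough sketch below. It assumes the .doc has already been saved as HTML, and it uses only Python's standard html.parser module. The names TextExtractor and html_to_text are made up for illustration, and tables are simply flattened to one line per row, so anything fancier would need more care.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text content, dropping tags, scripts, and styles.

    Hypothetical helper for illustration; not part of any existing tool.
    """

    # Tags that should start or end a line in the plain-text output.
    BLOCK_TAGS = {"p", "br", "div", "li", "ul", "ol", "tr", "table",
                  "h1", "h2", "h3", "h4", "h5", "h6"}
    # Table cells are separated by a space so their text does not run together.
    CELL_TAGS = {"td", "th"}
    # Content of these tags is not document text and is skipped entirely.
    SKIP_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self.skip_depth += 1
        elif tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1
        elif tag in self.BLOCK_TAGS:
            self.parts.append("\n")
        elif tag in self.CELL_TAGS:
            self.parts.append(" ")

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.parts.append(data)


def html_to_text(html):
    """Return the visible text of an HTML string, one block per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    raw = "".join(parser.parts)
    # Collapse runs of whitespace inside each line and drop empty lines.
    lines = (" ".join(line.split()) for line in raw.splitlines())
    return "\n".join(line for line in lines if line)


if __name__ == "__main__":
    sample = "<h1>Title</h1><p>Hello <b>world</b> &amp; all</p>"
    print(html_to_text(sample))
```

Absolute positioning would defeat this, since the order of elements in the HTML source need not match the reading order on the page.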

BTW, what do you mean by 'glop'? Something akin to 'goo'?

Cheers,

Leen
