Which app do we have to strip the glop from a M$ .doc file and output just the content to a plain text file? OO print to file doesn't seem to understand anything but postscript. Do I need to "install" a "text printer"? TIA -- "Let your conversation be always full of grace." Colossians 4:6 NIV Team OS/2 ** Reg. Linux User #211409 Felix Miata *** http://mrmazda.no-ip.com/ -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org For additional commands, e-mail: opensuse+help@opensuse.org
On Saturday 23 December 2006 21:00, Felix Miata wrote:
Which app do we have to strip the glop from a M$ .doc file and output just the content to a plain text file?
Did you try opening the .doc with OO, and "Save As", choosing "Text" or "Text Encoded"? Cheers, Leen
On 2006/12/23 21:13 (GMT+0100) Leendert Meyer apparently typed:
On Saturday 23 December 2006 21:00, Felix Miata wrote:
Which app do we have to strip the glop from a M$ .doc file and output just the content to a plain text file?
Did you try opening the .doc with OO, and "Save As", choosing "Text" or "Text Encoded"?
From a 790k .doc file text gets me a 34 byte file and text encoded gets me a 62 byte file. -- "Let your conversation be always full of grace." Colossians 4:6 NIV
Team OS/2 ** Reg. Linux User #211409 Felix Miata *** http://mrmazda.no-ip.com/
On Saturday 23 December 2006 21:22, Felix Miata wrote:
On 2006/12/23 21:13 (GMT+0100) Leendert Meyer apparently typed:
On Saturday 23 December 2006 21:00, Felix Miata wrote:
Which app do we have to strip the glop from a M$ .doc file and output just the content to a plain text file?
Did you try opening the .doc with OO, and "Save As", choosing "Text" or "Text Encoded"?
From a 790k .doc file text gets me a 34 byte file and text encoded gets me a 62 byte file.
Whoa! A compression ratio of roughly 10k:1. But I'll assume you take that as a loss ratio and declare it a failure. How about an indirect conversion, like to .html? It would not be that difficult to strip the tags. Maybe only special formatting (tables, lists) would need some care... I hope there's no absolute positioning. BTW, what do you mean by 'glop'? Something akin to 'goo'? Cheers, Leen
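For reference, the tag stripping suggested here could be sketched in shell roughly as follows. This is only a sketch: the function name strip_tags is made up for illustration, it works line by line (so a tag split across two lines survives), and it decodes only a handful of entities. It is unlikely to cope well with heavily styled word-processor HTML.

```shell
# strip_tags (hypothetical helper): read HTML on stdin, write rough
# plain text on stdout. Tags are removed line by line, then a few common
# entities are decoded. "&amp;" is decoded last so that "&amp;lt;" ends
# up as the literal text "&lt;" rather than "<".
strip_tags() {
    sed -e 's/<[^>]*>//g' \
        -e 's/&nbsp;/ /g' \
        -e 's/&lt;/</g' \
        -e 's/&gt;/>/g' \
        -e 's/&amp;/\&/g'
}

# Usage (file names are placeholders):
#   strip_tags < exported.html > exported.txt
```

A dedicated HTML-to-text converter would handle tables, lists and multi-line tags better than this.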
On 2006/12/23 21:43 (GMT+0100) Leendert Meyer apparently typed:
On Saturday 23 December 2006 21:22, Felix Miata wrote:
On 2006/12/23 21:13 (GMT+0100) Leendert Meyer apparently typed:
On Saturday 23 December 2006 21:00, Felix Miata wrote:
Which app do we have to strip the glop from a M$ .doc file and output just the content to a plain text file?
Did you try opening the .doc with OO, and "Save As", choosing "Text" or "Text Encoded"?
From a 790k .doc file text gets me a 34 byte file and text encoded gets me a 62 byte file.
Whoa! A compression ratio of roughly 10k:1. But I'll assume you take that as a loss ratio and declare it a failure.
How about an indirect conversion, like to .html? It would not be that difficult to strip the tags.
Looks plenty difficult to me. The file size increased by about 40%. Both SeaMonkey and Konq fail to display anything legible when opening the result from disk. If I try to open the result in OO after closing it, it paints the first page, then nothing else other than pegging the CPU.
BTW, what do you mean by 'glop'? Something akin to 'goo'?
Everything except the content, roughly 80% of the .doc file, 90%+ of the html file. -- "Let your conversation be always full of grace." Colossians 4:6 NIV Team OS/2 ** Reg. Linux User #211409 Felix Miata *** http://mrmazda.no-ip.com/
On Saturday 23 December 2006 22:00, Felix Miata wrote:
How about an indirect conversion, like to .html? It would not be that difficult to strip the tags.
Looks plenty difficult to me. The file size increased by about 40%. Both SeaMonkey and Konq fail to display anything legible when opening the result from disk. If I try to open the result in OO after closing it, it paints the first page, then nothing else other than pegging the CPU.
Ouch! µ-zoft with µ-compatibility? :-( Last attempt: can you copy & paste? Does at least _that_ work? :-} Cheers, Leen
On Saturday 23 December 2006 12:00, Felix Miata wrote:
Which app do we have to strip the glop from a M$ .doc file and output just the content to a plain text file? OO print to file doesn't seem to understand anything but postscript. Do I need to "install" a "text printer"?
TIA -- "Let your conversation be always full of grace." Colossians 4:6 NIV
Team OS/2 ** Reg. Linux User #211409
Felix Miata *** http://mrmazda.no-ip.com/
dos2unix Mike
On Saturday 23 December 2006 13:04, Mike Noble wrote:
On Saturday 23 December 2006 12:00, Felix Miata wrote:
Which app do we have to strip the glop from a M$ .doc file and output just the content to a plain text file? OO print to file doesn't seem to understand anything but postscript. Do I need to "install" a "text printer"?
TIA -- "Let your conversation be always full of grace." Colossians 4:6 NIV
Team OS/2 ** Reg. Linux User #211409
Felix Miata *** http://mrmazda.no-ip.com/
dos2unix
Mike
Ignore my message, I replied too soon without really reading fully :-)
Mike
Antiword seems to work well.
The Saturday 2006-12-23 at 15:00 -0500, Felix Miata wrote:
Which app do we have to strip the glop from a M$ .doc file and output just the content to a plain text file? OO print to file doesn't seem to understand anything but postscript. Do I need to "install" a "text printer"?
Summary : library to import Microsoft Word documents
Description : The wv2 library is used to import Microsoft Word documents in koffice for example.

Summary : Word 8 Converter for Unix
Description : WV is a program that can understand the Microsoft Word 8 binary file format (Office97). It currently converts Word into HTML, which can then be read with a web browser.

(and there are html to text converters, I think).

-- Cheers, Carlos E. R.
On 2006/12/23 22:18 (GMT+0100) Carlos E. R. apparently typed:
The Saturday 2006-12-23 at 15:00 -0500, Felix Miata wrote:
Which app do we have to strip the glop from a M$ .doc file and output just the content to a plain text file? OO print to file doesn't seem to understand anything but postscript. Do I need to "install" a "text printer"?
Summary : library to import Microsoft Word documents Description : The wv2 library is used to import Microsoft Word documents in koffice for example.
Summary : Word 8 Converter for Unix Description : WV is a program that can understand the Microsoft Word 8 binary file format (Office97). It currently converts Word into HTML, which can then be read with a web browser.
(and there are html to text converters, I think).
I installed wv and wv2 with YaST, but they haven't shown up in the menus, and wv from konsole gives command not found, even though rpm claims they're installed. I can't find wv in /bin, /sbin, /usr/bin or /usr/sbin. :-( -- "Let your conversation be always full of grace." Colossians 4:6 NIV Team OS/2 ** Reg. Linux User #211409 Felix Miata *** http://mrmazda.no-ip.com/
On Sunday 24 December 2006 00:03, Felix Miata wrote:
I installed wv and wv2 with YaST, but they haven't shown up in the menus, and wv from konsole gives command not found, even though rpm claims they're installed. I can't find wv in /bin, /sbin, /usr/bin or /usr/sbin. :-(
I browsed the wv rpm with mc, there are a bunch of wv* in /usr/bin, and an explanation in /usr/share/doc/packages/wv/README. In short: you're looking for /usr/bin/wvWare. ;-) As for the wv2 rpm, there is a .so file. I guess it is used by wv, but I'm not sure. Maybe http://wvware.sf.net/ has a clue... Going there now. Cheers, Leen
On Sunday 24 December 2006 00:13, Leendert Meyer wrote:
On Sunday 24 December 2006 00:03, Felix Miata wrote:
I installed wv and wv2 with YaST, but they haven't shown up in the menus, and wv from konsole gives command not found, even though rpm claims they're installed. I can't find wv in /bin, /sbin, /usr/bin or /usr/sbin. :-(
I browsed the wv rpm with mc, there are a bunch of wv* in /usr/bin, and an explanation in /usr/share/doc/packages/wv/README.
In short: you're looking for /usr/bin/wvWare. ;-)
As for the wv2 rpm, there is a .so file. I guess it is used by wv, but I'm not sure. Maybe http://wvware.sf.net/ has a clue... Going there now.
Arg. "rpm -qi -p wv2-*.rpm" says it: a library to import .docs in KOffice. Cheers, Leen
On 2006/12/24 00:13 (GMT+0100) Leendert Meyer apparently typed:
On Sunday 24 December 2006 00:03, Felix Miata wrote:
I installed wv and wv2 with YaST, but they haven't shown up in the menus, and wv from konsole gives command not found, even though rpm claims they're installed. I can't find wv in /bin, /sbin, /usr/bin or /usr/sbin. :-(
I browsed the wv rpm with mc, there are a bunch of wv* in /usr/bin, and an explanation in /usr/share/doc/packages/wv/README.
In short: you're looking for /usr/bin/wvWare. ;-)
Seems to be useless. Word8/97 is a virtually 10 year old file format. Whether I use wvHtml or wvText all I get for output is a 0 byte file, with no error messages from wvHtml, and the message "Could not convert to HTML" from wvText. :-( However, that README points to http://wvware.sourceforge.net/ which in turn recommends using abiword instead. Abiword shows up in the menu, and creates HTML that SeaMonkey can open, and usable plain text. :-) Abiword has a much longer list of file formats it can import and export than OO. So, why is OO installed by default instead of Abiword? -- "Let your conversation be always full of grace." Colossians 4:6 NIV Team OS/2 ** Reg. Linux User #211409 Felix Miata *** http://mrmazda.no-ip.com/
The Saturday 2006-12-23 at 19:43 -0500, Felix Miata wrote:
In short: you're looking for /usr/bin/wvWare. ;-)
Seems to be useless. Word8/97 is a virtually 10 year old file format. Whether I use wvHtml or wvText all I get for output is a 0 byte file, with no error messages from wvHtml, and the message "Could not convert to HTML" from wvText. :-(
Some of the wv* things are scripts. I have used "wvText" on some recent .doc files and it works fine. It is possible that your file is too complex or somehow incompatible.
However, that README points to http://wvware.sourceforge.net/ which in turn recommends using abiword instead. Abiword shows up in the menu, and creates HTML that SeaMonkey can open, and usable plain text. :-)
Abiword has a much longer list of file formats it can import and export than OO. So, why is OO installed by default instead of Abiword?
Is that a serious question? :-O -- Cheers, Carlos E. R.
On 23 Dec 2006, robin.listas@telefonica.net wrote:
Some of the wv* things are scripts. I have used "wvText" on some recent .doc files and it works fine. It is possible that your file is too complex or somehow incompatible.
The stand-alone scripts are not maintained anymore, in favour of using Abiword as a command-line convertor. Charles -- panic("Attempted to kill the idle task!"); linux-2.2.16/kernel/exit.c
On Sat, 2006-12-23 at 18:03 -0500, Felix Miata wrote:
On 2006/12/23 22:18 (GMT+0100) Carlos E. R. apparently typed:
The Saturday 2006-12-23 at 15:00 -0500, Felix Miata wrote:
Which app do we have to strip the glop from a M$ .doc file and output just the content to a plain text file? OO print to file doesn't seem to understand anything but postscript. Do I need to "install" a "text printer"?
Summary : library to import Microsoft Word documents Description : The wv2 library is used to import Microsoft Word documents in koffice for example.
Summary : Word 8 Converter for Unix Description : WV is a program that can understand the Microsoft Word 8 binary file format (Office97). It currently converts Word into HTML, which can then be read with a web browser.
(and there are html to text converters, I think).
I installed wv and wv2 with YaST, but they haven't shown up in the menus, and wv from konsole gives command not found, even though rpm claims they're installed. I can't find wv in /bin, /sbin, /usr/bin or /usr/sbin. :-(
whereis wv
On 2006/12/23 18:24 (GMT-0500) Michael S. Dunsavage apparently typed:
On Sat, 2006-12-23 at 18:03 -0500, Felix Miata wrote:
I installed wv and wv2 with YaST, but they haven't shown up in the menus, and wv from konsole gives command not found, even though rpm claims they're installed. I can't find wv in /bin, /sbin, /usr/bin or /usr/sbin. :-(
whereis wv
Nothing useful there, just /usr/share/wv with a bunch of xml files. -- "Let your conversation be always full of grace." Colossians 4:6 NIV Team OS/2 ** Reg. Linux User #211409 Felix Miata *** http://mrmazda.no-ip.com/
On Saturday 23 December 2006 18:09, Felix Miata wrote:
On 2006/12/23 18:24 (GMT-0500) Michael S. Dunsavage apparently typed:
On Sat, 2006-12-23 at 18:03 -0500, Felix Miata wrote:
I installed wv and wv2 with YaST, but they haven't shown up in the menus, and wv from konsole gives command not found, even though rpm claims they're installed. I can't find wv in /bin, /sbin, /usr/bin or /usr/sbin. :-(
whereis wv
Nothing useful there, just /usr/share/wv with a bunch of xml files.
I dunno. I suppose it's a bit unintuitive to try rpm -ql to list the contents of a package, but it gives me the following:

scott@inigo:~>rpm -ql wv
/usr/bin/wvAbw
/usr/bin/wvCleanLatex
/usr/bin/wvConvert
/usr/bin/wvDVI
/usr/bin/wvDocBook
/usr/bin/wvHtml
/usr/bin/wvLatex
/usr/bin/wvMime
/usr/bin/wvPDF
/usr/bin/wvPS
/usr/bin/wvRTF
/usr/bin/wvSummary
/usr/bin/wvText
/usr/bin/wvVersion
/usr/bin/wvWare
/usr/bin/wvWml
[...]
/usr/share/doc/packages/wv/README
[...]
/usr/share/man/man1/wvAbw.1.gz
/usr/share/man/man1/wvCleanLatex.1.gz
/usr/share/man/man1/wvDVI.1.gz
/usr/share/man/man1/wvHtml.1.gz
/usr/share/man/man1/wvLatex.1.gz
/usr/share/man/man1/wvMime.1.gz
/usr/share/man/man1/wvPDF.1.gz
/usr/share/man/man1/wvPS.1.gz
/usr/share/man/man1/wvRTF.1.gz
/usr/share/man/man1/wvSummary.1.gz
/usr/share/man/man1/wvText.1.gz
/usr/share/man/man1/wvVersion.1.gz
/usr/share/man/man1/wvWare.1.gz
/usr/share/man/man1/wvWml.1.gz
[...]
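To narrow an rpm -ql listing down to the commands a package actually installs, the output can be filtered for bin and sbin directories. A small sketch; the function names only_bin_paths and package_commands are invented for this example.

```shell
# only_bin_paths (hypothetical helper): keep only paths whose final
# directory component is "bin" or "sbin", i.e. likely executables.
only_bin_paths() {
    grep -E '/s?bin/[^/]+$'
}

# package_commands (hypothetical helper): list the executables that an
# installed rpm package provides, e.g. "package_commands wv".
package_commands() {
    rpm -ql "$1" | only_bin_paths
}
```

On the listing above this would leave just the /usr/bin/wv* entries and drop the README and man pages.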
On Saturday 23 December 2006 16:09, Felix Miata wrote:
On 2006/12/23 18:24 (GMT-0500) Michael S. Dunsavage apparently typed:
On Sat, 2006-12-23 at 18:03 -0500, Felix Miata wrote:
I installed wv and wv2 with YaST, but they haven't shown up in the menus, and wv from konsole gives command not found, even though rpm claims they're installed. I can't find wv in /bin, /sbin, /usr/bin or /usr/sbin. :-(
whereis wv
Nothing useful there, just /usr/share/wv with a bunch of xml files.
Then it's installed. Keep "apropos" or "man -k" in your repertoire:

% apropos wv
wvdialconf (1) - build a configuration file for wvdial (1)
wvWare (1) - convert msword documents
wvAbw (1) - convert msword documents to Abiword's format
wvDVI (1) - convert msword documents to DVI
wvHtml (1) - convert msword documents to HTML4.0
wvLatex (1) - convert msword documents to LaTeX
wvCleanLatex (1) - convert msword documents to LaTeX
wvPDF (1) - convert msword documents to PDF
wvPS (1) - convert msword documents to PS
wvRTF (1) - convert msword documents to RTF
wvText (1) - convert msword documents to text
wvWml (1) - convert msword documents to WML
nwvolinfo (1) - Diplay info on NetWare Volumes
wvdial (1) - PPP dialer with built-in intelligence.
wvMime (1) - view MSWord documents
wvSummary (1) - view word document's summary info
wvVersion (1) - view word document's version #
wvline (3ncurses) - create curses borders, horizontal and vertical lines
mvwvline (3ncurses) - create curses borders, horizontal and vertical lines
mvwvline_set (3ncurses) - create curses borders or lines using complex characters and renditions
wvline_set (3ncurses) - create curses borders or lines using complex characters and renditions
wvdial.conf (5) - wvdial configuration file

Obviously some of these are irrelevant, but I'm too lazy to edit them out right now. Judging from the synopses, "wvWare" is the main command to use:

% man wvWare
...
DESCRIPTION
wvWare converts word documents into other formats such as PS,PDF,HTML,LaTeX,DVI,ABW

Randall Schulz
Felix Miata wrote:
Which app do we have to strip the glop from a M$ .doc file and output just the content to a plain text file? OO print to file doesn't seem to understand anything but postscript. Do I need to "install" a "text printer"?
TIA
Why not just save it as a text (.txt) file???
On Saturday 23 December 2006 13:21, James Knott wrote:
Felix Miata wrote:
Which app do we have to strip the glop from a M$ .doc file and output just the content to a plain text file? OO print to file doesn't seem to understand anything but postscript. Do I need to "install" a "text printer"?
TIA
Why not just save it as a text (.txt) file???
I used to use a Linux utility called 'antiword' to read MSWord .doc files and produce plain text that could be indexed with a utility called glimpse. As I recall it did a pretty good job of preserving the formatting--as much as is possible with ascii text--and it certainly removed the 'glop'. Jim Cunning
On 23 Dec 2006, mrmazda@ij.net wrote:
Which app do we have to strip the glop from a M$ .doc file and output just the content to a plain text file? OO print to file doesn't seem to understand anything but postscript. Do I need to "install" a "text printer"?
(1) antiword: http://www.winfield.demon.nl/
(2) catdoc: http://www.45.free.net/~vitus/software/catdoc/
(3) Abiword can be used as a commandline convertor (libwv)
(4) If you use Emacs: http://www.emacswiki.org/cgi-bin/wiki/UnDoc

Charles -- panic("aha1740.c"); /* Goodbye */ linux-2.2.16/drivers/scsi/aha1740.c
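Since which of these tools is installed varies from system to system, a script could first check which one is available. A small sketch, with the function name first_available invented for illustration:

```shell
# first_available (hypothetical helper): print the first of the given
# command names that exists on this system; return 1 if none is found.
first_available() {
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            return 0
        fi
    done
    return 1
}

# Usage:
#   first_available antiword catdoc abiword
```

Both antiword and catdoc write the extracted text to standard output, so with either one a conversion is typically just "antiword file.doc > file.txt" (file names here are placeholders).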
participants (11)
- Carlos E. R.
- Charles philip Chan
- Felix Miata
- Francesco Scaglioni
- James Knott
- Jim Cunning
- Leendert Meyer
- Michael S. Dunsavage
- Mike Noble
- Randall R Schulz
- Scott Jones