[opensuse-doc] Wiki to PDF? Using Wiki as a doc source?
I've been poking around in the info on the LfL (FOSDEM presentation, mailing list archives etc), and have a question... or two...
From what I've gathered, doc input is all done in DocBook XML (or flat text that is converted to DocBook XML). Output formats for LfL are PDF, HTML and Wiki.
Is there any thought or (future) plan to use MediaWiki as an input source? Has anyone successfully managed to go from MediaWiki source to XML and then on (via Apache FO) to PDF? (the XML to PDF is relatively easy... it's the Wiki to XML that's looking ugly to me). C. --------------------------------------------------------------------- To unsubscribe, e-mail: opensuse-doc+unsubscribe@opensuse.org For additional commands, e-mail: opensuse-doc+help@opensuse.org
On Wed, 25 Apr 2007, Clayton wrote:
Is there any thought or (future) plan to use MediaWiki as an input source?
That'd be pretty ugly, IMHO. We have thought of it, and it would be wonderful if there were a way that didn't involve messy tweaking of the "so-called" XML you'd get if you ran some basic conversion scripts over the Wiki files. XML to Wiki, as you said, is pretty straightforward, but going from Wiki to XML you'd get only the most basic markup, if any, and would have to do lots and lots of manual tweaking to the Wiki-originated XML before it would blend in with the other generic doc.
Has anyone successfully managed to go from MediaWiki source to XML and then on (via Apache FO) to PDF? (the XML to PDF is relatively easy... it's the Wiki to XML that's looking ugly to me).
Yep. Ugly is the right word for it, I am afraid. The XML to PDF being easy depends on whether your layouts and stylesheets are supported by FOP :( -- Jana Jaeger jjaeger@suse.de SUSE LINUX Products GmbH Documentation Maxfeldstr. 5 +49 (0) 911 74053-0 D-90409 Nuernberg http://www.novell.com/linux SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)
On Wed, 25 Apr 2007, Clayton wrote:
I've been poking around in the info on the LfL (FOSDEM presentation, mailing list archives etc), and have a question... or two...
From what I've gathered, doc input is all done in DocBook XML (or flat text that is converted to DocBook XML). Output formats for LfL are PDF, HTML and Wiki.
Right. Other formats would be possible by adding the necessary stylesheets, which means that if DocBook supports another format, it will be relatively easy to convert to that format.
Is there any thought or (future) plan to use MediaWiki as an input source?
Has anyone successfully managed to go from MediaWiki source to XML and then on (via Apache FO) to PDF? (the XML to PDF is relatively easy... it's the Wiki to XML that's looking ugly to me).
You are absolutely right. This is one of the weak points of DocBook. There are many export formats, but for input it relies on filters provided by the respective other format. One aspect of this conversion is how much information exists in the original format, that may be used in docbook. MediaWiki (and all other wikis I know of) has a very simple format that does not care for syntax but only for layout. You could make wild guesses about how to convert wiki to XML, but you will never be able to convert it in a way that complies with the editing standards of DocBook. If you plan to migrate the source of a text from MediaWiki to XML, you might want to have a look at http://tools.wikimedia.de/~magnus/wiki2xml/w2x.php (use the option "raw wikitext" when entering MediaWiki input). But this is still no option for us to automate such a process, and thus we will have to stick with DocBook. Berthold -- ------------------------------------------------------------------ Berthold Gunreben SUSE Linux GmbH -- Dokumentation mailto:bg@suse.de Maxfeldstr. 5 http://www.suse.de/ D-90409 Nuernberg, Germany SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)
Interesting to hear your take on it :-) It is a real challenge to make it work. In a previous life I managed (with an incredible amount of help from a Java wizard) to get fairly successful conversions from JSPWiki to PDF using what the developer appropriately called the "world's most evil awk script" and some stylesheets, with the result then pushed through RenderX. The outcome was a reasonably good PDF to which a final manual publishing step could be applied (adding corp identity, watermarks etc). This wasn't done in the open source world though, which is where I am now.
One aspect of this conversion is how much information exists in the original format, that may be used in docbook.
Well, with what I'm working on (in OpenOffice.org) it will all be maintained in Wiki format, with a "need" (?) to produce some kind of snapshot... HTML, PDF... something. http://wiki.services.openoffice.org/wiki/Documentation The need part is still under discussion... there is some reluctance to leave behind traditional publishing methods by all of us in documentation :-)
If you plan to migrate the source of a text from MediaWiki to XML, you might want to have a look at
I've come across this in my search. I am not 100% sure how happy I am with it just yet. From what I read, the original author has effectively abandoned it and there are a few issues with it that need cleaning up... bug fixing etc. It's still in the list. I will keep hunting... :-) Maybe something usable will come out of this that can be shared here too. C.
On Wed, 25 Apr 2007, Clayton wrote:
One aspect of this conversion is how much information exists in the original format, that may be used in docbook.
Well, with what I'm working on (in OpenOffice.org) it will all be maintained in Wiki format, with a "need" (?) to produce some kind of snapshot... HTML, PDF... something.
Wiki to HTML is easy. Just look at the wiki: most of what it does is exactly this conversion. Doing a reasonably good PDF is something very different. Other things like profiling seem to be impossible in wiki syntax altogether. Some time ago I also searched for a possibility to convert HTML or Wiki to PDF. This can be done somehow (just press print on your browser....) but having XML source with the possibility to use FO stylesheets gives a very different result. Berthold -- ------------------------------------------------------------------ Berthold Gunreben SUSE Linux GmbH -- Dokumentation mailto:bg@suse.de Maxfeldstr. 5 http://www.suse.de/ D-90409 Nuernberg, Germany SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)
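For readers unfamiliar with the profiling mentioned above: DocBook sources can tag elements with attributes such as os, and the build keeps only the elements that match the chosen target. The following is a minimal Python sketch of that idea, not the toolchain actually used for LfL. The function name profile_tree and the sample document are hypothetical, and values are assumed to be separated by semicolons.

```python
import xml.etree.ElementTree as ET


def profile_tree(root, attribute, wanted):
    """Remove, in place, every element whose profiling attribute excludes `wanted`.

    Elements without the attribute are treated as unconditional and kept.
    Note: ElementTree drops the tail text of a removed element, which is
    acceptable for this sketch but not for production use.
    """
    for parent in list(root.iter()):
        for child in list(parent):
            value = child.get(attribute)
            if value is not None and wanted not in value.split(";"):
                parent.remove(child)
    return root


# Hypothetical sample: two paragraphs are conditional, one is common.
SAMPLE = (
    "<section>"
    '<para os="suse">Use YaST.</para>'
    '<para os="debian;ubuntu">Use apt.</para>'
    "<para>Common text.</para>"
    "</section>"
)

if __name__ == "__main__":
    doc = profile_tree(ET.fromstring(SAMPLE), "os", "suse")
    print(ET.tostring(doc, encoding="unicode"))
```

Plain wiki markup has no standard place to carry such an attribute, which is one way to read the remark that profiling seems impossible there.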
Berthold Gunreben wrote:
Some time ago I also searched for a possibility to convert HTML or Wiki to PDF. This can be done somehow (just press print on your
Speaking about PDF is not good practice here. Any printable thing can make a very good PDF (as a file format). The second part of the subject is much better. jdd -- http://www.dodin.net Cécile, beautician in Montpellier (home visits) http://gourmandises.orangeblog.fr/
Berthold Gunreben wrote:
will have to stick with DocBook.
Possibly, as a start, one can use the HTML (print) layout of MediaWiki (see "print version" on the left of the wiki page) and some HTML-to-DocBook tool; however, this is very minimal. You can also see the numerous discussions on this subject on the TLDP (linuxdoc) site. There, anybody can submit a text-only (or, by the way, any open source format) HOWTO and somebody else can make a DocBook source. This is important, because it can happen that very important (expert) work can be found only this way (an expert on any part of Linux may not be a DocBook writer), but this needs volunteers to do the translation :-(( jdd -- http://www.dodin.net Cécile, beautician in Montpellier (home visits) http://gourmandises.orangeblog.fr/
On Wed, 25 Apr 2007, jdd wrote:
Berthold Gunreben wrote:
will have to stick with DocBook.
there anybody can submit a text only (or, by the way, any opensource format) HOWTO and somebody else can make a docbook source.
This is important, because it can happen that very important (expert) work can be found only this way (an expert on any part of Linux may not be a DocBook writer)
but this needs volunteers to do the translation :-((
If I remember correctly, we also offered this on this list some time ago. At least I have done that once in a while, and it is not too hard to do. In reality this is not only a matter of placing XML tags around text; often you also have to restructure sections, add intros, and create some graphics. Sometimes it even means correcting errors ... However, if you know of such a case, don't hesitate to ask on this list. I believe that we will find a solution. Berthold -- ------------------------------------------------------------------ Berthold Gunreben SUSE Linux GmbH -- Dokumentation mailto:bg@suse.de Maxfeldstr. 5 http://www.suse.de/ D-90409 Nuernberg, Germany SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)
Hello, on Wednesday, 25 April 2007, Berthold Gunreben wrote:
One aspect of this conversion is how much information exists in the original format, that may be used in docbook. MediaWiki (and all other wikis I know of) have a very simple format that does not care for syntax but only for layout. You could do wild guesses about how to convert wiki to xml, but you will never be able to convert it in a way that complies with editing standards of DocBook.
Just a quick idea: what about adding "semantic" templates to the wiki? I'm thinking about {{menu|File}} and similar... This way the wiki source could have all the information needed for conversion to docbook. (Enforcing the use of these semantic templates is another story of course ;-) Regards, Christian Boltz -- Microsoft's Director .Net Strategy & Developer Group: "Switching to Linux initially involves higher costs." - CTO SuSE Linux AG: "That's as if I were addicted to heroin and said that rehab is too much effort for me. So I'd rather not bother." http://www.computerwoche.de/index.cfm?pageid=254&artid=41859&type=detail
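The {{menu|File}} idea can be illustrated with a small Python sketch that maps such templates onto DocBook inline elements. This is only an assumption about how a converter might look: only the menu template comes from the thread, the other template names (key, file, command) and the function name are hypothetical, and nested templates or templates with several parameters are not handled.

```python
import re
from xml.sax.saxutils import escape

# Hypothetical mapping from wiki template names to DocBook inline elements.
# Only {{menu|...}} is mentioned in the thread; the rest are illustrative.
TEMPLATE_TO_DOCBOOK = {
    "menu": "guimenu",
    "key": "keycap",
    "file": "filename",
    "command": "command",
}

# Matches simple one-parameter templates such as {{menu|File}}.
TEMPLATE_RE = re.compile(r"\{\{(\w+)\|([^{}|]*)\}\}")


def convert_semantic_templates(wikitext):
    """Replace known {{name|value}} templates with DocBook inline elements.

    All other text is XML-escaped; unknown templates are kept as plain text.
    """
    parts = []
    pos = 0
    for match in TEMPLATE_RE.finditer(wikitext):
        parts.append(escape(wikitext[pos:match.start()]))
        name = match.group(1).lower()
        value = match.group(2).strip()
        element = TEMPLATE_TO_DOCBOOK.get(name)
        if element is None:
            parts.append(escape(match.group(0)))
        else:
            parts.append(f"<{element}>{escape(value)}</{element}>")
        pos = match.end()
    parts.append(escape(wikitext[pos:]))
    return "".join(parts)


if __name__ == "__main__":
    print(convert_semantic_templates("Open {{menu|File}} and press {{key|Enter}}."))
```

The sketch covers only inline markup. Structural questions, such as which sections a page belongs to, would still need conventions of their own.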
Christian Boltz <opensuse@cboltz.de> writes:
Just a quick idea: what about adding "semantic" templates to the wiki?
I'd rather vote to enhance the wiki to make it accept DocBook XML as an input format. OTOH, in technical books you often do not need all this semantic markup. The mediawiki file format is rich enough for computer related books.
I'm thinking about {{menu|File}} and similar...
I believe people do not like curly braces. Curly braces are one of the reasons why LaTeX "failed". BTW, we are trying to solve a problem that does not exist. Look at it from another point of view. If you want to write a book, use a markup system suitable for books (XML, Texinfo, LaTeX, Frame (mif), etc.). If you want to write wiki-like online articles with fast turnaround times and want others to contribute to them, use a simple markup system such as basic HTML or the "standard" mediawiki format. Later, if it turns out you need some of the online articles as sections of a book, convert them to XML--manually or with the help of some editor macros. -- Karl Eichwalder R&D / Documentation SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nuernberg)
I'd rather vote to enhance the wiki to make it accept DocBook XML as an input format.
While this would make life easier for generating output formats, it doesn't really do much for the end user who is simply looking for a quick and dirty way to contribute to an open source project... does it? The entry threshold for editing XML isn't unreachably high, but it is inconvenient, and enough to stop a whole lot of people from adding their knowledge to the documentation. I see it all the time with the stuff I work on... and it's really not a lot different here in L4L (they have to add a repository, check out the docs from svn, edit them locally in some xml editor, validate and check them all back in) from OpenOffice (they have to create a child workspace, check out the documentation, edit the xml in some xml editor, validate and check it all back in again). Those who work with it all the time don't think twice about it.... a new user sees the "Quick Start" process... and doesn't bother. How much community input is there in the official openSUSE docs... or in the L4L project... or in OpenOffice? There is some in each case, but not much. The advantage of Wiki format for input is that the syntax is simple - or if you have Javascript, you don't even need to know Wiki syntax... just click the buttons above the edit window. It's quick, and you can edit on the fly (ok, potential for content abuse, but that's a risk I'm willing to manage). Maybe I'm missing the point though :-) It's been known to happen more than once with me.
The mediawiki file format is rich enough for computer related books.
I would agree - for the most part, all the testing I've done with porting an existing book to Wiki format has gone quite well... getting it back out has been the issue (and why I started this thread). The end users (from developers who need to write content to end users who want to add little extra bits) are all quite interested in seeing the book migrate from its difficult-to-manage existing process onto the Wiki. The only other thing I've been struggling with is the total lack of linear navigation in Wikis. I think I've found one solution that seems to work nicely. I've hacked up a description of it here: http://wiki.services.openoffice.org/wiki/Template:NavigationTemplate It allows for simple guided navigation (in this case, in a box on the RH side of every page) using block and none to show and hide sections. I'm still working on various ideas for extracting Wiki to XML. C.
Clayton wrote:
doesn't really do much for the end user who is simply looking for a quick and dirty way to contribute to an open source project...
I bet this discussion is quite pointless. There _is_ an entry point for user input: it's the wiki. Whether it is worthwhile to translate the wiki to DocBook (manually) is the whole story. To me it's obvious that the problem is _not_ the text/MediaWiki syntax, but the overall wiki layout. And this is a very old problem; I can remember discussions in the hyperlink days, at the very beginning of the (Apple?) application's release, 20 years (?) ago. A wiki is a hyperlinked layout, as are HTML and the web. This is fundamentally different from a book. It's easy to translate a book to HTML (but you may know many people prefer PDF, that is, keeping the book layout), but the other way round is not really possible, whatever index you add to the book. So if you find wiki resources on the subject you want the book to cover, translating them is a matter of days. You can even create a portal on this subject, call for submissions and go on. I even thought initially that L4L was such a project. jdd -- http://www.dodin.net Cécile, beautician in Montpellier (home visits) http://gourmandises.orangeblog.fr/
Hello, Karl, please forget all your docbook knowledge for a few minutes ;-) and then read this mail. on Friday, 27 April 2007, Karl Eichwalder wrote:
Christian Boltz <opensuse@cboltz.de> writes:
Just a quick idea: what about adding "semantic" templates to the wiki?
I'd rather vote to enhance the wiki to make it accept DocBook XML as an input format.
I agree with Clayton here - many people are not familiar with XML and prefer the wiki syntax.
OTOH, in technical books you often do not need all this semantic markup. The mediawiki file format is rich enough for computer related books.
So why isn't LfL maintained in a mediawiki? ;-)) (I guess because it needs some things the mediawiki format doesn't offer, correct?)
I'm thinking about {{menu|File}} and similar...
I believe people do not like curly braces. Curly braces are one of the reasons why LaTeX "failed".
I don't think so. The reason why more people use OpenOffice is WYSIWYG. And because you can "design" your documents easier (well, usually this shouldn't be done at all, but people like to do it ;-)
BTW, we are trying to solve a problem that does not exist. Look at it from another point of view. [...] Later, if it turns out you need some of the online articles as sections of a book, convert them to XML--manually or with the help of some editor macros.
Exactly this is the point - with some wiki templates, we could have fully _automatic_ conversion to XML. This would also mean easier editing (no SVN knowledge needed, and it's nearly impossible to have syntax errors in mediawiki ;-) We could even maintain LfL in a wiki which would mean new authors wouldn't need to learn docbook. Maybe this would also attract more authors to write for LfL... Just compare the numbers - how many wiki editors does openSUSE have? And how many LfL editors? Regards, Christian Boltz --
Does anyone know what minimum requirements suse 9.0 needs? Someone who can handle it... [> Marcel Stein and Roman Langolf in suse-linux]
On Sunday 29 April 2007 09:14, Christian Boltz wrote:
We could even maintain LfL in a wiki which would mean new authors wouldn't need to learn docbook. Maybe this would also attract more authors to write for LfL...
It would be beneficial if it were lfl.opensuse.org. The en.opensuse.org wiki is overloaded with translation project support.
Just compare the numbers - how many wiki editors does openSUSE have? And how many LfL editors?
Not counting professionals, for sure more than 1 ;-) -- Regards, Rajko.
Christian Boltz wrote:
Exactly this is the point - with some wiki templates, we could have fully _automatic_ conversion to XML.
This is nearly impossible. Wiki and DocBook are simply not compatible. It's even difficult to write a plain DocBook editor (I don't know of any really good one, Emacs excepted). DocBook requires some rules the wiki cannot even think of :-) (such as using hierarchical titles; on MediaWiki you can mix any title level at will). What can be done, quite easily, is to include wiki pages in a DocBook document "as is", that is, inside a <pre></pre> tag, leaving it to a human editor to insert the proper syntax there. Of course, a "wiki" DocBook editor would be a must-have (not MediaWiki, by the way): an online DocBook editor with syntax checking... but I don't even dream of it :-( jdd -- http://www.dodin.net Cécile, beautician in Montpellier (home visits) http://gourmandises.orangeblog.fr/
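Both points in this message can be sketched in a few lines of Python: detecting wiki headings that skip a level (which nested DocBook sections cannot express directly), and wrapping a wiki page verbatim for later manual markup. This is a rough illustration under stated assumptions, not an existing tool. The function names are hypothetical, and DocBook's literallayout element is assumed here as the counterpart of the <pre> tag mentioned above.

```python
import re
from xml.sax.saxutils import escape

# MediaWiki headings look like "== Title ==" (level 2) up to six "=" signs.
HEADING_RE = re.compile(r"^(={1,6})\s*(.+?)\s*\1\s*$")


def find_heading_jumps(wikitext):
    """Return (line_number, title, previous_level, level) for headings that skip a level."""
    problems = []
    previous = None
    for lineno, line in enumerate(wikitext.splitlines(), start=1):
        match = HEADING_RE.match(line)
        if not match:
            continue
        level = len(match.group(1))
        # Going deeper by more than one level has no clean nested-section equivalent.
        if previous is not None and level > previous + 1:
            problems.append((lineno, match.group(2), previous, level))
        previous = level
    return problems


def wrap_wikitext_as_literal(title, wikitext):
    """Embed raw wiki text in a DocBook section, escaped, for later manual markup."""
    return (
        "<section>\n"
        f"  <title>{escape(title)}</title>\n"
        f"  <literallayout>{escape(wikitext)}</literallayout>\n"
        "</section>\n"
    )


if __name__ == "__main__":
    page = "== Setup ==\nSome text\n==== Details ====\n=== Notes ===\n"
    print(find_heading_jumps(page))
    print(wrap_wikitext_as_literal("Setup", page))
```

A check like the first function could flag pages that need restructuring before any conversion is attempted; the second leaves all semantic markup to a human editor, as suggested in the message.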
On Apr 29, 2007 04:14 PM, Christian Boltz <opensuse@cboltz.de> wrote:
I don't think so. The reason why more people use OpenOffice is WYSIWYG. And because you can "design" your documents easier (well, usually this shouldn't be done at all, but people like to do it ;-)
Maybe. It isn't only about writing, it is also about reading. I probably do not want to read a book done using WYSIWYG technologies ;)
Exactly this is the point - with some wiki templates, we could have fully _automatic_ conversion to XML.
This would also mean easier editing (no SVN knowledge needed, and it's nearly impossible to have syntax errors in mediawiki ;-)
We could even maintain LfL in a wiki which would mean new authors wouldn't need to learn docbook. Maybe this would also attract more authors to write for LfL...
Just compare the numbers - how many wiki editors does openSUSE have? And how many LfL editors?
Less is more. The goal is not to write as much text as possible. The goal--if I got it right--is to write a book. You surely do not want to maintain a book (200 pages and more) in a wiki. It would either end up as a rather bad book or sooner or later become a nightmare to maintain. I can already see you writing "bots" to accomplish simple search-and-replace operations.
participants (7)
- Berthold Gunreben
- Christian Boltz
- Clayton
- Jana Jaeger
- jdd
- Karl Eichwalder
- Rajko M.