doctransformer: DocBook XML transformation for printing


Table of Contents

Intro
News
Technology
Status
License
Downloads
Help and Contributions
FAQs
Related links
DocBook XML editors
DocBook Howtos
Misc DocBook Site
Alternatives to doctransformer
Integrated Tool Chain Distributions (SGML/XML)
Alternatives to DocBook
A. doctransformer status tag by tag
B. docbook figure test

doctransformer is a Java-based tool for transforming documents written in DocBook XML into high-level (batch mode) typesetting languages like Lout and LaTeX. It thus lets you convert your DocBook XML documents into printable PostScript and/or PDF documents. doctransformer can be used as part of a DocBook tool chain as a replacement for the openJade/jadeTeX or the DocBook-XSLT/XSLT-processor/FO-engine combination.

Internally, doctransformer uses an XML Catalog enabled SAX parser for XML processing, augmented with look-ahead and multi-ContentHandler abilities. DocBook XML 4.2 is transformed directly into the output language; there is no intermediate FO transformation. XSLT is not used either.
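To give a feel for the direct SAX-to-output approach, here is a minimal sketch in Java. It is not doctransformer's actual code: the class name DocBookToLoutSketch, the method transform, and the tag-to-Lout mapping (para to @PP, emphasis to @I) are assumptions made for illustration only. It also leaves out XML Catalog resolution, look-ahead, and the use of several ContentHandlers.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical illustration only; these are not doctransformer's classes. */
public class DocBookToLoutSketch extends DefaultHandler {
    // Collects the Lout-flavoured output as SAX events arrive.
    private final StringBuilder out = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // Assumed mapping: <para> starts a Lout paragraph, <emphasis> opens italics.
        switch (qName) {
            case "para":     out.append("@PP\n"); break;
            case "emphasis": out.append("@I { "); break;
            default:         break; // other tags are ignored in this sketch
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        switch (qName) {
            case "para":     out.append("\n"); break;
            case "emphasis": out.append(" }"); break;
            default:         break;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Text is copied as-is; a real back end would have to quote
        // characters that are special in Lout (for example braces and '@').
        out.append(ch, start, length);
    }

    /** Parses a DocBook fragment and returns the generated text. */
    public static String transform(String docbookXml) throws Exception {
        DocBookToLoutSketch handler = new DocBookToLoutSketch();
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(docbookXml)), handler);
        return handler.out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.print(transform("<para>Hello <emphasis>Lout</emphasis> world.</para>"));
    }
}
```

Because the handler writes output while the parser streams through the document, no intermediate FO tree is built. A second back end (LaTeX, say) would in principle only need a different mapping in startElement and endElement.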

At the moment, only a Lout back end is present. Writing a LaTeX back end should be easy. This is beta software. Only a small subset of DocBook tags is supported.

Supported features:

You may find the following license a little strange at first. But its purpose is to

That said, here it is:

doctransformer is published and distributed under the terms of the QT Non Commercial license version 1.0. A copy of this license is distributed with all source code and/or binary packages of doctransformer. You can also get a copy of this license at http://www.trolltech.com/developer/licensing/noncomm.html.

Please note that the copyright owner of this software, Thomas Pasch <thomas.pasch _at_ gmx.de>, holds more rights to this software than are granted by the license. If you want to use this software in a commercial setting, please contact the copyright owner for licensing details.

If you contribute to this software, you must do so in conformance with the license. For contributions, the terms of the Q Public license version 1.0 (http://www.trolltech.com/developer/licensing/qpl.html) apply in addition to the terms of the QT Non Commercial license. The initial developer of the software is the copyright owner.

In the case of substantial contributions, the copyright owner will grant you and your organization the right to use this software in a commercial setting, but only for the next (improved) version of the software with your contributed patches applied. This right will be granted on request and in written form. The decision whether a contribution is substantial will be made by the copyright owner.

At the moment, you can grab doctransformer from CVS only. There will be a beta release this spring that you will be able to get from here.

To use doctransformer, you also need the Lout Text Processing System. The source can be compiled on virtually all platforms. You can also look for binaries here and here.

There is an open discussion forum on doctransformer where you can ask your questions. This is the right place to go to if you want to participate as well.

Project activity can be tracked here.

1.1. What is the difference between DocBook XML and DocBook SGML?
1.2. There are so many DocBook tools. Do you think doctransformer is necessary?
1.3. When will there be a LaTeX backend?
1.4. Will there be an (X)HTML backend?
1.1.

What is the difference between DocBook XML and DocBook SGML?

SGML is a pre-XML Standard. It is much more complex. Rendering is done with DSSSL Stylesheets. The best tested tool chain (openJade/jadeTeX) is an SGML chain (but it can render XML as well). Look here for a complete discussion. Generally, both SGML and XML use DTDs, and SGML tools can render XML but not vice versa.

1.2.

There are so many DocBook tools. Do you think doctransformer is necessary?

The best tested tools are written for the old SGML standard. Nearly all XML tools use an intermediate XSL-FO transformation for printable documents. All FO tools ignore the typesetting wisdom (ligatures, paragraph and page fullness compensation, flowing tables and figures, book-quality equation typesetting) incorporated in old-style batch text processing tools like Lout and LaTeX.

Last but not least, I don't like XSLT stylesheet processing.

1.3.

When will there be a LaTeX backend?

As soon as you write one.

1.4.

Will there be an (X)HTML backend?

I don't think that this is necessary. The DocBook XSL Stylesheets do a good job of rendering (X)HTML.

I'm collecting important links to DocBook-related standards, tools and documentation here. If you think that something is missing, just drop me an e-mail so that I can complete this list. Thanks.

You may write DocBook with any text editor. In principle. In real life, however, writing DocBook with a DocBook-enabled tool is much more fun. So try one of these:

DocGen http://sourceforge.net/projects/gendiapo

A Java-based open source (GPLed) editor for structured documents based on the MerlotXML sources (which are continued as Xerlin). This is a very simple but useful editor.

Still alpha software, and earlier known as GenDiapo. Good home page with many useful links.

LyX http://www.lyx.org

LyX is an open source (GPLed) editor for structured documents, written in C++. Its primary output target is LaTeX. However, LyX can be used as an SGML (not really XML) editor for DocBook on Unix machines with X11. Select the DocBook layout for this (Layout->Document->Class->DocBook {article, book, chapter, section} SGML).

See also Section “Alternatives to DocBook”.

XMLMind XML Editor http://www.xmlmind.com/xmleditor

A great and easy-to-use Java-based XML editor, especially suited to editing structured documents like DocBook XML. My favorite choice. The DTD-only version is free to use and prepared for editing DocBook. A commercial version that includes XML Schema support is available as well.

OK, you wanna start with DocBook. This is a collection of links to help you.

Goossens: Writing Documentation using DocBook http://xml.web.cern.ch/XML/goossens/dbatcern/

The classic primer. No words about converting DocBook to any usable format, however.

Brockmeier: A gentle guide to DocBook http://www-106.ibm.com/developerworks/library/l-docbk.html

Very short DocBook overview. Good starting point if you don't understand anything here.

Waugh: SelfDocBook http://cyberelk.net/tim/docbook/selfdocbook/selfdocbook.html

A self-contained example of building DocBook documentation. Includes a small CSS customization example as well.

Zoppi: Exploring SGML DocBook http://lwn.net/2000/features/DocBook/

A short get-ready-to-use introduction to the OpenJade/JadeTeX tool chain.

Linux Documentation Project: Authors - Contribute http://www.tldp.org/authors/

The Linux Documentation Project is using DocBook as well and has a list of very useful links.

Midgard Documentation Howto http://www.midgard-project.org/developer/dochowto/index.html

As the Midgard documentation is written in DocBook, this is a highly recommended site as well.

Godoy: DocBook Howto http://www.ibiblio.org/godoy/sgml/docbook/howto/

Another DocBook tutorial.

Galassi: Get Going With DocBook http://www.ibiblio.org/godoy/sgml/docbook/howto/

A rather outdated tutorial from 1998.

Guillon: DocBook with LyX http://bgu.chez.tiscali.fr/doc/db4lyx/

LyX will give you SGML as output if you want it to. To render this SGML for printing, you could use doctransformer as well as jade/jadetex.

Hoenicka: SGML for NT http://ourworld.compuserve.com/homepages/hoenicka_markus/ntsgml.html

A howto about setting up an SGML tool chain on Microsoft systems. Nevertheless, it is very useful for everyone who is trying to get an SGML tool chain to work.

You really want to know it all, or you are trying to become a DocBook guru? This way, please.

DocBook Book Home Page http://www.docbook.org

The ultimate DocBook tags and entities description. Very overwhelming for DocBook users but very useful for implementors.

DocBook DTD http://www.oasis-open.org/committees/docbook/

The DTD definition of DocBook. Norman Walsh has also published DocBook as a RELAX NG Schema and as a W3C XML Schema (both experimental).

DocBook FAQ http://www.dpawson.co.uk/docbook/

Dave Pawson about his FAQ: “The initial focus will be on the XML version of the DTD, and the XSLT based stylesheets. Over time I may add faq's for SGML and DSSSL. I will need help there though!”

Simplified DocBook DTD http://www.oasis-open.org/committees/docbook/xml/simple/index.shtml

A 100% legal subset of DocBook for beginners and all others who are overwhelmed by more than 400 tags.

MathML http://www.w3.org/Math/

MathML is sometimes used in conjunction with DocBook to provide tags for expressing formulas and equations in scientific documents. Some tools like PassiveTeX have built-in MathML support.

CSS http://www.w3.org/Style/CSS/

Cascading Style Sheets can be used in conjunction with (X)HTML (converted by XSL Stylesheets from DocBook) to customize the look and feel in the browser. Look here for books and tutorials. If you want to know everything, read the CSS1 and CSS2 specifications.

There are many ways to render a DocBook document. If you don't want to use doctransformer, you can also choose between:

XSLT Stylesheets http://sourceforge.net/projects/docbook

Stylesheets for DocBook XML to FO, manpages, JavaHelp and (X)HTML transformation. There is good documentation by Norman Walsh et al. included. But you may also find Robert Stayton's documentation "Using the DocBook XSL Stylesheets" at http://www.sagehill.net/xml/docbookxsl/ very useful. The doctransformer XHTML pages are rendered by this.

OpenJade http://sourceforge.net/projects/openjade

An extended version of James Clark's Jade DSSSL implementation. You may find Saqib Ali's documentation "DocBook XML/SGML processing using OpenJade" at http://www.tldp.org/HOWTO/DocBook-OpenJade-SGML-XML-HOWTO/ very useful for using it. This program allows conversion to RTF, TeX, MIF (FrameMaker), HTML, and FO. For PostScript and PDF output, you also need JadeTeX from http://sourceforge.net/projects/jadetex.

DocBook in ConTeXt http://www.miwie.org/db-context/

Macro package for ConTeXt (a TeX extension) for rendering DocBook. This is beta software, but it looks very promising. See also http://www.hobby.nl/~scaprea/context/index.html.

PassiveTeX http://www.tei-c.org.uk/Software/passivetex/

Macro package for TeX. Used to transform FO to PostScript and PDF output. Built-in MathML support. A little disappointing compared to the jade/jadetex tool chain.

FOP http://xml.apache.org/fop/index.html

Java-based tool for transforming FO to PDF, PCL, PostScript, SVG, Print, AWT, MIF (FrameMaker), and TXT. Works reasonably well, but this is a rather simple layouter compared to solutions based on TeX and its extensions.

DB2LaTeX http://sourceforge.net/projects/db2latex

XSLT Stylesheets for transforming DocBook XML directly to LaTeX. This project is dead or very slowly moving.

The first thing to do before rendering DocBook is to get a DocBook tool chain in place. If you use a (Linux) distribution, there should be such a thing somewhere (but you may have to install the packages). If everything fails, try one of these:

SGMLTools-lite http://sourceforge.net/projects/sgmltools-lite/

Classic collection of tools for SGML processing. This project is dead or very slowly moving.

DocBook Tools http://sources.redhat.com/docbook-tools/

Red Hat's collection of tools for a DocBook tool chain. This project is dead or very slowly moving.

DocBook SGML ToolBox http://www.gemini1consulting.com/tekhelp/

A collection of tools to start with SGML editing and processing.

xmlto http://cyberelk.net/tim/xmlto/

Front end to an XSL tool chain. Uses PassiveTeX and libxslt.

XSLT-process http://xslt-process.sourceforge.net/

A (minor) Emacs mode and some tools to provide an SGML and DocBook tool chain. See also PSGML at http://www.lysator.liu.se/~lenst/about_psgml/. As an alternative, you could use Norman Walsh's DocBook IDE for Emacs.

DocBook is not exactly what you're looking for? Perhaps your solution is listed here:

LyX http://www.lyx.org

LyX is an open source (GPLed) editor for structured documents, written in C++. Its primary output target is LaTeX.

TeXmacs http://www.texmacs.org/

Open source (GPLed) scientific editor and renderer for Unix machines with X11. Not based on TeX (but on MetaFont). Great equation and formula support.

TBook Markup Language http://sourceforge.net/projects/tbookdtd

An alternative DTD and some tools. Purpose is similar to DocBook. TBook can be converted to DocBook (but not vice versa).

TEI Markup Language http://www.tei-c.org.uk/Software/

An alternative DTD and some tools. Purpose is similar to DocBook.

A. doctransformer status tag by tag

Tags not mentioned are probably not supported...

B. docbook figure test

Above: a formal figure (jpg/eps). Below: an informal figure (png/gif).

This is a docbook included image test. First try an inline graphic: Tiger (tif/eps).

doctransformer is kindly hosted on http://doctransformer.sourceforge.net.

Last modified 2003-01-14 by Thomas Pasch.