
What is the recommended toolchain for formatting XML DocBook?


I've seen "Best tools for working with DocBook XML documents", but my question is slightly different. Which formatting toolchain, as opposed to editing tool, is currently recommended for XML DocBook?

In Eric Raymond's 'The Art of Unix Programming' from 2003 (an excellent book!), the suggestion is XSL-FO (XSL Formatting Objects), but I've since seen suggestions here indicating that XSL-FO is no longer under development (though I can no longer find that question on StackOverflow, so maybe it was erroneous).

Assume I'm primarily interested in Unix/Linux (including MacOS X), but I wouldn't automatically ignore Windows-only solutions.

Is Apache's FOP the best way to go? Are there any alternatives?


Solution

  • I've been writing a manual with DocBook, under Cygwin, to produce single-page HTML, multi-page HTML, CHM and PDF.

    I installed the following:

    1. The docbook stylesheets (xsl) repository.
    2. xmllint, to test if the xml is correct.
    3. xsltproc, to process the xml with the stylesheets.
    4. Apache's fop, to produce PDFs. I make sure to add the installed folder to the PATH.
    5. Microsoft's HTML Help Workshop, to produce CHMs. I make sure to add the installed folder to the PATH.
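
    Since two of these tools have to be added to the PATH by hand, a quick check before running configure can save some confusion. The following is only a sketch: the program names fop.sh and hhc are taken from the configure.in defaults in this answer and may differ on your installation, and the helper functions check_tool and check_all are made up for illustration.

```shell
#!/bin/sh
# Report which of the toolchain programs can be found on PATH.
# Assumption: the launcher names (fop.sh, hhc) match your installation.

# check_tool NAME: print where NAME was found, or that it is missing.
check_tool() {
    if location=$(command -v "$1" 2>/dev/null); then
        echo "found: $1 -> $location"
    else
        echo "missing: $1"
        return 1
    fi
}

# check_all NAME...: check every tool, return non-zero if any is missing.
check_all() {
    rc=0
    for tool in "$@"; do
        check_tool "$tool" || rc=1
    done
    return $rc
}

check_all xmllint xsltproc fop.sh hhc ||
    echo "Some tools are missing; fix PATH before running configure."
```

    A tool reported as missing can still be handed to configure explicitly through the matching --with-... option.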

    Edit: The code below relies on more files than the two shown here. If someone wants a cleaned up version of the scripts and the folder structure, please contact me: guscarreno (squiggly/at) googlemail (period/dot) com

    I then use a configure.in:

    AC_INIT(Makefile.in)
    
    FOP=fop.sh
    HHC=hhc
    XSLTPROC=xsltproc
    
    AC_ARG_WITH(fop, [  --with-fop  Where to find Apache FOP],
    [
        if test "x$withval" != "xno"; then
            FOP="$withval"
        fi
    ]
    )
    AC_PATH_PROG(FOP,  $FOP)
    
    AC_ARG_WITH(hhc, [  --with-hhc  Where to find Microsoft Help Compiler],
    [
        if test "x$withval" != "xno"; then
            HHC="$withval"
        fi
    ]
    )
    AC_PATH_PROG(HHC,  $HHC)
    
    AC_ARG_WITH(xsltproc, [  --with-xsltproc  Where to find xsltproc],
    [
        if test "x$withval" != "xno"; then
            XSLTPROC="$withval"
        fi
    ]
    )
    AC_PATH_PROG(XSLTPROC,  $XSLTPROC)
    
    AC_SUBST(FOP)
    AC_SUBST(HHC)
    AC_SUBST(XSLTPROC)
    
    HERE=`pwd`
    AC_SUBST(HERE)
    AC_OUTPUT(Makefile)
    
    cat > config.nice <<EOT
    #!/bin/sh
    ./configure \
        --with-fop='$FOP' \
        --with-hhc='$HHC' \
        --with-xsltproc='$XSLTPROC' \
    
    EOT
    chmod +x config.nice
    

    and a Makefile.in:

    FOP=@FOP@
    HHC=@HHC@
    XSLTPROC=@XSLTPROC@
    HERE=@HERE@
    
    # Subdirs that contain docs
    DOCS=appendixes chapters reference 
    
    XML_CATALOG_FILES=./build/docbook-xsl-1.71.0/catalog.xml
    export XML_CATALOG_FILES
    
    # html, pdf and chm are also directory names, and the other targets
    # produce no file of that name, so mark them phony to make them always run.
    .PHONY: all clean dist-clean chm html pdf check
    
    all:    entities.ent manual.xml html
    
    clean:
    	@echo -e "\n=== Cleaning\n"
    	@-rm -f html/*.html html/HTML.manifest pdf/* chm/*.html chm/*.hhp chm/*.hhc chm/*.chm entities.ent .ent
    	@echo -e "Done.\n"
    
    dist-clean:
    	@echo -e "\n=== Restoring defaults\n"
    	@-rm -rf .ent autom4te.cache config.* configure Makefile html/*.html html/HTML.manifest pdf/* chm/*.html chm/*.hhp chm/*.hhc chm/*.chm build/docbook-xsl-1.71.0
    	@echo -e "Done.\n"
    
    # Regenerate the entity list, but only replace entities.ent when the
    # content actually changed, so dependent targets are not rebuilt needlessly.
    entities.ent: ./build/mkentities.sh $(DOCS)
    	@echo -e "\n=== Creating entities\n"
    	@./build/mkentities.sh $(DOCS) > .ent
    	@if [ ! -f entities.ent ] || ! cmp -s entities.ent .ent; then mv .ent entities.ent ; fi
    	@echo -e "Done.\n"
    
    # Build the docs in chm format
    
    chm:    chm/htmlhelp.hhp
    	@echo -e "\n=== Creating CHM\n"
    	@echo logo.png >> chm/htmlhelp.hhp
    	@echo arrow.gif >> chm/htmlhelp.hhp
    	@-cd chm && "$(HHC)" htmlhelp.hhp
    	@echo -e "Done.\n"
    
    chm/htmlhelp.hhp: entities.ent build/docbook-xsl manual.xml build/chm.xsl
    	@echo -e "\n=== Creating input for CHM\n"
    	@"$(XSLTPROC)" --output ./chm/index.html ./build/chm.xsl manual.xml
    
    # Build the docs in HTML format
    
    html: html/index.html
    
    html/index.html: entities.ent build/docbook-xsl manual.xml build/html.xsl
    	@echo -e "\n=== Creating HTML\n"
    	@"$(XSLTPROC)" --output ./html/index.html ./build/html.xsl manual.xml
    	@echo -e "Done.\n"
    
    # Build the docs in PDF format
    
    pdf:    pdf/manual.fo
    	@echo -e "\n=== Creating PDF\n"
    	@"$(FOP)" ./pdf/manual.fo ./pdf/manual.pdf
    	@echo -e "Done.\n"
    
    pdf/manual.fo: entities.ent build/docbook-xsl manual.xml build/pdf.xsl
    	@echo -e "\n=== Creating input for PDF\n"
    	@"$(XSLTPROC)" --output ./pdf/manual.fo ./build/pdf.xsl manual.xml
    
    check: manual.xml
    	@echo -e "\n=== Checking correctness of manual\n"
    	@xmllint --valid --noout --postvalid manual.xml
    	@echo -e "Done.\n"
    
    # need to touch the dir because the timestamp in the tarball
    # is older than that of the tarball :)
    build/docbook-xsl: build/docbook-xsl-1.71.0.tar.gz
    	@echo -e "\n=== Un-taring docbook-xsl\n"
    	@cd build && tar xzf docbook-xsl-1.71.0.tar.gz && touch docbook-xsl-1.71.0
    

    to automate producing the output formats mentioned above.
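
    The entities.ent rule in the Makefile is meant to replace the file only when the freshly generated content differs, so that its timestamp (and everything depending on it) stays untouched otherwise. Here is that idiom as a standalone sketch. The function name replace_if_changed and the sample entity line are made up for illustration, and, unlike the Makefile rule, the sketch deletes the temporary file when nothing changed.

```shell
#!/bin/sh
# replace_if_changed NEW TARGET
# Move NEW over TARGET only when TARGET is missing or its content differs,
# so TARGET keeps its old timestamp when nothing changed.
replace_if_changed() {
    new=$1
    target=$2
    if [ ! -f "$target" ] || ! cmp -s "$new" "$target"; then
        mv "$new" "$target"
        echo "updated $target"
    else
        rm -f "$new"    # addition: the Makefile rule leaves .ent behind
        echo "$target unchanged"
    fi
}

# Demo in a scratch directory (the entity line is illustrative only).
workdir=$(mktemp -d)
printf '<!ENTITY intro SYSTEM "chapters/intro.xml">\n' > "$workdir/.ent"
replace_if_changed "$workdir/.ent" "$workdir/entities.ent"   # target missing
printf '<!ENTITY intro SYSTEM "chapters/intro.xml">\n' > "$workdir/.ent"
replace_if_changed "$workdir/.ent" "$workdir/entities.ent"   # same content
rm -rf "$workdir"
```

    Note that cmp -s is run as a command of its own; wrapping it in [ ... ] would make test try to parse "cmp" as an operand instead of executing it.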

    I prefer a *nix approach to the scripting, simply because the toolset is easier to find and use, not to mention easier to chain.
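
    To illustrate the chaining, here is a rough sketch of the PDF path from the Makefile as a plain shell function. It is not the author's script: the run and build_pdf helpers and the DRY_RUN switch are made up for illustration, the launcher is assumed to be called fop (this answer's configure.in defaults to fop.sh), and the explicit -fo/-pdf form of the FOP command line is used. By default it only prints the commands, so it can be tried without the tools installed.

```shell
#!/bin/sh
# Chain the validation, XSLT and FOP steps for the PDF output.
# DRY_RUN=1 (the default here) only prints the commands;
# set DRY_RUN=0 to actually execute them.
DRY_RUN=${DRY_RUN:-1}

# run CMD ARGS...: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# build_pdf SOURCE STYLESHEET OUTDIR
build_pdf() {
    src=$1
    xsl=$2
    out=$3
    # Each step runs only if the previous one succeeded.
    run xmllint --valid --noout "$src" &&
    run xsltproc --output "$out/manual.fo" "$xsl" "$src" &&
    run fop -fo "$out/manual.fo" -pdf "$out/manual.pdf"
}

build_pdf manual.xml build/pdf.xsl pdf
```

    Because the steps are joined with &&, a validation error stops the chain before xsltproc or FOP are started, which is the same effect make gives by aborting on the first failing recipe line.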