Sunday, February 15, 2009

Software Documentation with Docbook, FOP, and Ant

For a long time I had been looking for a standard way to create conceptual documentation for software projects. It took some experimentation, but I think I've finally found an approach that I like. Here's how I got there.
  • Wiki - A wiki definitely serves the purpose of quick, collaborative documentation, but over time I found myself getting into trouble. A description of an API or XML format would change, and I would update the wiki accordingly. Unfortunately, the feature I was working on hadn't made it to production yet, so users of the wiki would be reading documentation that was ahead of its time, resulting in mass confusion.
  • Source-Controlled HTML Documentation - This approach solves the versioning issues noted with the wiki approach by putting the documentation in version control right next to the source code. Unfortunately, there were about 30 different ways I could mark up several paragraphs of description mixed in with code snippets. Add a little styling to the mix, and the HTML source quickly becomes unreadable.
  • Semantic XML Language that generates HTML - This approach fixes the unreadable HTML source issue and allows the documentation to live with the source code in version control. The only remaining issue is possible confusion over where to document. For example, a new developer who checks out and builds a project using this approach may not immediately realize that the HTML documentation is in fact generated, and may edit it instead of the XML source. Not a huge issue, and there are ways around it, but still something that is within the realm of possibility.
Finally I arrived at the magic combination of Docbook, FOP, and Ant, with the Docbook source living in version control. Using an output format that is essentially read-only removes any confusion over what is generated and what is not. Let me describe the tools and document types involved.

Docbook is a nice semantic XML markup language that is convenient for technical documentation. It can be transformed into a variety of formats, ranging from PDF to HTML. There's a decent reference guide here. The source looks something like this:
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
  "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd">

<article>
  <articleinfo>
    <title>An Example Article</title>

    <author>
      <firstname>Your first name</firstname>
      <surname>Your surname</surname>
      <affiliation>
        <address><email>foo@example.com</email></address>
      </affiliation>
    </author>

    <copyright>
      <year>2000</year>
      <holder>Copyright string here</holder>
    </copyright>

    <abstract>
      <para>If your article has an abstract then it should go here.</para>
    </abstract>
  </articleinfo>

  <sect1>
    <title>My First Section</title>
    <para>This is the first section in my article.</para>
    <sect2>
      <title>My First Sub-Section</title>
      <para>This is the first sub-section in my article.</para>
    </sect2>
  </sect1>
</article>
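
Since the whole point is mixing description with code snippets, here is a minimal sketch of how a snippet might sit inside a section. The section title and the "service" configuration format are made up for illustration; programlisting and sgmltag are standard Docbook 4 elements, and the CDATA wrapper is just one way to avoid escaping the angle brackets by hand.

```xml
<sect1>
  <title>Configuration Format</title>
  <!-- "service" is a hypothetical element name, used only as sample content -->
  <para>Each service is declared with a <sgmltag>service</sgmltag> element:</para>
  <!-- CDATA keeps the embedded XML readable; whitespace inside is preserved as-is -->
  <programlisting><![CDATA[<service name="example">
  <endpoint>http://localhost:8080/example</endpoint>
</service>]]></programlisting>
</sect1>
```

Because the markup says what the content is rather than how it should look, there is really only one reasonable way to write this, which is what the HTML approach was missing.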


Apache FOP (Formatting Objects Processor) is an output-independent print formatter driven by XSL formatting objects (XSL-FO): it reads an FO document and renders it to a target format such as PDF.
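
To give a feel for what FOP consumes, here is a hand-written sketch of a very small XSL-FO document. It is not the output of the Docbook stylesheets, which is far more verbose; the page dimensions, margins, and the master name "page" are arbitrary choices for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <!-- Page geometry: one simple page layout named "page" (name is arbitrary) -->
  <fo:layout-master-set>
    <fo:simple-page-master master-name="page"
                           page-width="8.5in" page-height="11in"
                           margin-top="1in" margin-bottom="1in"
                           margin-left="1in" margin-right="1in">
      <fo:region-body/>
    </fo:simple-page-master>
  </fo:layout-master-set>

  <!-- Content: a single block of text flowed into the page body -->
  <fo:page-sequence master-reference="page">
    <fo:flow flow-name="xsl-region-body">
      <fo:block font-size="8pt">This is the first section in my article.</fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>
```

You would never want to write this by hand for real documentation, which is why the FO document is treated as an intermediate build artifact.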

Apache Ant is a Java-based build tool. I use Ant for nearly all of my Java and non-Java development projects.



The build script for the project has the job of generating the documentation as a PDF. Two steps are required. First, the Docbook FO XSL stylesheet is applied to your documentation file in Docbook XML format, which produces an FO document. Second, Apache FOP's Ant task generates the PDF from that FO document. It's that simple. Your Ant target will end up looking something like this:

<target name="doc" depends="doc.upToDate" unless="doc.notRequired">

  <!-- Step 1: create the FO doc by applying the Docbook FO stylesheet -->
  <xslt basedir="doc" includes="MyProject.xml"
        style="${build.docbook.dir}/docbook-xsl-1.73.2/fo/docbook.xsl"
        destdir="${build.documentation}" extension="-fo.xml">
    <!-- Use Xalan as the XSLT processor -->
    <factory name="org.apache.xalan.processor.TransformerFactoryImpl">
      <attribute name="http://xml.apache.org/xalan/features/optimize" value="true"/>
    </factory>
    <!-- Override the stylesheet's body font size -->
    <param name="body.font.size" expression="8pt"/>
  </xslt>

  <!-- Step 2: create the PDF from the FO doc using FOP's Ant task -->
  <property name="fop.home" location="bin/docbook/fop-0.94"/>
  <taskdef name="fop" classname="org.apache.fop.tools.anttasks.Fop">
    <classpath>
      <fileset dir="${fop.home}/lib">
        <include name="*.jar"/>
      </fileset>
      <fileset dir="${fop.home}/build">
        <include name="fop.jar"/>
        <include name="fop-hyph.jar"/>
      </fileset>
    </classpath>
  </taskdef>
  <fop format="application/pdf"
       basedir="doc"
       fofile="${build.documentation}/MyProject-fo.xml"
       outfile="${build.documentation}/MyProject.pdf"/>

  <!-- Record when the documentation was last generated -->
  <touch file="${doc.timestamp.file}"/>

</target>
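
The target above depends on a doc.upToDate target that isn't shown. A minimal sketch of what it could look like, assuming the Docbook sources live under doc/ and that doc.timestamp.file is defined elsewhere in the build file, uses Ant's uptodate task to set doc.notRequired when nothing has changed since the last run:

```xml
<!-- Sketch only: the source file pattern is an assumption about the project layout -->
<target name="doc.upToDate">
  <!-- Sets doc.notRequired only if the timestamp file is newer than every source file -->
  <uptodate property="doc.notRequired" targetfile="${doc.timestamp.file}">
    <srcfiles dir="doc" includes="**/*.xml"/>
  </uptodate>
</target>
```

With this in place, the doc target is skipped whenever the documentation sources are older than the timestamp written by the touch task, so the PDF is only regenerated when something has actually changed.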