From HDF5 Wiki

On this page, we provide a summary of the considerations and the discussion that led to the decision to use Doxygen as the basis of the HDF5 API reference documentation. Traditionally, this part of the HDF5 documentation is referred to as the HDF5 reference manual or HDF5 RM, a term which we adopt throughout this discussion.


The following roles, in no particular order, have an interest in a high-quality HDF5 API reference.

  • In-house developers
  • Application developers
  • (Non-HDF5) Library developers
  • Maintainers of HDF5 language bindings, modules, and tools
  • Integrators
  • Data life cycle managers
  • Sponsors


The list of requirements reflects the needs of HDF5 RM documentation producers and consumers.

Functional Requirements

  1. HDF5 RM shall support full-text and categorized searches as well as rich navigational structures
  2. HDF5 RM shall be available standalone, online and offline
  3. Online versions of HDF5 RM shall be
    1. Accessible by outside search engines (SEO)
    2. Compliant with ADA, WCAG 2.1, and section 508
    3. Easily printable and exported to formats such as PDF
    4. Mobile-friendly (render well on mobile devices such as cell phones and tablets)
  4. HDF5 RM shall support a variety of content including
    1. Code (syntax highlighting)
      • C
      • Fortran
      • ...
    2. Examples
    3. Equations
    4. Images
  5. An HDF5 RM implementation shall support content reuse and consistency through mechanisms such as templates and transclusion

Non-Functional Requirements

  1. HDF5 RM deployments offline or online shall be responsive
  2. HDF5 RM shall support and encourage community contributions
    1. The contribution process shall be aligned with that for source code
      1. Authentication
      2. Authorization
      3. Review
      4. Versioning
  3. HDF5 RM shall be implemented using free and open-source software (FOSS)
  4. The generation of HDF5 RM instances shall be automated and rely on a single source.
    1. The HDF5 RM source shall reside "as close as possible" to the API it documents.
    2. API examples shall be fully functional and tested alongside the main library.

Technical Solution Candidates

There are plenty of solution candidates from which we can choose. A selection can be seen below.

We had neither the time nor the need to explore all of them in depth. Based on internal and community discussions, we took a closer look at two candidates representing two prevailing documentation philosophies: MediaWiki and Doxygen. Both are mature and widely used technologies. We looked at MediaWiki also with an eye toward other types of HDF5 documentation, i.e., beyond the HDF5 RM.


The following matrix represents an attempt at a somewhat objective comparison of both candidates' merits.

Functional Requirements

Requirement | Doxygen | MediaWiki | Comments
FR1 | 8/10 | 10/10 | Both candidates support full-text and categorized searches, as well as rich navigational structures. Doxygen has more navigational structures for its built-in categories, such as modules, files, etc. MediaWiki has fewer predefined API-specific categories and search structures but can be easily extended via categories, tags, semantic search, etc., beyond anything Doxygen has to offer.
FR2 | 10/10 | 10/10 | Doxygen generates static HTML content out-of-the-box. It's also straightforward to create a static HTML dump of a MediaWiki site.
FR3.1 | 10/10 | 10/10 | Both candidates work well with external search engines, with MediaWiki having a slight edge with, for example, Google search.
FR3.2 | 5/10 | 5/10 | Both candidates fail several compliance checks with their out-of-the-box configurations and require substantial customization.
FR3.3 | 10/10 | 7/10 | It's easy to create various document formats directly from the source with Doxygen. Partly due to the page-oriented nature of MediaWiki, this is not as straightforward and the results depend on downstream "plumbing."
FR3.4 | 7/10 | 9/10 | The default MediaWiki configuration renders well on mobile devices but can be improved. Doxygen looks dreadful on mobile devices with the defaults and requires quite a bit more effort.
FR4.1 | 10/10 | 9/10 | Both render source code well. The automatic link generation for user-defined types gives Doxygen an edge.
FR4.2 | 10/10 | 9/10 | Both support the inclusion and highlighting of (external) examples. The automatic link generation for user-defined types again gives Doxygen a slight edge.
FR4.3, FR4.4 | 10/10 | 10/10 | Both support equations based on TeX and standard image formats.
FR5 | 8/10 | 10/10 | Doxygen's support for templates and transclusion is a far cry from MediaWiki's capabilities, but adequate for HDF5 RM.
FR (total) | 88/100 | 89/100 | Neck and neck. The combined FR4.3/FR4.4 row is counted once, so the ten rows allow a maximum of 100.

Non-Functional Requirements

Requirement | Doxygen | MediaWiki | Comments
NFR1 | 10/10 | 9/10 | It's hard to beat Doxygen's static HTML. Because MediaWiki is also an authoring tool, it relies on a dynamic PHP and MySQL stack, which is about as fast as such a stack can be.
NFR2 | 8/10 | 10/10 | MediaWiki has an edge because it can attract non-developer contributors and has the authoring tools built-in.
NFR2.1 | 10/10 | 3/10 | MediaWiki's process is weakly aligned with the source code.
NFR2.1.1, NFR2.1.2, NFR2.1.3, NFR2.1.4 | 10/10 | 7/10 | MediaWiki supports Single-Sign-On, but this, like role and group management, needs to be configured separately. The review process needs to be configured separately as well, while support for versioning (per page!) is built-in. Doxygen has the edge because its processes become part of the source code management.
NFR3 | 10/10 | 10/10 | Both tools are available as FOSS.
NFR4 | 10/10 | 8/10 | The single source requirement is satisfied in both cases, but a custom export pipeline is needed for MediaWiki.
NFR4.1 | 10/10 | 4/10 | With a lot of customization, the MediaWiki source could be maintained close to the HDF5 source code, but the result would be brittle and destroy the convenient MediaWiki authoring experience.
NFR4.2 | 10/10 | 10/10 | MediaWiki can easily transclude GitHub content such as code examples.
NFR (total) | 78/80 | 61/80 | Doxygen has a clear edge in the non-functional requirements department.
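
The totals in the two matrices can be re-derived from the per-row scores. The following is a minimal Python sketch rather than part of the original evaluation. It assumes that a row covering several requirement IDs (such as FR4.3 and FR4.4) counts once, which is the convention that reproduces the stated totals of 88 and 89 for the functional rows and 78 and 61 for the non-functional rows. The dictionary and function names are illustrative only.

```python
# Scores copied from the two comparison matrices as (Doxygen, MediaWiki).
# Assumption: a row that covers several requirement IDs is counted once.
FUNCTIONAL = {
    "FR1": (8, 10),
    "FR2": (10, 10),
    "FR3.1": (10, 10),
    "FR3.2": (5, 5),
    "FR3.3": (10, 7),
    "FR3.4": (7, 9),
    "FR4.1": (10, 9),
    "FR4.2": (10, 9),
    "FR4.3/FR4.4": (10, 10),
    "FR5": (8, 10),
}
NON_FUNCTIONAL = {
    "NFR1": (10, 9),
    "NFR2": (8, 10),
    "NFR2.1": (10, 3),
    "NFR2.1.1-NFR2.1.4": (10, 7),
    "NFR3": (10, 10),
    "NFR4": (10, 8),
    "NFR4.1": (10, 4),
    "NFR4.2": (10, 10),
}


def tally(rows):
    """Return (Doxygen total, MediaWiki total, maximum under the row-count assumption)."""
    doxygen = sum(d for d, _ in rows.values())
    mediawiki = sum(m for _, m in rows.values())
    return doxygen, mediawiki, 10 * len(rows)


if __name__ == "__main__":
    for name, rows in (("FR", FUNCTIONAL), ("NFR", NON_FUNCTIONAL)):
        doxygen, mediawiki, maximum = tally(rows)
        print(f"{name}: Doxygen {doxygen}/{maximum}, MediaWiki {mediawiki}/{maximum}")
```

The one-point gap on the functional side and the 17-point gap on the non-functional side are what the conclusion below relies on.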


Despite their differences, both solutions have substantial overlap to the point that, considering functional requirements only, either solution would work well. The greater suitability of Doxygen for HDF5 RM clearly stands out when considering the non-functional requirements. The choice is clear.


Basic Questions

The main reference for this section is Developing Quality Technical Information: A Handbook for Writers and Editors, 3rd Edition.

  1. What is the RM's purpose?
    At its core, the RM presents facts about HDF5 that support tasks. For example, such facts include API function syntax and restrictions, command-line options of tools, and technical specifications. Other than to "impress," its purpose is not to sell a product.
  2. Who is the RM's audience?
    According to its purpose, anyone who is given a task that requires them to locate and understand factual information about HDF5 is part of its audience.
  3. How do we assess the quality of technical documentation in general?
    Easy to use
      • Task orientation
      • Accuracy
      • Completeness
    Easy to understand
      • Clarity
      • Concreteness
      • Style
    Easy to find
      • Organization
      • Retrievability
      • Visual effectiveness

What about Docs Like Code (DLC)?

The main reference for this section is Docs Like Code by Anne Gentle (2017).

Definition

  • Store the doc source files in a version control system
  • Build the doc artifacts automatically
  • Ensure that a trusted set of reviewers meticulously reviews the docs
  • Publish the artifacts without much human intervention

These techniques constitute a docs-like-code framework.

Goals

  1. Promote collaboration
  2. Get long-tail contributions
  3. Track doc bugs like code bugs
  4. Get better reviews
  5. Make beautiful docs
  6. Use developer tools and workflows
  7. Get value from cost-effective tools
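
To make "build the doc artifacts automatically" and "track doc bugs like code bugs" more concrete, here is a small Python sketch of a check that an automated build could run. It is an illustration under stated assumptions, not the check HDF5 actually uses: it assumes that public C prototypes start with the `H5_DLL` export macro and that a documented prototype is directly preceded by a `/** ... */` Doxygen block. The function name `undocumented_prototypes` and the sample identifier `H5Xundocumented` are hypothetical.

```python
import re

# Assumption for this sketch: a public prototype starts with the H5_DLL macro.
PROTOTYPE = re.compile(r"^\s*H5_DLL\s+.*?\b(\w+)\s*\(")


def undocumented_prototypes(header_text):
    """Return names of prototypes that lack a Doxygen block right above them."""
    missing = []
    in_doc_block = False     # currently inside a /** ... */ block
    doc_just_closed = False  # the previous non-blank line ended a Doxygen block
    for line in header_text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # blank lines do not separate a comment from its prototype
        if in_doc_block:
            if stripped.endswith("*/"):
                in_doc_block = False
                doc_just_closed = True
            continue
        if stripped.startswith("/**"):
            if "*/" in stripped[3:]:
                doc_just_closed = True  # single-line Doxygen comment
            else:
                in_doc_block = True
            continue
        match = PROTOTYPE.match(line)
        if match and not doc_just_closed:
            missing.append(match.group(1))
        doc_just_closed = False
    return missing


# Hypothetical header fragment: one documented and one undocumented prototype.
SAMPLE_HEADER = """
/**
 * @brief Closes a file
 */
H5_DLL herr_t H5Fclose(hid_t file_id);

H5_DLL herr_t H5Xundocumented(hid_t id);
"""

if __name__ == "__main__":
    for name in undocumented_prototypes(SAMPLE_HEADER):
        print(f"missing Doxygen comment: {name}")
```

A build could fail when the returned list is non-empty, so that a missing description surfaces in review the same way a failing unit test does.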



Attribute | Wiki | GitHub
Usability | Many developers never leave their Terminal window and do not want to open a browser window to log in and edit a wiki page that they must find in the first place. | You can avoid the context switch from coding to opening a wiki page by placing the docs directly in the code or in the same GitHub repo as the code.
Security | Sometimes wikis are seen as ideal internal-only documentation sites because the wiki is accessible behind a firewall, for example. | GitHub Pages from https://github.com are publicly available when published, even when published from a private repo. You can work around this issue by implementing an authentication workflow on a site that you upload to GitHub Pages, but you must have the web development resources to maintain the login requirement. You could use GitHub Enterprise Pages, which requires a VPN connection to view pages.
Statistics | Some wiki engines give you statistics that help you determine who is a topic expert. Others may only provide statistics to users with certain permissions. | GitHub enables you to see which contributors are working on a particular part of the software project, which can help with documentation needs.
Automated builds | Many wikis only provide a simple Save button per page. With additional plugins for Confluence, for example, you can automate parts of the publishing process and put a workflow for reviews in place. | Treating docs like code lets you automate more tasks than simply rendering and publishing HTML. Automated builds for review purposes are a big advantage of treating docs like code.
Review workflow | Wikis often have add-ons for review workflows, but when the heart of a wiki is quick editing, people do not switch easily to the review workflow. | In GitHub, it's expected to let others review your changes before merging, or publishing in the case of a document.
Web design | If your web development resources know a particular wiki framework really well, they might be able to create a nicer end-user experience for a wiki than in a highly flexible web framework like Sphinx or Jekyll. | When publishing an entire site using static site generators, you get more flexibility in choosing web designs and navigation than with most heavily opinionated wiki frameworks.
Reuse | Wikis are page-oriented, and to do releases, you might need to publish a page twice. Over time in a wiki, contributors create new pages instead of looking for and editing existing pages, and trusted content becomes harder to find. | Sprawl can be a problem with both solutions, but the organized nature of having documentation near code seems to keep the two in lock-step when they share systems such as version control and review workflows. With GitHub, you can back port a change from one release to another by using the same pull request workflow.


Lessons Learned

People wonder whether treating docs like code can work in their environment. Although outcomes matter and this movement aims to exceed users' expectations, the disruptions in your team's daily tasks, work expectations, and areas of control can make this transition difficult.

As this guide has shown, the shift to treating docs like code includes a complex overhaul of attitudes, processes, toolsets, and expectations. However, many teams work through these difficulties to great rewards. To ease the transition to using more docs-like-code techniques, look for these opportunities.

  • Create a great web experience
  • Equip your contributors with a style guide
  • Empower your contributors
  • Write a contributor's guide
  • Build in continuous integration for docs
  • Teach everyone to respect the docs
  • Test and measure outcomes