@mawiesne
Contributor

@mawiesne mawiesne commented Dec 27, 2025

What does this PR do?

OPENNLP-1792: Add docbkx plugin config to produce PDF of dev manual

  • modernizes the XML files of the OpenNLP dev manual to use the DocBook DTD in version 5.x
  • adds plugin configuration to produce the PDF as a separate output
  • adds an XSL file to customize the text layout of the PDF
  • adds hyphenation support for the PDF version
  • introduces dedicated profiles for the html, pdf, and epub3 targets of the dev manual
  • fixes inconsistencies in the dev manual's structure (duplicate ids...)
  • fixes 'informaltable' column counts, mainly in cli.xml
  • fixes a missing inlinemediaobject for the parsetree1.png image include, which caused trouble during epub3 generation
  • adds the PDF and EPUB files to the bin.xml config in opennlp-distr/assembly

How to reproduce on local machine

  • switch to the branch related to this PR
  • cd opennlp-docs
  • mvn clean verify -Pdocs-manual-html -Pdocs-manual-pdf -Pdocs-manual-epub
  • inspect target/docbkx for the pdf and html subdirectories and the separate opennlp.epub file
  • enjoy and provide feedback
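
The inspection step above can also be scripted. The following is a minimal sketch, assuming the output layout named in the steps (pdf and html subdirectories plus opennlp.epub under target/docbkx); the function name check_manual_outputs is hypothetical and not part of this PR.

```shell
#!/bin/sh
# Sketch: verify that the manual build produced the three expected outputs.
# Directory and file names come from the reproduction steps; the function
# name check_manual_outputs is hypothetical.
check_manual_outputs() {
  out_dir="${1:-target/docbkx}"
  status=0
  # The PDF and HTML variants are expected as subdirectories.
  for sub in pdf html; do
    if [ ! -d "$out_dir/$sub" ]; then
      echo "missing directory: $out_dir/$sub" >&2
      status=1
    fi
  done
  # The EPUB is expected as a single file next to them.
  if [ ! -f "$out_dir/opennlp.epub" ]; then
    echo "missing file: $out_dir/opennlp.epub" >&2
    status=1
  fi
  return "$status"
}

# Typical use, from the opennlp-docs directory after the build:
#   mvn clean verify -Pdocs-manual-html -Pdocs-manual-pdf -Pdocs-manual-epub
#   check_manual_outputs target/docbkx && echo "all manual outputs present"
```

The function reports every missing item instead of stopping at the first one, so a partial build (for example, only one profile activated) is easy to spot.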

Tasks

Thank you for contributing to Apache OpenNLP.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

For all changes:

  • Is there a JIRA ticket associated with this PR? Is it referenced
    in the commit message?

  • Does your PR title start with OPENNLP-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.

  • Has your PR been rebased against the latest commit within the target branch (typically main)?

  • Is your initial contribution a single, squashed commit?

For code changes:

  • Have you ensured that the full suite of tests is executed via mvn clean install at the root opennlp folder?
  • Have you written or updated unit tests to verify your changes?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE file, including the main LICENSE file in opennlp folder?
  • If applicable, have you updated the NOTICE file, including the main NOTICE file found in opennlp folder?

For documentation related changes:

  • Have you ensured that format looks appropriate for the output in which it is rendered?

Note:

Please ensure that once the PR is submitted, you check GitHub Actions for build issues and submit an update to your PR as soon as possible.

@mawiesne mawiesne self-assigned this Dec 27, 2025
@mawiesne mawiesne added the "documentation" label (Pull requests that update documentation) Dec 27, 2025
@mawiesne mawiesne force-pushed the OPENNLP-1792-Add-docbkx-plugin-config-to-produce-PDF-of-dev-manual branch 4 times, most recently from 35f99d7 to 9e02e12 Compare December 29, 2025 18:11
@mawiesne mawiesne force-pushed the OPENNLP-1792-Add-docbkx-plugin-config-to-produce-PDF-of-dev-manual branch 4 times, most recently from e440c2a to e7ae840 Compare January 18, 2026 14:01
- modernizes xml files of the OpenNLP dev manual to use Docbook dtd in version 5.x
- adds plugin configuration to produce the PDF as separate outcome
- adds xsl file for customization of the layout of text in the PDF
- adds hyphenation support for PDF and epub versions
- introduces specific profiles for html, pdf, and epub3 targets of the dev manual
- adds cover entry to opennlp.xml for epub target
- fixes inconsistencies in dev manual's structure (duplicate ids...)
- fixes 'informaltable' column counts, mainly in cli.xml
- fixes missing inlinemediaobject for parsetree1.png image include causing trouble during epub3 generation
- adds config for pdf and epub file inclusion to bin.xml config in opennlp-distr/assembly
@mawiesne mawiesne force-pushed the OPENNLP-1792-Add-docbkx-plugin-config-to-produce-PDF-of-dev-manual branch from e7ae840 to 50461bf Compare January 18, 2026 14:09
@mawiesne mawiesne marked this pull request as ready for review January 18, 2026 14:09
@mawiesne
Contributor Author

mawiesne commented Jan 18, 2026

Note: I ran epubcheck on the resulting EPUB file (output translated from German)

Mac:docbkx mwiesner$ epubcheck opennlp.epub
Using the EPUB 3.3 checks
The EPUB contains no errors or warnings.
It is valid.
Messages: 0 fatal errors / 0 errors / 0 warnings / 0 infos

EPUBCheck completed

So epubcheck confirms that we produce a valid EPUB for the OpenNLP dev manual, including a cover image.
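
If this check were ever scripted, the summary line of the report could be evaluated without depending on the report language. This is a minimal sketch, assuming the summary line lists its counts as plain integers as in the run above; the function name summary_is_clean is hypothetical, and `grep -o` is a common extension (GNU and BSD grep) rather than strict POSIX.

```shell
#!/bin/sh
# Sketch: succeed only if every count on an epubcheck summary line is zero.
# The function name summary_is_clean is hypothetical. It reads only the
# numbers, so the wording (German or English) does not matter.
summary_is_clean() {
  line="$1"
  # Extract every integer on the line, one per output line.
  counts=$(printf '%s\n' "$line" | grep -oE '[0-9]+')
  # A line without any numbers is not a summary line.
  [ -n "$counts" ] || return 1
  for n in $counts; do
    [ "$n" -eq 0 ] || return 1
  done
  return 0
}

# Summary line as printed by a German-locale epubcheck run.
summary_is_clean "Meldungen: 0 Schwerwiegende Fehler / 0 Fehler / 0 Warnungen / 0 Informationen" \
  && echo "clean"
```

Only the summary line should be passed in; other lines of the report, such as the one naming the EPUB version, also contain digits and would be misread.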

Contributor

@atarora atarora left a comment

Looks good!
Thanks!

@mawiesne mawiesne merged commit 58d91f0 into main Jan 20, 2026
12 checks passed
@mawiesne mawiesne deleted the OPENNLP-1792-Add-docbkx-plugin-config-to-produce-PDF-of-dev-manual branch January 20, 2026 14:03