da_build: add PDF/UA checker with severity classification and optional strict mode #86
Open
nonprofittechy wants to merge 2 commits into main
Rules are now classified into four levels based on their real-world impact:

- **fail**: structural failures that break screen readers (missing StructTreeRoot, untagged content, figures without alt text, missing font/ToUnicode, etc.)
- **warning**: advisory issues that don't break AT but should be fixed (missing dc:title, missing document language, missing DisplayDocTitle, etc.)
- **info**: administrative metadata rules suppressed by default (§5 PDF/UA identifier, optional-content config, PrinterMark annotations)
- **form_annotation**: tab-order (§7.18.3) and widget annotation structure (§7.18.4) rules; suppressed in non-strict mode because forms are often flattened before users see them, and treated as failures in strict mode

New input: `verapdf-strict` (default `false`). Set to `true` to activate form-annotation structure rules.

Job summary now groups results into failure / advisory-warning / passing sections, with advisory warnings and suppressed rules in collapsible details.

Console output shows a per-PDF breakdown (N failure(s), N warning(s), N suppressed).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
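The four levels and the strict-mode switch described in this commit message can be sketched roughly as follows. This is an illustration only, not the code in `check_pdf_accessibility.py`: the `Severity` enum, `effective_outcome`, and `breakdown` are hypothetical names, and the sketch assumes that info-level rules stay suppressed even in strict mode, which the commit message does not state explicitly.

```python
from collections import Counter
from enum import Enum


class Severity(Enum):
    """The four rule levels named in the commit message."""
    FAIL = "fail"
    WARNING = "warning"
    INFO = "info"
    FORM_ANNOTATION = "form_annotation"


def effective_outcome(severity: Severity, strict: bool = False) -> str:
    """Map a rule's level to how it is reported (hypothetical helper)."""
    if severity is Severity.FAIL:
        return "failure"
    if severity is Severity.WARNING:
        return "warning"
    if severity is Severity.FORM_ANNOTATION:
        # Form/tab-order rules only count as failures in strict mode.
        return "failure" if strict else "suppressed"
    # Assumption: info-level rules remain suppressed in both modes.
    return "suppressed"


def breakdown(severities, strict: bool = False) -> str:
    """Build the per-PDF console line from a list of violated rule levels."""
    counts = Counter(effective_outcome(s, strict) for s in severities)
    return (
        f"{counts['failure']} failure(s), "
        f"{counts['warning']} warning(s), "
        f"{counts['suppressed']} suppressed"
    )


violations = [
    Severity.FAIL,
    Severity.WARNING,
    Severity.INFO,
    Severity.FORM_ANNOTATION,
]
print(breakdown(violations))               # non-strict
print(breakdown(violations, strict=True))  # strict
```

With one violation at each level, the form-annotation rule moves from the suppressed count to the failure count when `strict=True`.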
There was a problem hiding this comment.
Pull request overview
Adds automated PDF/UA-1 accessibility validation to the da_build composite GitHub Action using veraPDF, with rule severity classification and an optional strict mode to control how form/tab-order related rules are treated.
Changes:
- Add `check_pdf_accessibility.py` to run veraPDF, classify rule severities, emit annotations, and write a GitHub Step Summary report.
- Extend `da_build/action.yml` with new inputs (`verapdf-validation-mode`, `verapdf-strict`), install veraPDF, and run the checker.
- Document the new PDF accessibility behavior and inputs in `README.md`.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| `da_build/check_pdf_accessibility.py` | New checker script that runs veraPDF UA1 validation, buckets failures by severity, and outputs annotations + job summary. |
| `da_build/action.yml` | Adds inputs and new steps to install veraPDF and invoke the checker as part of da_build. |
| `README.md` | Documents PDF accessibility checking and the new action inputs. |
Comment on lines +17 to +29

    verapdf-validation-mode:
      description: >-
        How to report PDF/UA-1 accessibility failures found by veraPDF.
        'warning' annotates the job without failing it (default).
        'error' fails the build.
      default: "warning"
    verapdf-strict:
      description: >-
        Enable strict PDF/UA-1 checking.
        When 'false' (default), tab-order and annotation structure rules for form
        fields are suppressed because forms are often flattened before users see
        them. Set to 'true' to treat those rules as failures.
      default: "false"
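To illustrate how the `verapdf-validation-mode` input described above might drive reporting, here is a minimal sketch. The `::warning file=...::message` and `::error file=...::message` strings are standard GitHub Actions workflow commands; the helper names `annotate` and `exit_code` and the sample path are hypothetical and are not taken from the PR's script.

```python
def annotate(pdf_path: str, message: str, mode: str = "warning") -> str:
    """Build a GitHub Actions workflow command for one failure.

    Assumes a single-line message; multi-line messages would need escaping.
    """
    if mode not in ("warning", "error"):
        raise ValueError(f"unsupported validation mode: {mode!r}")
    return f"::{mode} file={pdf_path}::{message}"


def exit_code(failure_count: int, mode: str = "warning") -> int:
    """Return a non-zero exit code only when mode is 'error' and failures exist."""
    return 1 if (mode == "error" and failure_count > 0) else 0


# Hypothetical file name and message, for illustration only.
print(annotate("docs/form.pdf", "Figure is missing alt text"))
print(exit_code(3, mode="warning"))
print(exit_code(3, mode="error"))
```

In `warning` mode the job is annotated but the step still exits with 0; in `error` mode the same failures produce a non-zero exit code, which fails the build.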
Comment on lines +111 to +121

    - name: Install veraPDF
      run: |
        # veraPDF 1.28+ is required for compatibility with Java 21 (GitHub Actions default).
        VERAPDF_VERSION="1.28.1"
        VERAPDF_MINOR="1.28"
        INSTALL_DIR="${RUNNER_TEMP}/verapdf"

        if command -v verapdf &>/dev/null; then
          echo "veraPDF already available: $(verapdf --version 2>&1 | head -1)"
          exit 0
        fi
Comment on lines +184 to +187

    """Run veraPDF on a list of PDFs; return (stdout_xml, stderr)."""
    cmd = [verapdf_cmd, "--flavour", "ua1", "--format", "xml"] + [str(p) for p in pdfs]
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=300)
    return result.stdout, result.stderr
Comment on lines +428 to +430

    # Passing PDFs
    passing = [r for r in results if r.get("compliant")]
    if passing:
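The grouping of the job summary into failure, advisory-warning, and passing sections could look something like the sketch below. Only the `compliant` key appears in the quoted diff; the `name`, `failures`, and `warnings` keys, the function names, and the section headings are hypothetical. `GITHUB_STEP_SUMMARY` is the environment variable GitHub Actions provides for job summaries.

```python
import os


def build_summary(results) -> str:
    """Render per-PDF results as Markdown for the job summary."""
    failing = [r for r in results if not r.get("compliant")]
    passing = [r for r in results if r.get("compliant")]
    warned = [r for r in results if r.get("warnings")]
    lines = []
    if failing:
        lines.append("### Failures")
        for r in failing:
            count = len(r.get("failures", []))
            lines.append(f"- {r['name']}: {count} failure(s)")
    if warned:
        # Advisory warnings go in a collapsible block.
        lines.append("<details><summary>Advisory warnings</summary>")
        lines.append("")
        for r in warned:
            for warning in r["warnings"]:
                lines.append(f"- {r['name']}: {warning}")
        lines.append("")
        lines.append("</details>")
    if passing:
        lines.append("### Passing")
        lines.extend(f"- {r['name']}" for r in passing)
    return "\n".join(lines) + "\n"


def write_summary(markdown: str) -> None:
    """Append to the step summary file if available, else print."""
    path = os.environ.get("GITHUB_STEP_SUMMARY")
    if path:
        with open(path, "a", encoding="utf-8") as handle:
            handle.write(markdown)
    else:
        print(markdown)


# Hypothetical results, for illustration only.
sample = [
    {"name": "a.pdf", "compliant": False, "failures": ["untagged content"]},
    {"name": "b.pdf", "compliant": True, "warnings": ["missing dc:title"]},
]
summary = build_summary(sample)
```

Outside of GitHub Actions, `write_summary(summary)` falls back to printing the Markdown, which makes the layout easy to check locally.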
This PR adds automated PDF accessibility validation to the da_build action using veraPDF (https://verapdf.org/), ensuring that PDFs in Docassemble repositories comply with the PDF/UA-1 (ISO 14289-1) standard.
For now, failed checks are reported as warnings only by default. A future version of this action (probably in about 30 days) will start failing repos that do not pass the PDF accessibility checks.
Key Features
Configuration & Syntax
Add these optional inputs to the da_build step in your workflow:
How to Adjust Strictness
before they reach the user, these rules are often irrelevant.
How to Turn It Off