[DNM]v5.10.0 20260325 #3044

Open
FrankApiyo wants to merge 54 commits into v5.10.0-20260128 from v5.10.0-20260325

@FrankApiyo

Changes / Features implemented

Steps taken to verify this change does what is intended

Side effects of implementing this change

Before submitting this PR for review, please make sure you have:

  • Included tests
  • Updated documentation

Closes #

kelvin-muchiri and others added 15 commits February 25, 2026 11:25
* Trigger decryption if all media files have been received
* Retry the decryption task if decryption fails because not all media files have been received.
…ed (#2983)

Fix decryption failures for encrypted submissions that were collected before managed encryption was disabled for a form but submitted after it was disabled.
Regenerate all .pip files with latest dependency
versions via pip-compile.
pyxform 4.3.1 changed the validation error message for
invalid question names. Update the expected string in
test_survey_preview_endpoint to match.
The pyxform library changed its error message for invalid entity
declarations. The test now expects the new message which focuses
on missing labels rather than single-entity-per-form validation.
User-supplied survey data starting with =, +, -, @, \t, or \r is now
prefixed with a single quote before being written to CSV/XLSX exports.
This prevents spreadsheet applications from interpreting cell values as
executable formulas.

Resolves ONADATA-511
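
As a rough illustration of the rule this commit message describes, the sketch below prefixes formula-like strings with a single quote. The names `sanitize_for_export` and `_FORMULA_PREFIXES` appear in these commit messages, but the body shown here is an assumption, not the project's actual code, and it covers only the initial prefix list.

```python
# Prefixes listed in the commit message. The function body is an
# illustrative assumption, not the project's implementation.
_FORMULA_PREFIXES = ("=", "+", "-", "@", "\t", "\r")


def sanitize_for_export(value):
    """Prefix formula-like strings with a single quote; pass others through."""
    if isinstance(value, str) and value.startswith(_FORMULA_PREFIXES):
        return "'" + value
    return value
```

With this sketch, `sanitize_for_export("=SUM(A1:A9)")` returns `'=SUM(A1:A9)` with the leading quote, while non-string values such as integers are returned unchanged.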
csv_builder and export_builder import each other, so placing the
function in common_tools (which both already depend on) breaks the
cycle.
- Negative numeric strings (e.g. GPS coords like -1.2627557) were
  being incorrectly prefixed with a single quote. Now values starting
  with - followed by a digit or decimal point are left unchanged.
- Also apply sanitize_for_export in to_zipped_csv's write_row which
  was a separate code path bypassing write_to_csv.
OWASP CSV Injection guidance lists \n as a dangerous prefix alongside
\t and \r. Added to _FORMULA_PREFIXES and updated tests.
The previous check only looked at the first char after the dash, so
"-1+1" (a formula) was incorrectly skipped. Now the entire value
(including space-separated GPS tokens) must parse as float(s) to be
considered a safe negative number.
Extends formula injection prevention (CWE-1236) to cover column
headers, labels, and HXL rows — not just data cells. Applies
sanitize_for_export to all 9 header-row write sites across
csv_builder and export_builder.
…ents

- Extend signed-number carve-out to cover + prefix (e.g. "+1.5") in
  addition to - prefix, so legitimate positive-signed values are not
  corrupted during export sanitization.
- Add integration test verifying that label rows containing formula
  prefixes are sanitized in both CSV and XLSX exports.
- Add clarifying code comments per review feedback: explain why "-1+1"
  fails float() parsing, and document the all-tokens-must-parse security
  guarantee in the GPS carve-out logic.
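
The signed-number carve-out described above can be sketched as follows. The helper name `is_signed_number` is hypothetical; the point is that a value is exempt from quoting only when every whitespace-separated token parses with `float()`, so a GPS string passes while `-1+1` does not.

```python
def is_signed_number(value: str) -> bool:
    """Return True only if value starts with + or - and every token is a float.

    Hypothetical helper: a single non-numeric token makes the whole value
    unsafe, so a formula cannot hide behind a leading number.
    """
    if not value.startswith(("+", "-")):
        return False
    try:
        for token in value.split():
            float(token)  # "-1+1" raises ValueError: it is not a float literal
    except ValueError:
        return False
    return True
```

A caller would skip the single-quote prefix only when this returns `True`. For example, `is_signed_number("-1.2627557 36.7908 0 0")` is `True`, whereas `is_signed_number("-1+1")` is `False`.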
Closes two CSV injection gaps identified against defusedcsv:
- Add % to formula prefix triggers (older LibreOffice builds)
- Escape inner | characters to prevent DDE-style payloads
  (e.g. =DDE|cmd|'/C calc') even in non-prefixed values
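
A minimal sketch of the two additions in this commit, assuming a backslash is used to escape the pipe character (the commit message does not say which escape form the project uses). The function name `sanitize_with_pipes` is hypothetical, and the signed-number carve-out from the earlier commits is left out to keep the example short.

```python
# Prefix list extended with "%" and "\n" as described in the commit messages.
FORMULA_PREFIXES = ("=", "+", "-", "@", "%", "\t", "\r", "\n")


def sanitize_with_pipes(value):
    """Escape inner pipes, then quote values that start with a formula prefix."""
    if not isinstance(value, str):
        return value
    # Assumption: "|" is neutralised with a backslash, even in values that
    # do not start with a formula prefix.
    value = value.replace("|", "\\|")
    if value.startswith(FORMULA_PREFIXES):
        value = "'" + value
    return value
```

Escaping the pipe before checking the prefix means a non-prefixed value such as `a|b` is still rewritten, which matches the "even in non-prefixed values" wording above.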
Mitigate supply-chain risk by pinning all GitHub Actions to immutable
commit hashes instead of mutable tags. Update trivy-action to v0.35.0,
setup-trivy to v0.2.6, trivy to v0.69.3, and add VEX hub configuration.
Add github-actions Dependabot ecosystem for automated updates.

Ref: aquasecurity/trivy#10425
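
One way to check the pinning rule is a small script that scans workflow text for `uses:` references that are not 40-character commit hashes. This is a sketch, not part of the PR: the function name `unpinned_actions` is hypothetical, the SHA in the sample is a placeholder, and quoted `uses:` values are not handled.

```python
import re

# Matches "uses: owner/repo@ref"; local actions ("./path") have no "@ref".
USES_RE = re.compile(r"^\s*(?:-\s*)?uses:\s*([^\s@#]+)@([^\s#]+)", re.MULTILINE)
SHA_RE = re.compile(r"[0-9a-f]{40}")


def unpinned_actions(workflow_text: str) -> list:
    """Return every 'action@ref' whose ref is not a full commit hash."""
    return [
        f"{action}@{ref}"
        for action, ref in USES_RE.findall(workflow_text)
        if not SHA_RE.fullmatch(ref)
    ]


SAMPLE = """\
steps:
  - uses: actions/checkout@v4
  - uses: aquasecurity/trivy-action@0123456789abcdef0123456789abcdef01234567 # v0.35.0
  - uses: ./local-action
"""
```

Calling `unpinned_actions(SAMPLE)` reports only `actions/checkout@v4`, since the second entry is pinned to a hash and the third is a local action.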
@FrankApiyo FrankApiyo changed the base branch from main to v5.10.0-20260128 March 25, 2026 08:17
The Docker build fails because openssl-provider-legacy 3.6.1-3 from
Debian unstable tries to overwrite legacy.so owned by bookworm's
libssl3. Adding apt pin priority 100 for unstable prevents its packages
from being pulled in as automatic dependencies.
The merge conflict resolution incorrectly accepted pyxform==4.3.1 from
the incoming branch, but setup.cfg and base.in still depend on the
onaio/pyxform fork at fix-osm-attribute-error. This caused a pip
resolution conflict during Docker build.
The onaio/python-deps:3.10.19-20260119 build image compiles Python
against GLIBC 2.38, but the Bookworm runtime only ships GLIBC 2.36,
causing "GLIBC_2.38 not found" errors at container startup.

Switch the runtime stage from debian:bookworm to debian:trixie which
provides GLIBC 2.38+. This also removes the unstable repo pinning
workaround since Trixie ships the required packages natively.
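
A mismatch like this can be checked from inside the runtime container with Python's standard `platform.libc_ver()`. The sketch below is illustrative only: the helper names are hypothetical, and versions are compared as integer tuples because a plain string comparison would order them incorrectly.

```python
import platform
from typing import Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '2.38' into (2, 38) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def glibc_satisfies(available: str, required: str) -> bool:
    """True if the available glibc version is at least the required one."""
    return parse_version(available) >= parse_version(required)


def runtime_glibc() -> Optional[str]:
    """Return the glibc version of the running interpreter, or None."""
    lib, version = platform.libc_ver()
    return version if lib == "glibc" and version else None
```

With the versions named in the commit message, `glibc_satisfies("2.36", "2.38")` is `False`, which corresponds to the startup failure on Bookworm.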
Replace libpcre3 with libpcre2-8-0 and openjdk-17-jre-headless with
default-jre-headless, as Trixie no longer ships the older packages.
Cherry-picked from c659c5b (PR #3013). Accepted newer dependency
versions already on this branch where conflicts arose.
Cherry-picked from 71aff96 (PR #3013). Switch runtime base to
dhi.io/debian-base:bookworm-debian12 hardened image.
Cherry-picked from 4982a55 (PR #3013). Kept newer dependency
versions already on this branch where conflicts arose.
Cherry-picked from 482c373 (PR #3013). Use Trivy's built-in
@/contrib/html.tpl template instead of downloading separately.
Kept pinned action SHA and trivy-config.
Cherry-picked from f31da68 (PR #3013). Kept v0.69.3 (newer than
incoming v0.69.1) and existing VEX Hub configuration.
Cherry-picked from f53c81a (PR #3013). Kept existing pinned Trivy
setup and VEX Hub configuration.
Cherry-picked from b56117d (PR #3013).
Cherry-picked from 7230c37 (PR #3013). Replace SSH agent with
BuildKit secrets for GitHub token authentication.
Cherry-picked from 1f6f23c. Use meta.outputs.tags and add
provenance: false, switch to GHA cache, add setuptools for Python 3.13.
valigetta 0.2.1 depends on cryptography<45, which conflicts with
cryptography==46.0.5. Upgrade to v0.2.2 which supports cryptography>=46.
The pyxform fork changed the error message for invalid question names
from the old format to a new one that includes the actual question name.
@FrankApiyo FrankApiyo marked this pull request as draft March 26, 2026 07:08
The SavWriter call hardcoded ioLocale="en_US.UTF-8", which calls
locale.setlocale(LC_ALL, "en_US.UTF-8") internally. This fails on
containers that don't have the en_US.UTF-8 locale installed.

The ioLocale parameter is unnecessary because ioUtf8=True (already
set via _get_sav_options) controls UTF-8 encoding of SPSS file data.
The C locale set by ioLocale only affects OS-level string collation
and number display formatting, neither of which affects SPSS output
since numbers are stored as IEEE 754 binary doubles.

Fixes: ONADATA-991
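
To illustrate why the locale does not matter for numeric data, the sketch below packs a float as an 8-byte IEEE 754 double with the standard `struct` module. It does not use SavWriter; the helper names are hypothetical and the little-endian byte order is chosen only for the example.

```python
import struct


def encode_double(value: float) -> bytes:
    """Pack a float as an 8-byte IEEE 754 double (little-endian for illustration)."""
    return struct.pack("<d", value)


def decode_double(raw: bytes) -> float:
    """Unpack an 8-byte little-endian IEEE 754 double."""
    return struct.unpack("<d", raw)[0]
```

`encode_double(1.0).hex()` is `000000000000f03f` under any locale, because the bytes encode the sign, exponent, and mantissa directly and contain no decimal separator or digit grouping.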
In pyxform 4.0.0+ (commit 75679b3), the Itemset class was introduced
to improve choices handling performance. This changed survey.get("choices")
from returning {list_name: [list_of_dicts]} to {list_name: Itemset}.

The _get_sav_value_labels method falls back to survey.get("choices") when
a question's to_json_dict() has no "children" key. With pyxform 4.x, this
fallback returns an Itemset object which is not iterable, causing:
  TypeError: 'Itemset' object is not iterable

The fix unwraps Itemset objects to their .options tuple (containing Option
objects that support dict-like access via the Mapping protocol) before
iterating over choices.
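
The unwrapping step can be sketched without importing pyxform. `FakeItemset` below is a hypothetical stand-in that only mimics the two properties the commit message relies on (an `.options` tuple and no iteration support), and `unwrap_choices` is an illustrative name rather than the project's function.

```python
from dataclasses import dataclass


@dataclass
class FakeItemset:
    """Hypothetical stand-in for pyxform's Itemset: has .options, is not iterable."""
    name: str
    options: tuple


def unwrap_choices(choices: dict) -> dict:
    """Map each list name to a plain list of options.

    Objects with an .options attribute are unwrapped; plain lists (the
    pre-4.x shape) are passed through unchanged.
    """
    return {
        list_name: list(getattr(itemset, "options", itemset))
        for list_name, itemset in choices.items()
    }
```

Using `getattr` with a default keeps the same code path working for both shapes, so the caller can iterate over the result whichever pyxform version produced it.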
…call

SavWriter defaults to calling getpass.getuser() when fileLabel is None,
which fails in containers where the UID has no /etc/passwd entry
(OSError: No username set in the environment).
Ensures _get_sav_options always includes fileLabel to prevent
getpass.getuser() from being called during SAV exports.
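
A minimal sketch of that guarantee, assuming the options are built as a plain dict. The function name `get_sav_options`, its `extra` parameter, and the default label text are hypothetical; only the `ioUtf8` and `fileLabel` keys come from the commit messages.

```python
# Hypothetical default; the actual label text used by the project is not
# stated in the commit message.
DEFAULT_FILE_LABEL = "SAV export"


def get_sav_options(extra=None):
    """Build writer options that always carry a non-empty fileLabel."""
    options = {"ioUtf8": True}
    if extra:
        options.update(extra)
    # A missing or None fileLabel is what lets the writer fall back to
    # getpass.getuser(), so fill it in here.
    if not options.get("fileLabel"):
        options["fileLabel"] = DEFAULT_FILE_LABEL
    return options
```

Because the label is filled in at the point where options are built, every export path that goes through this function avoids the username lookup, including callers that pass `fileLabel=None` explicitly.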
@FrankApiyo FrankApiyo marked this pull request as ready for review May 27, 2026 10:32
@FrankApiyo FrankApiyo changed the title V5.10.0 20260325 [DNM]v5.10.0 20260325 May 27, 2026


3 participants