
ENH: Machine-readable validate output with store/reload#1822

Open
yarikoptic wants to merge 23 commits into master from enh-validators

Conversation

@yarikoptic
Member

@yarikoptic yarikoptic commented Mar 19, 2026

Summary

Design plan for machine-readable validate output with store/reload capability. Adds structured output formats, automatic persistence of validation results alongside log files, and the ability to reload and re-render results with different grouping/filtering options.

Key design decisions:

TODO

  • Step 0a: Refactor into dandi/validate/ subpackage (git mv committed separately from import updates)
  • Step 0b: Refactor cmd_validate.py — extract _collect_results(), _filter_results(), _render_results()
  • Step 0c: Add _record_version to ValidationResult
  • Step 1a: Add --format (-f) option: human|json|json_pp|json_lines|yaml
  • Step 1b: Add --output (-o) + auto-save _validation.jsonl companion file
  • Step 1c: Add --summary flag
  • Step 2: Add --load (multiple paths, mutually exclusive with positional args)
  • Step 3: Upload validation companion — persist results from dandi upload
  • Step 4: Extended grouping options: severity, id, validator, standard, dandiset
  • Step 5: --max-per-group truncation — cap results per leaf group with placeholder notice
  • Step 6: Cross-dataset sweep support + VisiData integration (optional)

--max-per-group feature (Step 5)

Limits how many results are shown per leaf group (or in the flat list when no grouping). Excess results are replaced by a TruncationNotice placeholder — a distinct data structure (not a ValidationResult), so it won't be confused with real results if the output is saved/reloaded.
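The truncation mechanics can be sketched in a few lines. `TruncationNotice` and `omitted_count` are named in this PR; `truncate_flat` below is a hypothetical illustration of the flat (ungrouped) case only:

```python
from dataclasses import dataclass

@dataclass
class TruncationNotice:
    """Placeholder indicating omitted results in truncated output."""
    #: Number of results omitted from the truncated list/group
    omitted_count: int

def truncate_flat(results: list, limit: int) -> list:
    # Cap a flat result list; excess entries collapse into one notice,
    # which is a distinct type so consumers can isinstance()-check it.
    if limit <= 0 or len(results) <= limit:
        return list(results)
    return results[:limit] + [TruncationNotice(omitted_count=len(results) - limit)]

# 147586 results with --max-per-group 5 leaves 5 results + one notice
shown = truncate_flat([f"issue-{i}" for i in range(147586)], 5)
```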

Examples (against 147k+ validation results from bids-examples)

Flat truncation (--max-per-group 5, no grouping):

[DANDI.NO_DANDISET_FOUND] .../2d_mb_pcasl — Path is not inside a Dandiset
[BIDS.JSON_KEY_RECOMMENDED] .../dataset_description.json — A JSON file is missing a key ...
[BIDS.JSON_KEY_RECOMMENDED] .../dataset_description.json — A JSON file is missing a key ...
[BIDS.JSON_KEY_RECOMMENDED] .../dataset_description.json — A JSON file is missing a key ...
[BIDS.JSON_KEY_RECOMMENDED] .../dataset_description.json — A JSON file is missing a key ...
... and 147581 more issues

Grouped truncation (-g severity --max-per-group 3):

=== ERROR (9569 issues) ===
  [DANDI.NO_DANDISET_FOUND] .../2d_mb_pcasl — Path is not inside a Dandiset
  [BIDS.NIFTI_HEADER_UNREADABLE] .../sub-1_T1w.nii.gz — We were unable to parse header data ...
  [BIDS.NIFTI_HEADER_UNREADABLE] .../sub-1_dir-AP_epi.nii.gz — We were unable to parse header data ...
  ... and 9566 more issues
=== HINT (138017 issues) ===
  [BIDS.JSON_KEY_RECOMMENDED] .../dataset_description.json — A JSON file is missing a key ...
  [BIDS.JSON_KEY_RECOMMENDED] .../dataset_description.json — A JSON file is missing a key ...
  [BIDS.JSON_KEY_RECOMMENDED] .../dataset_description.json — A JSON file is missing a key ...
  ... and 138014 more issues

These results are colorized when output is not redirected:

(screenshot of colorized terminal output)

Multi-level leaf-only truncation (-g severity -g id --max-per-group 2):

=== ERROR (9569 issues) ===
  === DANDI.NO_DANDISET_FOUND (107 issues) ===
    [DANDI.NO_DANDISET_FOUND] .../2d_mb_pcasl — Path is not inside a Dandiset
    [DANDI.NO_DANDISET_FOUND] .../7t_trt — Path is not inside a Dandiset
    ... and 105 more issues
  === BIDS.NIFTI_HEADER_UNREADABLE (4336 issues) ===
    [BIDS.NIFTI_HEADER_UNREADABLE] .../sub-1_T1w.nii.gz — We were unable to parse header data ...
    [BIDS.NIFTI_HEADER_UNREADABLE] .../sub-1_dir-AP_epi.nii.gz — We were unable to parse header data ...
    ... and 4334 more issues
  === BIDS.EMPTY_FILE (4954 issues) ===
    ...

Structured output (-g severity -f json_pp --max-per-group 2) emits _truncated placeholders:

{
  "ERROR": [
    { "id": "DANDI.NO_DANDISET_FOUND", "severity": "ERROR", ... },
    { "id": "BIDS.NIFTI_HEADER_UNREADABLE", "severity": "ERROR", ... },
    { "_truncated": true, "omitted_count": 9567 }
  ],
  "HINT": [
    { "id": "BIDS.JSON_KEY_RECOMMENDED", "severity": "HINT", ... },
    { "id": "BIDS.JSON_KEY_RECOMMENDED", "severity": "HINT", ... },
    { "_truncated": true, "omitted_count": 138015 }
  ]
}

Headers show original counts (e.g. "9569 issues") even when only a few are displayed. The _truncated sentinel follows the _record_version naming convention for metadata fields.
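On reload, the sentinel key makes placeholders easy to separate from real records. A minimal sketch (`split_records` is hypothetical; the sentinel shape matches the JSON above):

```python
import json

# Hypothetical serialized group, mirroring the json_pp example above
serialized = json.dumps([
    {"id": "BIDS.JSON_KEY_RECOMMENDED", "severity": "HINT"},
    {"id": "BIDS.JSON_KEY_RECOMMENDED", "severity": "HINT"},
    {"_truncated": True, "omitted_count": 138015},
])

def split_records(items: list[dict]) -> tuple[list[dict], list[dict]]:
    # Anything carrying the "_truncated" sentinel is a placeholder,
    # not a ValidationResult, and must not be counted as a real issue.
    real = [r for r in items if not r.get("_truncated")]
    notices = [r for r in items if r.get("_truncated")]
    return real, notices

real, notices = split_records(json.loads(serialized))
```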

Test plan

  • CLI unit tests for each --format output via click.CliRunner
  • Round-trip serialization tests for ValidationResult JSONL
  • --load with multi-file concatenation, mutual exclusivity enforcement
  • Companion auto-save creation and suppression when --output is used
  • Upload companion integration test with Docker Compose fixture
  • Extended grouping: section headers, counts, structured output unaffected
  • --max-per-group flat truncation, grouped truncation, multi-level, JSON placeholder, no-truncation when under limit
  • Unit test for _truncate_leaves() helper
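The round-trip serialization test is the core of the store/reload story. Independent of dandi internals, the idea can be sketched with plain JSONL (`write_jsonl`/`load_jsonl` are hypothetical stand-ins, not the actual `validate/_io.py` helpers):

```python
import json
import tempfile
from pathlib import Path

def write_jsonl(path: Path, records: list[dict]) -> None:
    # One JSON object per line -- the format --load (and VisiData) consumes
    with path.open("w") as f:
        for rec in records:
            f.write(json.dumps(rec) + "\n")

def load_jsonl(path: Path) -> list[dict]:
    return [json.loads(line) for line in path.read_text().splitlines() if line]

with tempfile.TemporaryDirectory() as tmp:
    p = Path(tmp) / "_validation.jsonl"
    records = [
        {"id": "DANDI.NO_DANDISET_FOUND", "severity": "ERROR", "record_version": "1"},
        {"id": "BIDS.JSON_KEY_RECOMMENDED", "severity": "HINT", "record_version": "1"},
    ]
    write_jsonl(p, records)
    roundtripped = load_jsonl(p)
```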

Some demos

See also

TODOs

  • apparently running validate without -o does not store the _validation.jsonl

Generated with Claude Code

@codecov

codecov bot commented Mar 19, 2026

Codecov Report

❌ Patch coverage is 94.01408% with 34 lines in your changes missing coverage. Please review.
✅ Project coverage is 76.28%. Comparing base (fb5e2f0) to head (1c235d0).
⚠️ Report is 8 commits behind head on master.

Files with missing lines Patch % Lines
dandi/validate/_io.py 69.56% 7 Missing ⚠️
dandi/validate/_types.py 12.50% 7 Missing ⚠️
dandi/cli/cmd_validate.py 95.38% 6 Missing ⚠️
dandi/upload.py 25.00% 3 Missing ⚠️
dandi/validate/__init__.py 0.00% 3 Missing ⚠️
dandi/files/zarr.py 0.00% 2 Missing ⚠️
dandi/bids_validator_deno/_validator.py 0.00% 1 Missing ⚠️
dandi/cli/tests/test_cmd_validate.py 99.67% 1 Missing ⚠️
dandi/files/bases.py 0.00% 1 Missing ⚠️
dandi/files/bids.py 50.00% 1 Missing ⚠️
... and 2 more
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1822      +/-   ##
==========================================
+ Coverage   75.13%   76.28%   +1.15%     
==========================================
  Files          84       87       +3     
  Lines       11931    12457     +526     
==========================================
+ Hits         8964     9503     +539     
+ Misses       2967     2954      -13     
Flag Coverage Δ
unittests 76.28% <94.01%> (+1.15%) ⬆️

Flags with carried forward coverage won't be shown.



# First produce a JSONL to load
outfile = tmp_path / "input.jsonl"
r = CliRunner().invoke(

Check warning

Code scanning / CodeQL

Variable defined multiple times (Warning, test)

This assignment to 'r' is unnecessary as it is redefined before this value is used.

Copilot Autofix

AI about 8 hours ago

In general, to fix "variable defined multiple times" issues in this pattern, you either remove the redundant earlier assignment or, if its result should be checked, add the appropriate usage (e.g., assertions) between the assignments. The goal is to ensure every assignment either contributes to program behavior or is removed.

Here, the best fix without changing functionality is to stop assigning the result of the first CliRunner().invoke(...) to r, since r is not used before being reassigned. We still need to perform that first invocation to generate outfile, so we should keep the call but drop the r = part. Concretely, in dandi/cli/tests/test_cmd_validate.py within test_validate_auto_companion_skipped_with_load, change the line:

r = CliRunner().invoke(
    main, ["validate", "-f", "json_lines", "-o", str(outfile), str(simple2_nwb)]
)

to:

CliRunner().invoke(
    main, ["validate", "-f", "json_lines", "-o", str(outfile), str(simple2_nwb)]
)

No imports, helper methods, or other definitions are needed; this is a localized change to that one assignment.

Suggested changeset 1
dandi/cli/tests/test_cmd_validate.py

Autofix patch

Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/dandi/cli/tests/test_cmd_validate.py b/dandi/cli/tests/test_cmd_validate.py
--- a/dandi/cli/tests/test_cmd_validate.py
+++ b/dandi/cli/tests/test_cmd_validate.py
@@ -880,7 +880,7 @@
     """--load suppresses auto-save companion."""
     # First produce a JSONL to load
     outfile = tmp_path / "input.jsonl"
-    r = CliRunner().invoke(
+    CliRunner().invoke(
         main, ["validate", "-f", "json_lines", "-o", str(outfile), str(simple2_nwb)]
     )
     assert outfile.exists()
EOF
"""
# Avoid heavy imports by importing with function:
from ..upload import upload
from ..upload import upload as upload_
Contributor

This is of course better than clobbering the namespace, but in the long term a better name that distinguishes API vs CLI functions (with the CLI ones usually marked private, since it is odd to expose them as library interfaces) would still be preferred

Which would be a breaking change, not saying do it here, just thinking out loud

Member Author

yeah -- typically we just clobbered click's interfaces (sorry didn't interleave... but at least manually pruned some unrelated)

❯ grep -l 'from \.\.\(.*\) import \1' dandi/cli/cmd_* | xargs grep '^def '
dandi/cli/cmd_delete.py:def delete(paths, skip_missing, dandi_instance, force, devel_debug=False):
dandi/cli/cmd_download.py:def download(
dandi/cli/cmd_move.py:def move(
dandi/cli/cmd_organize.py:def organize(
dandi/cli/cmd_upload.py:def upload(
dandi/cli/cmd_validate.py:def validate_bids(
dandi/cli/cmd_validate.py:def validate(
❯ grep 'from \.\.\(.*\) import \1' dandi/cli/cmd_*
dandi/cli/cmd_delete.py:    from ..delete import delete
dandi/cli/cmd_download.py:    from .. import download
dandi/cli/cmd_move.py:    from .. import move as move_mod
dandi/cli/cmd_organize.py:    from ..organize import organize
dandi/cli/cmd_upload.py:    from ..upload import upload
dandi/cli/cmd_validate.py:from ..validate import validate as validate_

but frankly and unfortunately here it doesn't matter much, since those click functions are not usable as Python interfaces! I wish it were otherwise. So, no point in giving them any special names really.

Contributor

@CodyCBakerPhD Mar 26, 2026

but frankly and unfortunately here it doesn't matter much since those click functions are not usable as python interfaces!

How do you mean? Any other library can import them and manipulate the click groups; sometimes that might even be intentional, but I don't think so here

from dandi.cli.command import main
from dandi.cli.cmd_upload import upload  # Implying it is not private and intended to be imported and customized

import click

@main.command("upload2")  # Or even re-register under the same name if attempting some nasty injection
@click.pass_context
def wrapped_original(ctx):
    click.echo("Before original")  # Inject custom code here
    ctx.invoke(upload)
    click.echo("After original")  # Inject more custom code here

jobs, jobs_per_file = jobs_pair

upload(
sidecar = None
Contributor

IDK about referring to this as a 'sidecar', since that could get confusing with BIDS language.

What is meant, specifically, might be a 'persistent file recording the validation results of this Dandiset'?

Even referring to it as a 'log' could get confusing with our own log file terminology (as in, the files that contain runtime errors rather than validation results)

Member Author

indeed there is a clash with "sidecar" in BIDS, but overall it carries the same meaning, just rarely used outside of BIDS. ... but here we could just call the variable validation_log_path.

As for the helper, ATM no clearly better name than the current validation_sidecar_path comes to mind.

note that the validation log is not necessarily of a dandiset -- it could be of nwb files AFAIK, or potentially multiple dandisets... what matters in this case is that it is associated with the same run for which there is a log.

Member Author

Contemplated this and decided to go with the "companion" term to describe such files, which IMHO is synonymous with "sidecar" in its meaning here. So, potentially, we could have multiple companion files accompanying a .log file

Contributor

"Companion" works

I understand your meaning of sidecar here more since completing the review

But I would still like to avoid too many overlapping terms for things in our ecosystem, so "companion" it is

yaml_load,
)
from .validate_types import (
from .validate.types import (
Contributor

Though I do believe these import changes could have been their own PR (which would have been much easier and faster to review and merge)

Member Author

IIRC I did it in a single commit, so moves are tracked etc. Could prep it as a separate PR if needed, or just mark such files "viewed" to hide them away for now

Contributor

or just mark files "viewed" with such to hide away for now

That is exactly what I do

Just letting you know for 'next time' (if there is)

class TruncationNotice:
"""Placeholder indicating omitted results in truncated output."""

omitted_count: int
Contributor

Please add a description of this dataclass field to the docstring of the class

Member Author

I think so far we never did that in this code base... we had a discussion with John a while back about that and agreed to follow a convention like

dandi/move.py:@dataclass
dandi/move.py-class Movement:
dandi/move.py-    """A movement/renaming of an asset"""
dandi/move.py-
dandi/move.py-    #: The asset's original path
dandi/move.py-    src: AssetPath
dandi/move.py-    #: The asset's destination path
dandi/move.py-    dest: AssetPath
dandi/move.py-    #: Whether to skip this operation because an asset already exists at the
dandi/move.py-    #: destination
dandi/move.py-    skip: bool = False

so we do get properly annotated within our sphinx docs on RTD: https://dandi.readthedocs.io/en/latest/modref/generated/dandi.move.html#dandi.move.Movement

Contributor

There are established ways of doing it in pydocstyle (Google or NumPy); just asking for whatever gets the information to the developer at the end of the day

Comment on lines +181 to +182
type=click.Choice(["human", "json", "json_pp", "json_lines", "yaml"]),
default="human",
Contributor

"human" is a bit of an odd value to specify here, especially against the others

Other validators refer to this as a 'summary' or something like that, right?

Member Author

we could call it text or just default... I will go for text

Member Author

I saw render functions below, so rendered came to mind, but that would be "suggesting more than it is"... I feel text is ok

Contributor

text is good and accurate 👍

Comment on lines -114 to -120
if not paths:
paths = (os.curdir,)
# below we are using load_namespaces but it causes HDMF to whine if there
# is no cached name spaces in the file. It is benign but not really useful
# at this point, so we ignore it although ideally there should be a formal
# way to get relevant warnings (not errors) from PyNWB
ignore_benign_pynwb_warnings()
Contributor

@CodyCBakerPhD Mar 25, 2026

Again, pointing out that if this PR had been broken into smaller modular PRs, the removals in this changeset may have aligned better with the relevant changes, making it easier to catch any drops or alterations of current behavior

filtered = _filter_results(results, min_severity, ignore)

if output_format == "human":
_render_human(filtered, grouping, max_per_group=max_per_group)
Contributor

The odd choice of 'human' for the value, as mentioned above, is especially observed here. This function is not creating Soylent Green.

Please choose a better name, such as '_format_summary' or '_generate_summary'

Member Author

note: there is a dedicated --summary option to add a summary statement at the bottom!

Contributor

Hmmm so the 'text' mode is not truly a summary/aggregate, just 'more human-readable' styling than JSON?

Still reports all invalidations? (though subject to filtering/grouping rules)?

Comment on lines +354 to +356
def _get_formatter(
output_format: str, out: IO[str] | None = None
) -> JSONFormatter | JSONLinesFormatter | YAMLFormatter:
Contributor

Seems odd for there not to be a 'Human (pending rename) Formatter' here to make clearer what the subformat even means

@CodyCBakerPhD
Contributor

Completed first round of review

Some additional requests:

Your examples in the PR description are very useful to see, but they might not persist as 'truth' forever

All the current tests are effectively low-level unit tests asserting direct small aspects against sidecar contents

So I would LOVE to see at least one integration test that actually compares the full resulting output file content against an expected case (ideally one per supported format). This also makes it much easier to quickly show other people what the output is expected to look like (and easy to copy/paste into a presentation!)

Also: the code for the new grouping feature seems (to me anyway, from the annotation typing to the CLI option specification) to support multiple selections at a time; but the only param test I see is for one grouping option at a time. Can you either point me to a test for the multi-selection case, or add one?

Lastly - maybe the last time I will harp on it - massive slop PRs like this make it harder for YOU (and your agent) to quickly and efficiently address all requested changes at once. Parts of this PR could have been merged directly, others could be patched up with AI in a matter of minutes, but the entire submission done together (not to mention further rounds of review) makes the process more arduous.

yarikoptic added a commit that referenced this pull request Apr 1, 2026
- Rename "human" output format to "text" throughout cmd_validate and tests
  (Click option, default values, function names _render_human → _render_text,
  _render_human_grouped → _render_text_grouped, test names, docstrings)
- Add field docstring to TruncationNotice.omitted_count
- Fix CodeQL warning: remove unused `r =` assignment in test_validate_load
- Use match statements in _get_formatter and _group_key
- Simplify cmd_upload sidecar path derivation to conditional expression
- Merge implicit string concatenation in validate/io.py warning

Co-Authored-By: Claude Code 2.1.81 / Claude Opus 4.6 <noreply@anthropic.com>
Comment on lines +354 to +356
def _get_formatter(
output_format: str, out: IO[str] | None = None
) -> JSONFormatter | JSONLinesFormatter | YAMLFormatter:

Check notice

Code scanning / CodeQL

Explicit returns mixed with implicit (fall-through) returns (Note)

Mixing implicit and explicit returns may indicate an error, as implicit returns always return None.

Copilot Autofix

AI about 8 hours ago

In general, to address "explicit returns mixed with implicit (fall through) returns", ensure that every function with explicit return statements also has an explicit return at the end, even if it's just return None (or another appropriate sentinel) and even if that line is theoretically unreachable.

For _get_formatter, the best fix that does not change functionality is to add an explicit return at the end of the function with a value that is consistent with the annotated return type. Since all valid paths already return or raise, this final return will never be executed in practice; its purpose is just to satisfy static analysis. Given the return type JSONFormatter | JSONLinesFormatter | YAMLFormatter, the most appropriate explicit return is to raise an error or return a value of that union. We already raise a ValueError in the default case. To avoid changing behavior, we should not alter that; instead, we can add a final return JSONFormatter(out=out) as a safe, unreachable default (or, if desired, choose one of the existing default behaviors). This keeps behavior the same for all reachable paths and silences the warning. The edit is confined to dandi/cli/cmd_validate.py, in the _get_formatter function, by appending one line after the match block.

No new methods or imports are needed; we already import JSONFormatter at the top of the file.

Suggested changeset 1
dandi/cli/cmd_validate.py

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/dandi/cli/cmd_validate.py b/dandi/cli/cmd_validate.py
--- a/dandi/cli/cmd_validate.py
+++ b/dandi/cli/cmd_validate.py
@@ -368,7 +368,12 @@
         case _:
             raise ValueError(f"Unknown format: {output_format}")
 
+    # Fallback return to satisfy static analysis; all valid paths above either
+    # return a formatter or raise ValueError, so this line is not expected
+    # to be reached at runtime.
+    return JSONFormatter(out=out)
 
+
 def _render_structured(
     results: list[ValidationResult],
     output_format: str,
EOF
raise SystemExit(1)


def _group_key(issue: ValidationResult, grouping: str) -> str:

Check notice

Code scanning / CodeQL

Explicit returns mixed with implicit (fall-through) returns (Note)

Mixing implicit and explicit returns may indicate an error, as implicit returns always return None.

Copilot Autofix

AI about 8 hours ago

Copilot could not generate an autofix suggestion for this alert.

and filtered
and (obj := getattr(ctx, "obj", None)) is not None
):
_auto_save_companion(filtered, obj.logfile)
Member

The companion is saved from filtered (post-filter results) rather than the raw results list. This undermines the purpose of --load: if a user runs dandi validate --min-severity ERROR, only ERROR+ records end up in the companion. When they later do dandi validate --load companion.jsonl --min-severity HINT, the HINT issues are permanently gone.

The unfiltered results list — collected before _filter_results() is applied — should be what gets written here. Filtering should be a rendering concern, not a persistence concern.

Member Author

great catch! Moreover, it makes no sense to save what was just loaded! Let's move it up right after _collect_results and simplify the conditioning to match the description in --help
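The fix discussed above reduces to an ordering constraint: persist the raw collected results first, then apply filters only for rendering. A sketch under assumed names (the helper and severity ordering are simplified illustrations, not the actual cmd_validate code):

```python
SEVERITY = {"HINT": 0, "WARNING": 1, "ERROR": 2}

def collect_save_render(results: list[dict], min_severity: str,
                        companion: list[dict]) -> list[dict]:
    # 1) Persistence: the companion gets everything that was collected,
    #    so a later --load can re-filter at any severity.
    companion.extend(results)
    # 2) Rendering: filtering applies only to what is displayed now.
    cutoff = SEVERITY[min_severity]
    return [r for r in results if SEVERITY[r["severity"]] >= cutoff]

companion: list[dict] = []
collected = [{"severity": "ERROR"}, {"severity": "HINT"}]
shown = collect_save_render(collected, "ERROR", companion)
```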

yarikoptic and others added 14 commits April 2, 2026 16:21
…e/ subpackage

Pure file move with no content changes, plus __init__.py re-exports for
backward compatibility. Imports will be updated in the next commit.

Co-Authored-By: Claude Code 2.1.63 / Claude Opus 4.6 <noreply@anthropic.com>
Update imports across 13 files to use the new subpackage structure:
- dandi.validate_types → dandi.validate.types
- dandi.validate → dandi.validate.core (for explicit imports)
- Relative imports adjusted accordingly

Co-Authored-By: Claude Code 2.1.63 / Claude Opus 4.6 <noreply@anthropic.com>
- test_validate.py → dandi/validate/tests/test_core.py
- test_validate_types.py → dandi/validate/tests/test_types.py
- Update relative imports in moved test files
- Fix circular import: don't eagerly import core in __init__.py

Co-Authored-By: Claude Code 2.1.63 / Claude Opus 4.6 <noreply@anthropic.com>
…ate CLI

Decompose the monolithic validate() click command into helpers:
- _collect_results(): runs validation and collects results
- _filter_results(): applies min-severity and ignore filters
- _process_issues(): simplified, no longer handles ignore (moved to _filter)

No behavior changes; all existing tests pass unchanged.

Co-Authored-By: Claude Code 2.1.63 / Claude Opus 4.6 <noreply@anthropic.com>
Design plan for enhancing `dandi validate` with:
- Structured output formats (-f json/json_pp/json_lines/yaml)
- Auto-save _validation.jsonl sidecar alongside .log files
- --load to reload/re-render stored results with different groupings
- Upload validation persistence for later inspection
- Extended grouping options (severity, id, validator, standard, dandiset)
- Refactoring into dandi/validate/ subpackage (git mv separately)
- _record_version field on ValidationResult for forward compatibility
- VisiData integration via native JSONL support

Addresses #1515, #1753, #1748; enhances #1743.

Co-Authored-By: Claude Code 2.1.63 / Claude Opus 4.6 <noreply@anthropic.com>
Add record_version: str = "1" for forward-compatible serialization.
Uses no underscore prefix since Pydantic v2 excludes underscore-prefixed
fields from serialization.

Co-Authored-By: Claude Code 2.1.63 / Claude Opus 4.6 <noreply@anthropic.com>
Add -f/--format {human,json,json_pp,json_lines,yaml} to produce
structured output using existing formatter infrastructure. Structured
formats suppress colored text and 'No errors found' message. Exit
code still reflects validation results.

Co-Authored-By: Claude Code 2.1.63 / Claude Opus 4.6 <noreply@anthropic.com>
- Create dandi/validate/io.py with write/append/load JSONL utilities
  and validation_sidecar_path() helper
- Add -o/--output option to write structured output to file
- Auto-save _validation.jsonl sidecar next to logfile when using
  structured format without --output

Co-Authored-By: Claude Code 2.1.63 / Claude Opus 4.6 <noreply@anthropic.com>
Add --summary/--no-summary flag that shows statistics after validation:
total issues, breakdown by severity, validator, and standard. For human
output, printed to stdout; for structured formats, printed to stderr.

Also refactors _process_issues into _render_human (no exit) + _exit_if_errors,
keeping _process_issues as backward-compatible wrapper.

Co-Authored-By: Claude Code 2.1.63 / Claude Opus 4.6 <noreply@anthropic.com>
Add --load to reload previously-saved JSONL validation results and
re-render them with different formats/filters/grouping. Mutually
exclusive with positional paths. Exit code reflects loaded results.
Skip auto-save sidecar when loading.

Co-Authored-By: Claude Code 2.1.63 / Claude Opus 4.6 <noreply@anthropic.com>
- Add validation_log_path parameter to upload()
- In upload validation loop, append results to sidecar via
  append_validation_jsonl() when validation_log_path is set
- CLI cmd_upload derives sidecar path from logfile and passes it

Co-Authored-By: Claude Code 2.1.63 / Claude Opus 4.6 <noreply@anthropic.com>
Fix mypy errors by using IO[str] instead of object for file-like
output parameters in _print_summary, _get_formatter, and
_render_structured.

Co-Authored-By: Claude Code 2.1.63 / Claude Opus 4.6 <noreply@anthropic.com>
When --output is given without explicit --format, infer the format
from the file extension: .json → json_pp, .jsonl → json_lines,
.yaml/.yml → yaml. Error only if extension is unrecognized.

Update design doc to reflect this behavior.

Co-Authored-By: Claude Code 2.1.63 / Claude Opus 4.6 <noreply@anthropic.com>
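A plausible sketch of that inference (the mapping dict and helper name are assumptions; only the extensions listed in the commit message are mapped):

```python
from pathlib import Path

EXT_TO_FORMAT = {
    ".json": "json_pp",
    ".jsonl": "json_lines",
    ".yaml": "yaml",
    ".yml": "yaml",
}

def infer_format(output: str) -> str:
    # Error only if the extension is unrecognized, per the commit message
    ext = Path(output).suffix.lower()
    if ext not in EXT_TO_FORMAT:
        raise ValueError(f"Cannot infer output format from extension {ext!r}")
    return EXT_TO_FORMAT[ext]
```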
Add severity, id, validator, standard, and dandiset as --grouping
options. Uses section headers with counts (e.g. "=== ERROR (5 issues) ===")
for human output. Structured output is unaffected (always flat).

Co-Authored-By: Claude Code 2.1.63 / Claude Opus 4.6 <noreply@anthropic.com>
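Single-level grouping with counted section headers can be sketched like this (`render_grouped` is hypothetical; the header format follows the examples in the PR description):

```python
from collections import defaultdict

def render_grouped(results: list[dict], key: str) -> list[str]:
    groups: dict[str, list[dict]] = defaultdict(list)
    for r in results:
        groups[r[key]].append(r)
    lines = []
    for name, items in groups.items():
        # Header shows the full group count, as in "=== ERROR (9569 issues) ==="
        lines.append(f"=== {name} ({len(items)} issues) ===")
        lines.extend(f"  [{r['id']}] {r['path']}" for r in items)
    return lines

out = render_grouped(
    [{"severity": "ERROR", "id": "DANDI.NO_DANDISET_FOUND", "path": ".../2d_mb_pcasl"},
     {"severity": "HINT", "id": "BIDS.JSON_KEY_RECOMMENDED",
      "path": ".../dataset_description.json"}],
    "severity",
)
```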
yarikoptic and others added 7 commits April 2, 2026 17:00
Limit how many results are shown per leaf group (or in the flat list
when no grouping is used).  Excess results are replaced by a
TruncationNotice placeholder — a distinct dataclass (not a
ValidationResult) so consumers can isinstance() check.

- TruncationNotice dataclass + LeafItem/TruncatedResults type aliases
- _truncate_leaves() walks the grouped tree, caps leaf lists
- Human output: "... and N more issues" in cyan
- Structured output: {"_truncated": true, "omitted_count": N} sentinel
- Headers show original counts including omitted items
- Works without grouping (flat list) and with multi-level grouping

Co-Authored-By: Claude Code 2.1.63 / Claude Opus 4.6 <noreply@anthropic.com>
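The leaf-capping logic can be sketched like this; `TruncationNotice` and its `omitted_count` field come from the commit, while `truncate_leaf` is a stand-in for the tree-walking `_truncate_leaves()` and handles a single leaf list only.

```python
from dataclasses import dataclass

@dataclass
class TruncationNotice:
    """Placeholder appended where results were omitted by --max-per-group."""
    omitted_count: int  # how many results were dropped from this leaf

def truncate_leaf(items: list, max_per_group) -> list:
    """Cap one leaf list, replacing the excess with a single TruncationNotice."""
    if max_per_group is None or len(items) <= max_per_group:
        return items
    return items[:max_per_group] + [TruncationNotice(len(items) - max_per_group)]
```

Because the placeholder is a distinct type, a consumer can filter it out with `isinstance(item, TruncationNotice)` before re-saving reloaded results.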
The _auto_save_sidecar() call was only in the structured-to-stdout
branch, so the default human format (the most common usage) never
wrote the _validation.jsonl sidecar next to the log file.

Move the sidecar write and _exit_if_errors() into a shared path that
runs after all rendering branches. The sidecar is now written whenever
there are results, unless --output or --load is active.

Also update the validate docstring/help text to document the sidecar
behavior, and update the design spec (Phase 1b, Phase 3, testing
strategy) to reflect the --validation-log CLI option for upload and
proper CLI integration testing via CliRunner through main().

Co-Authored-By: Claude Code 2.1.81 / Claude Opus 4.6 <noreply@anthropic.com>
- Rename "human" output format to "text" throughout cmd_validate and tests
  (Click option, default values, function names _render_human → _render_text,
  _render_human_grouped → _render_text_grouped, test names, docstrings)
- Add field docstring to TruncationNotice.omitted_count
- Fix CodeQL warning: remove unused `r =` assignment in test_validate_load
- Use match statements in _get_formatter and _group_key
- Simplify cmd_upload sidecar path derivation to conditional expression
- Merge implicitly concatenated string literals in the validate/io.py warning

Co-Authored-By: Claude Code 2.1.81 / Claude Opus 4.6 <noreply@anthropic.com>
…ave companion unfiltered in cmd_validate

In BIDS, "sidecar" specifically refers to .json files accompanying data
files. Rename all internal references to the _validation.jsonl file from
"sidecar" to "companion" to avoid confusion:

- validation_sidecar_path() → validation_companion_path()
- _auto_save_sidecar() → _auto_save_companion()
- Variable names, docstrings, comments, and spec prose

The only remaining "sidecar" reference is in validate/types.py where it
correctly describes BIDS sidecar JSON files.

Co-Authored-By: Claude Code 2.1.81 / Claude Opus 4.6 <noreply@anthropic.com>
Extend grouping test coverage from only severity to all grouping values
and composite (multi-level) grouping specs:

- Parametrize text and JSON CLI tests with 8 specs each: 5 single
  values (severity, id, validator, standard, dandiset) + 3 composite
  (severity+id, validator+severity, id+validator)
- Parametrize --load and --output tests with single and composite specs
- Add _grouping_opts() helper to compose -g args, reused across tests
- Assert known issue ID (DANDI.NO_DANDISET_FOUND) in output
- Assert nested indentation for composite groupings in text format
- Assert nested dict structure for composite groupings in JSON format

Co-Authored-By: Claude Code 2.1.81 / Claude Opus 4.6 <noreply@anthropic.com>
Eliminate duplication between write_validation_jsonl and
append_validation_jsonl by adding a keyword-only `append` parameter
to write_validation_jsonl. The two functions had identical bodies
differing only in the file open mode ("w" vs "a").

Co-Authored-By: Claude Code 2.1.81 / Claude Opus 4.6 <noreply@anthropic.com>
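The merged writer can be sketched as below; the keyword-only `append` parameter follows the commit message, while the dict-based records are a simplifying assumption (the real function serializes ValidationResult objects).

```python
import json

def write_validation_jsonl(path, results, *, append: bool = False) -> None:
    """Write results as JSON Lines; append=True adds to an existing file
    instead of overwriting — the only difference between the old two functions."""
    with open(path, "a" if append else "w") as f:
        for r in results:
            f.write(json.dumps(r) + "\n")
```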
yarikoptic added a commit that referenced this pull request Apr 2, 2026
Refactoring of codebase into dandi/validate/ subpackage for larger #1822
yarikoptic and others added 2 commits April 2, 2026 23:12
Pure git mv to preserve rename tracking. Import updates follow in
the next commit.

Co-Authored-By: Claude Code 2.1.81 / Claude Opus 4.6 <noreply@anthropic.com>
- Update all import paths to use _io, _core, _types across 16 files
- Add io functions to __init__.py re-exports and __all__
- Change load_validation_jsonl from variadic *paths to Iterable[paths]
- Move record_version check from io loader into
  ValidationResult.model_post_init (fires regardless of load method)

Co-Authored-By: Claude Code 2.1.81 / Claude Opus 4.6 <noreply@anthropic.com>
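A rough analog of the `_record_version` check described above, using a plain dataclass `__post_init__` where the real code uses pydantic's `model_post_init`; the field name without the leading underscore, the version value, and the warning text are all illustrative.

```python
import warnings
from dataclasses import dataclass

RECORD_VERSION = "1"  # illustrative current schema version

@dataclass
class ValidationResult:
    id: str
    message: str
    record_version: str = RECORD_VERSION  # the real field is `_record_version`

    def __post_init__(self) -> None:
        # Fires on every construction path — direct instantiation or JSONL
        # reload — unlike a check living only in the io loader.
        if self.record_version != RECORD_VERSION:
            warnings.warn(
                f"record written with version {self.record_version!r}, "
                f"current is {RECORD_VERSION!r}"
            )
```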
Labels: cmd-upload, cmd-validate, enhancement (New feature or request), minor (Increment the minor version when merged), UX
