feat: adapt vaf extraction for non-vaf benchmarks and fix liftover by famosab · Pull Request #174 · snakemake-workflows/dna-seq-benchmark

famosab · 2026-05-06T09:10:44Z

This adds the vaf from the callset vcf (if available) to the final fp table.

Summary by CodeRabbit

New Features
- Generates TP and TP-baseline aggregated tables for callsets and benchmarks alongside FP/FN results.
Documentation
- README updated to document new TP/TP-baseline outputs and clarify the --notemp guidance for retaining intermediates.
Refactor
- FP/FN table generation refined to conditionally include VAF when available, adjusting output column behavior.
Chores
- Several FP/FN-related outputs are no longer marked temporary and are written as persistent result files.

coderabbitai · 2026-05-06T09:15:03Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4f456088-bda2-40b3-a6e8-2ed5013851bb

📥 Commits

Reviewing files that changed from the base of the PR and between c881d9c and 0ff149f.

📒 Files selected for processing (2)

workflow/Snakefile
workflow/rules/eval-results.smk

🚧 Files skipped from review as they are similar to previous changes (1)

workflow/Snakefile

📝 Walkthrough

Walkthrough

PR adds VAF-aware FP/FN expression builders, reorders and reformats contig/liftover helper logic, expands classification outputs to include tp and tp-baseline, makes certain FP/FN-related outputs persistent, and documents the new TP deliverables.

Changes

FP/FN/TP classification and expression support

Layer / File(s)	Summary
FP/FN expression helpers with VAF support `workflow/rules/common.smk`	New `get_fp_fn_expression(wildcards)` builds CHROM/POS/ALT/REF and conditionally appends a VAF field based on `wildcards.classification` (benchmark VAF for `fn`/`tp-baseline`, callset VAF otherwise); `get_rename_expression(wildcards)` derives a 4- or 5-field rename map from that expression.
Helper function reordering and formatting `workflow/rules/common.smk`	Reorder branching in contig/liftover helpers to check `genome-build == grch37` before dict-vcf merge handling; add inline comment in `get_callset_correct_contigs`; reformat `_get_nonempty_coverages` with a parenthesized multi-line `with`.
Rule eval input expansion for tp/tp-baseline `workflow/Snakefile`	Extend `rule all` and `rule eval` input expansions for `results/report/fp-fn/callsets/{callset}/{classification}` and `results/fp-fn/benchmarks/{benchmark}.{classification}.tsv` to include `["fp","fn","tp","tp-baseline"]`.
Make FP/FN/precision-recall outputs persistent `workflow/rules/eval-results.smk`	Change several outputs (germline `extract_fp_fn`, somatic `reformat_fp_fn_tp_tables` `renamed_table`, and `collect_stratifications` TSV) from `temp(...)` to persistent `results/...tsv` paths.
Documentation updates `README.md`	Add optional per-callset and per-benchmark TP TSV entries under `results/fp-fn/` and clarify `--notemp` guidance for keeping intermediates during debugging.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

snakemake-workflows/dna-seq-benchmark#165: Also modifies FP/FN column-building helpers and VAF-related expression scaffolding.
snakemake-workflows/dna-seq-benchmark#120: Related changes to workflow/Snakefile classification expansions for fp/fn outputs.
snakemake-workflows/dna-seq-benchmark#168: Related edits to workflow/rules/eval-results.smk around which FP/FN-related TSVs are wrapped in temp(...).

Suggested reviewers

BiancaStoecker
johanneskoester

Poem

🐰 A rabbit hops through lines of code,
weaving VAF where variants showed.
TP joins FP and FN in queue,
branches tidy, outputs true.
Debugging carrots, --notemp too!

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title references VAF extraction for non-VAF benchmarks and liftover fixes, matching key changes in the PR that refactor FP/FN expression logic to handle VAF conditionally and reorganize liftover path handling.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feat/vaf

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@workflow/rules/common.smk`:
- Around line 613-615: The conditional uses the unsupported
wildcards.get("classification") API; update it to follow the file's established
pattern by checking hasattr(wildcards, "classification") and then comparing
wildcards.classification to "fn" or "tp-baseline" (i.e., replace
wildcards.get("classification") == "fn" / == "tp-baseline" with a guarded
attribute access using hasattr and wildcards.classification) so the rule's
wildcard checks match the rest of the file's usage.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 796fd9b4-a80c-432a-adba-67adb6b37d42

📥 Commits

Reviewing files that changed from the base of the PR and between 5fa52d3 and e4c9bc8.

📒 Files selected for processing (1)

workflow/rules/common.smk

coderabbitai · 2026-05-06T09:18:53Z

+        wildcards.get("classification") == "fn"
+        or wildcards.get("classification") == "tp-baseline"
+    ):


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🌐 Web query:

In Snakemake (v7.17+ and current), does the wildcardsobject support.get("name"), or is attribute/index access the supported API?

💡 Result:

In Snakemake (v7.17+ and current), the supported API for the wildcards object is attribute access (e.g. wildcards.sample or {wildcards.sample}), not .get("name"). Authoritative docs describe accessing the wildcards via the wildcards namespace like input/output (e.g. {wildcards.group} and wildcards.sample) [1][2]. While some community answers mention wildcards.get("sample"), that is not documented in the official Snakemake docs as the intended/guaranteed interface for the wildcards object. The official guidance consistently uses attribute access [1][2]. If you want safe access to an optional wildcard, use Python logic around the known wildcard keys/expected structure rather than relying on .get(). For example, validate that the wildcard exists before referencing it as an attribute. Answer: Prefer attribute/index-style access; .get("name") is not the documented/guaranteed supported API in v7.17+ and current.

Citations:

1: https://snakemake.readthedocs.io/en/stable/snakefiles/rules.html

2: https://snakemake.readthedocs.io/en/v7.0.2/tutorial/basics.html

🏁 Script executed:

cd workflow/rules && wc -l common.smk

Repository: snakemake-workflows/dna-seq-benchmark

Length of output: 96

🏁 Script executed:

# Examine the specific lines mentioned (613-615) and surrounding context sed -n '605,625p' workflow/rules/common.smk | cat -n

Repository: snakemake-workflows/dna-seq-benchmark

Length of output: 890

🏁 Script executed:

# Search for all instances of wildcards.get usage in the file rg "wildcards\.get\(" workflow/rules/common.smk

Repository: snakemake-workflows/dna-seq-benchmark

Length of output: 189

🏁 Script executed:

# Search for hasattr patterns with wildcards in the file rg "hasattr\(wildcards" workflow/rules/common.smk -A 1 -B 1

Repository: snakemake-workflows/dna-seq-benchmark

Length of output: 1681

Use wildcard attribute access with hasattr() guards instead of wildcards.get(...) for consistency and safety.

The file consistently uses hasattr(wildcards, "fieldname") + attribute access (found in 10+ functions), but lines 613-615 use the undocumented wildcards.get("classification") API. Snakemake's official documentation only specifies attribute access; .get() is not a guaranteed/supported interface.

Proposed fix

- if ( - wildcards.get("classification") == "fn" - or wildcards.get("classification") == "tp-baseline" - ): + classification = ( + wildcards.classification if hasattr(wildcards, "classification") else None + ) + if classification in {"fn", "tp-baseline"}:

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@workflow/rules/common.smk` around lines 613 - 615, The conditional uses the unsupported wildcards.get("classification") API; update it to follow the file's established pattern by checking hasattr(wildcards, "classification") and then comparing wildcards.classification to "fn" or "tp-baseline" (i.e., replace wildcards.get("classification") == "fn" / == "tp-baseline" with a guarded attribute access using hasattr and wildcards.classification) so the rule's wildcard checks match the rest of the file's usage.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@workflow/rules/common.smk`:
- Around line 496-500: The reformatted with-statement uses Python 3.10-only
syntax by placing "as f" inside parentheses; change it so the context manager
expression is assigned to a local variable first (e.g., cm =
checkpoints.stat_truth.get(benchmark=benchmark, cov=cov).output[0].open()) and
then use the traditional form "with cm as f:" so that the "as f" remains outside
parentheses; locate the expression built from
checkpoints.stat_truth.get(...).output[0].open() and refactor as described.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4fbc3c28-eed5-4874-b61d-814049122eef

📥 Commits

Reviewing files that changed from the base of the PR and between e4c9bc8 and e6642e8.

📒 Files selected for processing (1)

workflow/rules/common.smk

coderabbitai · 2026-05-06T11:22:17Z

+        with (
+            checkpoints.stat_truth.get(benchmark=benchmark, cov=cov)
+            .output[0]
+            .open() as f
+        ):


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Look for declared Python version constraints in setup/config files fd -e cfg -e toml -e txt -e yaml --exclude ".snakemake" | xargs grep -l "python_requires\|python >=\|python>=\|Programming Language :: Python :: 3\." 2>/dev/null | head -20 rg -n "min_version|python_requires|python >=" --type yaml --type toml 2>/dev/null | head -20

Repository: snakemake-workflows/dna-seq-benchmark

Length of output: 63

🏁 Script executed:

# Find all requirements/environment files fd -type f $ -name "requirements*.txt" -o -name "environment*.yml" -o -name "environment*.yaml" -o -name "setup.py" -o -name "pyproject.toml" -o -name "Snakefile" $ | head -20

Repository: snakemake-workflows/dna-seq-benchmark

Length of output: 252

🏁 Script executed:

# Look for Snakemake version constraint or min_version in Snakefiles and rules rg -l "Snakefile|\.smk" | head -10

Repository: snakemake-workflows/dna-seq-benchmark

Length of output: 100

🏁 Script executed:

# Check the actual code at lines 496-500 to confirm the syntax sed -n '490,505p' workflow/rules/common.smk

Repository: snakemake-workflows/dna-seq-benchmark

Length of output: 500

🏁 Script executed:

# Find all constraint/config files find . -maxdepth 3 $ -name "*.txt" -o -name "*.yml" -o -name "*.yaml" -o -name "*.toml" -o -name "*.py" $ 2>/dev/null | grep -iE "(require|environment|setup|config|snakemake)" | head -30

Repository: snakemake-workflows/dna-seq-benchmark

Length of output: 163

🏁 Script executed:

# Check the main Snakefile for min_version cat workflow/Snakefile | head -50

Repository: snakemake-workflows/dna-seq-benchmark

Length of output: 1601

🏁 Script executed:

# Look for requirements in the root and common directories cat requirements.txt 2>/dev/null || cat environment.yml 2>/dev/null || cat pyproject.toml 2>/dev/null || echo "No standard config found in root"

Repository: snakemake-workflows/dna-seq-benchmark

Length of output: 114

🌐 Web query:

Snakemake 7.17.1 Python version requirement minimum

💡 Result:

Minimum Python version required for Snakemake 7.17.1: Python 3.5 (i.e., “Python ≥3.5”). [1] This requirement is stated in the Snakemake 7.17.1 documentation under the setup prerequisites (“Python ≥3.5”). [1]

Citations:

1: https://snakemake.readthedocs.io/en/v7.17.1/tutorial/setup.html

🌐 Web query:

Snakemake 7.17 Python 3.8 compatibility requirements

💡 Result:

Snakemake 7.17.x requires Python >= 3.7, so it is compatible with Python 3.8 on the Python-version level.[1] However, whether it works end-to-end can depend on dependency availability/builds in the installation method you use; there are historical reports that Python 3.8 could be problematic with certain dependency stacks (e.g., datrie) even though Snakemake’s own minimum Python version includes 3.8.[2] Supporting evidence: - PyPI lists Snakemake 7.17.0 with “Python: >=3.7”, which includes Python 3.8.[1] - Snakemake’s 7.17.x changelog notes a “require at least Python 3.7” breaking change for 7.17.0.[3] - GitHub issue discussion indicates installation attempts on Python 3.8 failed in the past due to a dependency constraint, illustrating that “Python >=3.7” doesn’t automatically guarantee every dependency stack will install on 3.8.[2] Bottom line for “Python 3.8 compatibility requirements”: - Required Python version: >= 3.7 (so Python 3.8 is within the supported range).[1] - No additional “Python 3.8 only” requirement is stated in the Snakemake 7.17.0 PyPI metadata beyond the >=3.7 constraint.[1] - Practical compatibility may still be constrained by non-Python dependencies / transitive dependencies depending on your installation method/environment.[2][4]

Citations:

1: https://pypi.org/project/snakemake/7.17.0/

2: Python 3.8 not supported snakemake/snakemake#130

3: https://snakemake.readthedocs.io/en/v7.31.1/project_info/history.html

4: https://snakemake.readthedocs.io/en/v7.17.0/getting_started/installation.html

with (… as f): is Python 3.10+ syntax — incompatible with declared minimum Python 3.7.

The reformatted with statement at lines 496–500 places as f inside the parentheses, which is the Python 3.10+ syntax form. The project declares min_version("7.17.1") in the Snakefile, and Snakemake 7.17.1 supports Python ≥ 3.7. This code will fail with a SyntaxError on Python 3.7, 3.8, and (officially) 3.9.

Extract the context manager expression to a local variable so as f remains outside the parentheses:

Suggested fix

- with ( - checkpoints.stat_truth.get(benchmark=benchmark, cov=cov) - .output[0] - .open() as f - ): + stat_file = checkpoints.stat_truth.get(benchmark=benchmark, cov=cov).output[0] + with stat_file.open() as f:

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@workflow/rules/common.smk` around lines 496 - 500, The reformatted with-statement uses Python 3.10-only syntax by placing "as f" inside parentheses; change it so the context manager expression is assigned to a local variable first (e.g., cm = checkpoints.stat_truth.get(benchmark=benchmark, cov=cov).output[0].open()) and then use the traditional form "with cm as f:" so that the "as f" remains outside parentheses; locate the expression built from checkpoints.stat_truth.get(...).output[0].open() and refactor as described.

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

workflow/rules/eval-results.smk (1)
2-23: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Update extract-fp-fn.py to handle tp and tp-baseline classifications.

The script currently only filters for fp and fn classifications. With the new checkpoint at line 208 expanding classification=["fp", "fn", "tp", "tp-baseline"], the rule will execute with all four values, but the script will produce empty output files for tp and tp-baseline because neither condition matches and execution falls through to continue. Add handling for Class.TP and Class.TP_BASELINE to extract those records correctly.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@workflow/rules/eval-results.smk` around lines 2 - 23, The extract-fp-fn.py
script currently only handles Class.FP and Class.FN and falls through for other
classifications; update the script (the branch that checks the variant
classification) to also handle Class.TP and Class.TP_BASELINE by adding cases
that select and write true-positive records (and baseline true-positives)
analogous to the existing FP/FN logic so that when Snakemake's extract_fp_fn
rule runs with classification values "tp" or "tp-baseline" it emits the correct
TSV rows instead of skipping (reference the Class enum values Class.TP and
Class.TP_BASELINE and the code block where you currently have the FP/FN
filtering and the continue).

🧹 Nitpick comments (3)

workflow/rules/eval-results.smk (3)
221-257: 💤 Low value

Optional: rename the checkpoint= input key.

Both filter_shared_fn and filter_unique use checkpoint= as the input dict key, which collides visually with the Snakemake checkpoint keyword and the checkpoints.<name> accessor. Renaming to e.g. collection_done= keeps the gating intent obvious without overloading reserved-looking names.
♻️ Proposed rename
     input:
         fn="results/fp-fn-tp/benchmarks/{benchmark}.fn.tsv",
-        checkpoint=(
+        collection_done=(
             ancient("results/.checkpoints/fp-fn-tp-collection.done")
             if not config.get("stop-at-fp-fn-tp-collection")
             else []
         ),
Apply the same rename in filter_unique (lines 245-249). No script changes are needed unless snakemake.input.checkpoint is referenced inside the scripts.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@workflow/rules/eval-results.smk` around lines 221 - 257, The input key named
checkpoint in the rules filter_shared_fn and filter_unique visually collides
with Snakemake's checkpoint keyword; rename that input key to something like
collection_done in both rules (i.e., change input: checkpoint=(...) to input:
collection_done=(...)) and update any references in the workflow or scripts that
access snakemake.input.checkpoint to use snakemake.input.collection_done instead
so behavior remains unchanged.
27-42: 💤 Low value

Rename the input key tp= to something classification-agnostic.

This rule is now invoked for fp, fn, tp, and tp-baseline, so calling the input tp is misleading inside extract-fp-fn.py (snakemake.input.tp for an fn table reads oddly). Consider vcf= or calls=.
♻️ Proposed rename
 rule extract_fp_fn_tp:
     input:
-        tp="results/vcfeval/{callset}/{cov}/{classification}.norm.vcf",
+        vcf="results/vcfeval/{callset}/{cov}/{classification}.norm.vcf",
Update the corresponding snakemake.input.tp reference in workflow/scripts/ if it exists.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@workflow/rules/eval-results.smk` around lines 27 - 42, The input key named tp
in the rule extract_fp_fn_tp is misleading because the rule is used for fp, fn,
tp and tp-baseline; rename the input key to a neutral name like vcf or calls
(e.g., change tp="results/vcfeval/{callset}/{cov}/{classification}.norm.vcf" to
vcf=...) and update any references to snakemake.input.tp in the
workflow/scripts/ helpers (and any usages of get_fp_fn_expression if it
references snakemake.input.tp) to use snakemake.input.vcf (or
snakemake.input.calls) so all callers and the wrapper
("v9.4.0/bio/vembrane/table") remain consistent.
208-218: ⚡ Quick win

Consider demoting checkpoint to rule for clarity.

This is declared as a checkpoint but produces only a static touch file, and downstream rules consume it via path and ancient(...) rather than checkpoints.fp_fn_tp_collection_done.get(...). Without reliance on the dynamic-output-driven DAG re-evaluation that checkpoints provide, a regular rule is functionally equivalent and avoids the checkpoint re-eval overhead. The tp-baseline classification is actively used downstream (in reformat-fp-fn-tp-tables.py and throughout common.smk), so the expanded target set is correct.
♻️ Optional: demote to a regular rule
-checkpoint fp_fn_tp_collection_done:
+rule fp_fn_tp_collection_done:
     input:
         expand(
             "results/fp-fn-tp/benchmarks/{benchmark}.{classification}.tsv",
             benchmark=used_benchmarks,
             classification=["fp", "fn", "tp", "tp-baseline"],
         ),
     output:
         "results/.checkpoints/fp-fn-tp-collection.done",
     shell:
         "mkdir -p $(dirname {output}) && touch {output}"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@workflow/rules/eval-results.smk` around lines 208 - 218, The rule is declared
as checkpoint fp_fn_tp_collection_done but only creates a static touch file and
is consumed by downstream rules via file paths and ancient(...); change it to a
regular rule named fp_fn_tp_collection_done (demote checkpoint to rule) keeping
the same input expand(...) list (including "tp-baseline") and the same output
and shell, so DAG re-evaluation overhead is avoided; update any references if
you used checkpoints.* access (none expected since consumers use paths and
ancient(...)), and ensure downstream scripts like reformat-fp-fn-tp-tables.py
and references in common.smk continue to rely on the produced
"results/.checkpoints/fp-fn-tp-collection.done" file.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@README.md`:
- Around line 36-38: Update the fenced code block containing the command
"snakemake --until eval_fp_fn_tp_collection" to include a language identifier
(e.g., change the opening ``` to ```bash) so the snippet is syntax-highlighted
and more readable; locate the code block in README.md and modify only the
opening fence to add the language token.

---

Outside diff comments:
In `@workflow/rules/eval-results.smk`:
- Around line 2-23: The extract-fp-fn.py script currently only handles Class.FP
and Class.FN and falls through for other classifications; update the script (the
branch that checks the variant classification) to also handle Class.TP and
Class.TP_BASELINE by adding cases that select and write true-positive records
(and baseline true-positives) analogous to the existing FP/FN logic so that when
Snakemake's extract_fp_fn rule runs with classification values "tp" or
"tp-baseline" it emits the correct TSV rows instead of skipping (reference the
Class enum values Class.TP and Class.TP_BASELINE and the code block where you
currently have the FP/FN filtering and the continue).

---

Nitpick comments:
In `@workflow/rules/eval-results.smk`:
- Around line 221-257: The input key named checkpoint in the rules
filter_shared_fn and filter_unique visually collides with Snakemake's checkpoint
keyword; rename that input key to something like collection_done in both rules
(i.e., change input: checkpoint=(...) to input: collection_done=(...)) and
update any references in the workflow or scripts that access
snakemake.input.checkpoint to use snakemake.input.collection_done instead so
behavior remains unchanged.
- Around line 27-42: The input key named tp in the rule extract_fp_fn_tp is
misleading because the rule is used for fp, fn, tp and tp-baseline; rename the
input key to a neutral name like vcf or calls (e.g., change
tp="results/vcfeval/{callset}/{cov}/{classification}.norm.vcf" to vcf=...) and
update any references to snakemake.input.tp in the workflow/scripts/ helpers
(and any usages of get_fp_fn_expression if it references snakemake.input.tp) to
use snakemake.input.vcf (or snakemake.input.calls) so all callers and the
wrapper ("v9.4.0/bio/vembrane/table") remain consistent.
- Around line 208-218: The rule is declared as checkpoint
fp_fn_tp_collection_done but only creates a static touch file and is consumed by
downstream rules via file paths and ancient(...); change it to a regular rule
named fp_fn_tp_collection_done (demote checkpoint to rule) keeping the same
input expand(...) list (including "tp-baseline") and the same output and shell,
so DAG re-evaluation overhead is avoided; update any references if you used
checkpoints.* access (none expected since consumers use paths and ancient(...)),
and ensure downstream scripts like reformat-fp-fn-tp-tables.py and references in
common.smk continue to rely on the produced
"results/.checkpoints/fp-fn-tp-collection.done" file.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e544d302-4bee-42a3-98eb-ce51d4c23655

📥 Commits

Reviewing files that changed from the base of the PR and between e6642e8 and fa20213.

📒 Files selected for processing (5)

README.md
workflow/Snakefile
workflow/rules/annotation.smk
workflow/rules/common.smk
workflow/rules/eval-results.smk

🚧 Files skipped from review as they are similar to previous changes (1)

workflow/rules/common.smk

Copilot

Pull request overview

This PR extends the FP/FN/TP table pipeline to optionally include VAF extracted from the callset VCF, adds a convenience “stop after table collection” target, and adjusts contig-correction helper logic intended to fix liftover behavior.

Changes:

Add eval_fp_fn_tp_collection phony target to build aggregated FP/FN/TP (+ TP-baseline) tables without running downstream unique/shared filtering.
Refactor Vembrane table extraction/renaming expressions to include VAF when available (without requiring a benchmark-level VAF config).
Update contig-correction helper functions for liftover/merge selection, and document new TP outputs in the README.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File	Description
`workflow/Snakefile`	Adds a phony target to build aggregated FP/FN/TP tables as an intermediate stopping point.
`workflow/rules/common.smk`	Adjusts liftover/rename input selection helpers and refactors optional VAF column extraction/renaming logic.
`README.md`	Documents optional TP/TP-baseline aggregated outputs and cleans up intermediate/cleanup bullets.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

 def get_callset_correct_contigs(wildcards):
    callset = config["variant-calls"][wildcards.callset]
    vcf = callset["path"]
-    if callset.get("rename-contigs", False):
-        return "results/normalized-variants/{callset}.replaced-contigs.vcf.gz"
-    elif callset.get("genome-build", "grch38") == "grch37":
-        return "results/normalized-variants/{callset}.lifted.vcf.gz"
-    elif isinstance(vcf, dict):
+    # Input for liftover must be the pre-liftover callset.
+    if isinstance(vcf, dict):
        return "results/merge-callsets/{callset}.merged.vcf.gz"
    else:
        return get_raw_callset(wildcards)


    callset = config["variant-calls"][wildcards.callset]
    vcf = callset["path"]
-    if isinstance(vcf, dict):
+    if callset.get("genome-build", "grch38") == "grch37":


fix: adapt vaf extraction for non-vaf benchmarks

e4c9bc8

coderabbitai Bot reviewed May 6, 2026

View reviewed changes

fix: enable liftover for grch37 reference (#173)

e6642e8

famosab changed the title ~~feat: adapt vaf extraction for non-vaf benchmarks~~ feat: adapt vaf extraction for non-vaf benchmarks and fix liftover May 6, 2026

coderabbitai Bot reviewed May 6, 2026

View reviewed changes

feat: add rendering of tp table

fa20213

coderabbitai Bot reviewed May 7, 2026

View reviewed changes

Comment thread README.md Outdated

famosab added 10 commits May 7, 2026 15:21

fix: linting

c9d7bc2

fix: rule all

d77967c

fix: revert unneeded changes

9317857

fix: revert unneeded changes

6d78fa5

fix: revert unneeded changes

2d903bf

snakefmt

739f285

fix: undo folder renaming

bf3ec27

fix: undo folder renaming

6a9dca7

fix: revert unneeded changes

eae793a

fix: revert unneeded changes

baa479f

famosab requested a review from Copilot May 11, 2026 15:13

Copilot started reviewing on behalf of famosab May 11, 2026 15:13 View session

Copilot AI reviewed May 11, 2026

View reviewed changes

famosab and others added 4 commits May 18, 2026 10:06

fix: add tp to rule all

c881d9c

fix: undo some temp

d7afdb4

fix: add tp to rule all

26e2cb0

Merge branch 'main' into feat/vaf

0ff149f

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: adapt vaf extraction for non-vaf benchmarks and fix liftover#174

feat: adapt vaf extraction for non-vaf benchmarks and fix liftover#174
famosab wants to merge 17 commits into
mainfrom
feat/vaf

famosab commented May 6, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 6, 2026 •

edited

Loading

Reviews paused

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot May 6, 2026

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot May 6, 2026

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

famosab commented May 6, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 6, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 6, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

famosab commented May 6, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 6, 2026 •

edited

Loading