NVIDIA-NeMo · lipikaramaswamy · May 13, 2026 · May 8, 2026 · May 9, 2026 · May 9, 2026
@@ -103,6 +103,9 @@ CLAUDE.local.md
 .claude/settings.local.json
 ai/tmp/
 
+# Claude worktrees
+.claude/worktrees/
+
 # Anonymizer execution artifacts
 .anonymizer-artifacts/
 docs/notebook_source/data/synth_bios_sample10_anonymized.csv

@@ -0,0 +1,118 @@
+<!-- SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. -->
+<!-- SPDX-License-Identifier: Apache-2.0 -->
+
+# AGENTS.md
+
+This file is for agents **developing** NeMo Anonymizer — the codebase you are working in.
+If you are an agent helping a user **anonymize data**, use the [product documentation](https://nvidia-nemo.github.io/Anonymizer/) instead.
+
+**NeMo Anonymizer** detects and protects PII through context-aware entity replacement and LLM-powered rewriting. Users supply a text dataset and a strategy; Anonymizer detects entities and transforms the text.
+
+## Agent compatibility
+
+`AGENTS.md` is the canonical instruction file for coding agents working in this repository. Keep it tool-neutral:
+
+- Use plain Markdown and repository-relative links.
+- Do not rely on vendor-specific include syntax, slash commands, MCP names, or IDE-only behavior.
+- Put tool-specific adapter instructions in thin wrapper files such as `CLAUDE.md`.
+
+## Module Map
+
+`nemo-anonymizer` is a single package with three primary subpackages plus top-level public utilities:
+
+- **`anonymizer.config`** — user-facing configuration: `AnonymizerConfig`, `AnonymizerInput`, replace strategies (`Substitute`, `Redact`, `Annotate`, `Hash`), and rewrite config (`Rewrite`, `EvaluationCriteria`, `RiskTolerance`). New user-facing knobs go here.
+- **`anonymizer.engine`** — internal pipeline implementation: detection, replacement, and rewrite sub-workflows, the NDD adapter, prompt utilities, and all `COL_*` column constants. Never imported directly by users.
+- **`anonymizer.interface`** — user-facing entry points: the `Anonymizer` class, CLI, `AnonymizerResult`, `PreviewResult`, and canonical error types. Thin layer that wires config → engine and exposes results.
+- **`anonymizer.logging`** — public logging configuration (`LoggingConfig`, `configure_logging`) used by the API, CLI, and examples.
+
+NeMo Anonymizer wraps [DataDesigner](https://github.com/NVIDIA-NeMo/DataDesigner) (NDD) for LLM column generation. `NddAdapter.run_workflow()` is the engine boundary for *executing* DataDesigner workflows — engine sub-workflows may declare DataDesigner column configs (e.g. `LLMStructuredColumnConfig`), but they do not call `DataDesigner.create()` or `preview()` directly.
+
+## Core Concepts
+
+- **Entity** — a detected span of text with a label (e.g. `"Alice"` → `first_name`) and character offsets
+- **Latent entity** — an entity detected in rewrite mode that is sensitive but not directly named; used to guide rewriting without explicit replacement
+- **Replacement map** — a per-record dict mapping entity text → substitute value, built by `LlmReplaceWorkflow` and injected into rewrite prompts
+- **Leakage mass** — a weighted score measuring how much sensitive information survives in a rewritten record; drives the repair loop
+- **Utility score** — a 0–1 score measuring how much semantic content the rewritten record preserves
+- **RiskTolerance** — a preset (`minimal` / `low` / `moderate` / `high`) that bundles the leakage threshold, repair behaviour, and human-review flags into a single user-facing knob
+- **Repair loop** — the evaluate → repair → re-evaluate cycle in `RewriteWorkflow`; runs up to `max_repair_iterations` times on failing rows
+- **FailedRecord** — a record that was dropped by an NDD workflow; surfaced explicitly rather than silently lost
+
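A minimal sketch of how an entity span and a replacement map fit together, for readers new to these terms. The `Entity` dataclass and `apply_replacement_map` helper are hypothetical names invented for this note, not the engine's actual types; the only details taken from the text above are that an entity carries a label and character offsets and that the map goes from entity text to a substitute value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """Hypothetical stand-in for a detected entity: text, label, character offsets."""

    text: str
    label: str
    start: int  # inclusive character offset
    end: int  # exclusive character offset


def apply_replacement_map(text: str, entities: list[Entity], replacement_map: dict[str, str]) -> str:
    """Replace each entity span with its mapped substitute; unmapped spans stay as-is."""
    # Work right-to-left so earlier offsets remain valid after each splice.
    for entity in sorted(entities, key=lambda e: e.start, reverse=True):
        substitute = replacement_map.get(entity.text)
        if substitute is not None:
            text = text[: entity.start] + substitute + text[entity.end :]
    return text


record = "Alice met Bob in Paris."
entities = [
    Entity("Alice", "first_name", 0, 5),
    Entity("Bob", "first_name", 10, 13),
]
replacement_map = {"Alice": "Maria", "Bob": "Tom"}
result = apply_replacement_map(record, entities, replacement_map)
```

Splicing from the last span backwards is one simple way to keep offsets stable when substitutes differ in length from the original text.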
+## Pipelines
+
+### Replace mode — `AnonymizerConfig(replace=...)`
+
+```
+input_df
+  → EntityDetectionWorkflow.run()              # engine/detection/detection_workflow.py
+        GLiNER detection
+        → parse + tag
+        → LLM augmentation  (add entities GLiNER missed)
+        → LLM validation    (keep / drop candidates)
+        → merge + finalize  → COL_DETECTED_ENTITIES, COL_FINAL_ENTITIES
+  → ReplacementWorkflow.run()                  # engine/replace/replace_runner.py
+        Redact / Annotate / Hash  → applied locally, no LLM
+        Substitute                → LlmReplaceWorkflow → NddAdapter
+  → output: {text_col}_replaced, {text_col}_with_spans, final_entities
+```
+
+### Rewrite mode — `AnonymizerConfig(rewrite=...)`
+
+```
+input_df
+  → EntityDetectionWorkflow.run()              # same as above, plus latent entity tagging
+  → RewriteWorkflow.run()                      # engine/rewrite/rewrite_workflow.py
+        LlmReplaceWorkflow.generate_map_only() # build replacement map for prompt
+        → single NDD adapter call (pipeline_columns):
+              DomainClassificationWorkflow    → _domain, _domain_supplement
+              SensitivityDispositionWorkflow  → _sensitivity_disposition
+              QAGenerationWorkflow            → _quality_qa, _privacy_qa
+              RewriteGenerationWorkflow       → _rewritten_text
+        → evaluate-repair loop (up to max_repair_iterations):
+              EvaluateWorkflow                → leakage_mass, utility_score, _needs_repair
+              RepairWorkflow                  → _rewritten_text (failing rows only)
+        → FinalJudgeWorkflow (non-critical)   → _judge_evaluation, needs_human_review
+  → output: {text_col}_rewritten, utility_score, leakage_mass, needs_human_review, …
+```
+
+Records with no detected entities skip all LLM sub-workflows and pass through with default metrics (utility=1.0, leakage=0.0).
+
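The evaluate, repair, re-evaluate cycle described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the `RewriteWorkflow` implementation: `Row`, `evaluate_row`, and `run_repair_loop` are hypothetical names, the real pipeline operates on DataFrame columns through NDD, and treating "leakage above a threshold" as the failure condition is an assumption.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Row:
    """Hypothetical per-record state; the real pipeline keeps these as columns."""

    rewritten_text: str
    leakage_mass: float = 0.0
    needs_repair: bool = False


def evaluate_row(row: Row, evaluate: Callable[[str], float], leakage_threshold: float) -> None:
    """Score one row and flag it when leakage exceeds the threshold (assumed rule)."""
    row.leakage_mass = evaluate(row.rewritten_text)
    row.needs_repair = row.leakage_mass > leakage_threshold


def run_repair_loop(
    rows: list[Row],
    evaluate: Callable[[str], float],
    repair: Callable[[str], str],
    leakage_threshold: float,
    max_repair_iterations: int,
) -> list[Row]:
    """Evaluate every row once, then repair and re-evaluate only the failing rows."""
    for row in rows:
        evaluate_row(row, evaluate, leakage_threshold)
    for _ in range(max_repair_iterations):
        failing = [row for row in rows if row.needs_repair]
        if not failing:
            break
        for row in failing:
            row.rewritten_text = repair(row.rewritten_text)
            evaluate_row(row, evaluate, leakage_threshold)
    return rows


def count_leaks(text: str) -> float:
    """Toy evaluator: each surviving 'Alice' counts as one unit of leakage."""
    return float(text.count("Alice"))


def repair_one_leak(text: str) -> str:
    """Toy repair: remove one leak per pass."""
    return text.replace("Alice", "someone", 1)


rows = run_repair_loop(
    [Row("Alice told Alice's boss."), Row("Nothing sensitive.")],
    evaluate=count_leaks,
    repair=repair_one_leak,
    leakage_threshold=0.0,
    max_repair_iterations=3,
)
```

In this toy run the first row needs two repair passes and the second row is never sent to `repair`, which mirrors the "failing rows only" note in the diagram.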
+## Config Pattern
+
+`AnonymizerConfig.rewrite` is the user-facing `Rewrite` model. The engine never receives `Rewrite` directly; it receives `EvaluationCriteria` via the `Rewrite.evaluation` property. See that property's docstring for the sync contract: how `risk_tolerance` and `max_repair_iterations` flow into the engine, and why production code should not duplicate the mapping.
+
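A rough sketch of the "one property owns the mapping" pattern this section describes. Everything here is illustrative: the field names on `EvaluationCriteria`, the default values, and especially the threshold numbers are placeholders made up for the example, and plain dataclasses are used only to keep the sketch dependency-free.

```python
from dataclasses import dataclass
from enum import Enum


class RiskTolerance(str, Enum):
    MINIMAL = "minimal"
    LOW = "low"
    MODERATE = "moderate"
    HIGH = "high"


@dataclass(frozen=True)
class EvaluationCriteria:
    """What the engine consumes (field names are illustrative)."""

    leakage_threshold: float
    max_repair_iterations: int


# Placeholder numbers chosen for this example only; not the project's presets.
ILLUSTRATIVE_THRESHOLDS: dict[RiskTolerance, float] = {
    RiskTolerance.MINIMAL: 0.0,
    RiskTolerance.LOW: 0.1,
    RiskTolerance.MODERATE: 0.25,
    RiskTolerance.HIGH: 0.5,
}


@dataclass(frozen=True)
class Rewrite:
    """User-facing knobs (defaults here are assumptions)."""

    risk_tolerance: RiskTolerance = RiskTolerance.LOW
    max_repair_iterations: int = 2

    @property
    def evaluation(self) -> EvaluationCriteria:
        """The single place where user-facing knobs become engine criteria."""
        return EvaluationCriteria(
            leakage_threshold=ILLUSTRATIVE_THRESHOLDS[self.risk_tolerance],
            max_repair_iterations=self.max_repair_iterations,
        )


criteria = Rewrite(risk_tolerance=RiskTolerance.MODERATE, max_repair_iterations=3).evaluation
```

Because callers only ever read `.evaluation`, a change to the preset table cannot drift out of sync with a second, hand-written copy of the mapping elsewhere.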
+## NDD Adapter
+
+`NddAdapter.run_workflow()` (`engine/ndd/adapter.py`) is the engine boundary for *executing* DataDesigner workflows. See its docstring for the contract (input/output shapes, `FailedRecord` semantics).
+
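One way to picture the `FailedRecord` idea (dropped records are surfaced rather than silently lost) is an adapter that tags inputs with deterministic IDs and diffs them against the output. This is a sketch under assumptions, not the contract of `NddAdapter.run_workflow()`: the `FailedRecord` fields, `SketchRunResult`, `make_record_id`, and the list-of-dicts record shape are all hypothetical, and the real adapter works on DataFrames.

```python
import uuid
from dataclasses import dataclass
from typing import Callable

Record = dict[str, str]


@dataclass(frozen=True)
class FailedRecord:
    """Hypothetical shape; the engine's real class may carry different fields."""

    record_id: str
    reason: str


@dataclass(frozen=True)
class SketchRunResult:
    records: list[Record]
    failed_records: list[FailedRecord]


def make_record_id(text: str) -> str:
    """uuid5 is deterministic, so the same text always yields the same ID."""
    return str(uuid.uuid5(uuid.NAMESPACE_OID, text))


def run_workflow_sketch(records: list[Record], workflow: Callable[[list[Record]], list[Record]]) -> SketchRunResult:
    """Run a workflow and report every input record that did not come back."""
    tagged = [{**record, "_record_id": make_record_id(record["text"])} for record in records]
    output = workflow(tagged)
    returned_ids = {record["_record_id"] for record in output}
    failed = [
        FailedRecord(record["_record_id"], "dropped by workflow")
        for record in tagged
        if record["_record_id"] not in returned_ids
    ]
    return SketchRunResult(records=output, failed_records=failed)


def drop_empty_text(rows: list[Record]) -> list[Record]:
    """Toy workflow that silently drops rows with empty text."""
    return [row for row in rows if row["text"]]


run_result = run_workflow_sketch([{"text": "Alice lives in Paris."}, {"text": ""}], drop_empty_text)
```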
+## Prompt Conventions
+
+NDD prompts are inline triple-quoted strings in the workflow file that uses them; there is no separate registry. For DataFrame column references inside templates, use `_jinja()`; for dynamic prompt values, use `substitute_placeholders()`. See each function's docstring for details.
+
+## Structural Invariants
+
+Code conventions enforced in review (future-annotations import, absolute imports, type annotations, SPDX headers, column-name constants) live in [STYLEGUIDE.md](STYLEGUIDE.md).
+
+One pipeline-specific fact worth knowing: `COL_TEXT` is the internal name for the input text column; it's renamed to the user's original column name in final output.
+
+## What NOT To Do
+
+- **Don't duplicate the `Rewrite` → `EvaluationCriteria` mapping** when production code starts from a `Rewrite`; route it through `Rewrite.evaluation`.
+- **Don't execute DataDesigner workflows directly** — call `DataDesigner.create()` / `.preview()` only via `NddAdapter.run_workflow()`. Declaring column configs (`LLMStructuredColumnConfig`, etc.) is fine.
+- **Don't use string literals for column names** — use `COL_*` constants from `engine/constants.py`
+- **Don't add a domain to only one supplement map** — see `engine/rewrite/domain_classification.py` for the sync invariant
+- **Don't hardcode `gliner_threshold`** — it belongs in `Detect` config (default 0.3)
+
+## Development
+
+```bash
+make test          # run all tests
+make bootstrap     # install dev dependencies
+make format        # ruff format + sort imports
+make format-check  # read-only lint check (used in CI)
+make typecheck     # ty type check (advisory)
+make docs-serve    # local MkDocs server at http://127.0.0.1:8000
+```
+
+For contributor workflow and branch naming see [CONTRIBUTING.md](CONTRIBUTING.md).
+For code style and naming conventions see [STYLEGUIDE.md](STYLEGUIDE.md).
@@ -0,0 +1,8 @@
+<!-- SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. -->
+<!-- SPDX-License-Identifier: Apache-2.0 -->
+
+# Claude Code instructions
+
+Canonical agent instructions live in [AGENTS.md](AGENTS.md).
+
+@AGENTS.md
@@ -208,6 +208,17 @@ The `main` branch has the following protections:
 - All `src` and `tests` files: `@NVIDIA-NeMo/anonymizer-reviewers`
 - All remaining files (`pyproject.toml`, `uv.lock`, `SECURITY.md`, `LICENSE`, `.github/`, etc.): `@NVIDIA-NeMo/anonymizer-maintainers`
 
+### Agent-Assisted Development
+
+When automating edits with coding agents (IDE assistants, CLI tools, or hosted models), follow the standard [Pull Request Process](#pull-request-process) plus these additions:
+
+1. **For non-trivial changes, draft a plan first.** A change is non-trivial if it spans more than one of the `config` / `engine` / `interface` subsystems, introduces a new public API, or modifies an invariant called out in [AGENTS.md](AGENTS.md) or [STYLEGUIDE.md](STYLEGUIDE.md).
+   - Write a markdown file detailing the approach, trade-offs considered, affected subsystems, and delivery strategy — enough for reviewers to evaluate the design before implementation begins. (Have the agent draft it; review and refine before submitting.)
+   - Save it at `plans/<issue-number>/<short-name>.md` and submit it as its own PR for review.
+   - Once the plan is approved, implement it in a follow-up PR.
+
+2. **Implement following [AGENTS.md](AGENTS.md) and [STYLEGUIDE.md](STYLEGUIDE.md).** Both capture pipeline structure, naming conventions, and invariants that ruff and ty cannot enforce. Implementers, human or agentic, should read both before making non-trivial changes.
+
 ## Issues and Discussions
 
 ### Issue Templates

@@ -0,0 +1,236 @@
+<!-- SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. -->
+<!-- SPDX-License-Identifier: Apache-2.0 -->
+
+# Style Guide
+
+Code and documentation conventions for NeMo Anonymizer that ruff and ty cannot enforce. Architecture boundaries and agent workflow rules live in [AGENTS.md](AGENTS.md). Read before adding a new module, workflow, or config class.
+
+NeMo Anonymizer wraps [DataDesigner](https://github.com/NVIDIA-NeMo/DataDesigner) (NDD) for LLM column generation. References to NDD below mean that library.
+
+For architecture and pipeline identity, see [AGENTS.md](AGENTS.md).
+For contribution workflow and branch naming, see [CONTRIBUTING.md](CONTRIBUTING.md).
+
+---
+
+## Pydantic vs Dataclasses
+
+**Pydantic** for config, validation, and serialization. **Dataclasses** for simple typed containers in the engine.
+
+| Need | Use |
+|------|-----|
+| User-facing config, validation, JSON schema | `BaseModel` |
+| Private/internal frozen value object (e.g. `WorkflowRunResult`, `_RiskToleranceBundle`) | `@dataclass(frozen=True)` |
+
+```python
+# Config — Pydantic
+class Detect(BaseModel):
+    gliner_threshold: float = Field(default=0.3, ge=0.0, le=1.0)
+
+# Internal result — dataclass
+@dataclass(frozen=True)
+class WorkflowRunResult:
+    dataframe: pd.DataFrame
+    failed_records: list[FailedRecord]
+```
+
+Use `Field()` only when you need constraints (`ge`, `le`), descriptions, or `default_factory`. Use bare defaults for simple flags and strings.
+
+---
+
+## Error Handling
+
+Wrap exceptions from NDD and other third-party calls at module boundaries into canonical types from `interface/errors.py`. Callers should never see raw NDD exceptions.
+
+Preserve the traceback:
+
+```python
+# Good
+try:
+    run_results = self._data_designer.create(...)
+except Exception as exc:
+    raise AnonymizerWorkflowError(f"Workflow failed: {exc}") from exc
+
+# Bad - drops the original error message and the explicit `from exc` cause
+except Exception as exc:
+    raise AnonymizerWorkflowError("Workflow failed")
+```
+
+Don't use defensive `try/except` on trusted internal calls that shouldn't fail — only catch at module boundaries. `RewriteWorkflow._run_final_judge` is the intentional exception: it's explicitly non-critical and catches broadly, logging with `exc_info=True` and substituting safe defaults.
+
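For the intentionally non-critical case mentioned above, the pattern might look roughly like this. The function name, the shape of the judge result, and the contents of `SAFE_JUDGE_DEFAULTS` are assumptions made for this sketch; only the behavior (catch broadly, log with `exc_info=True`, substitute safe defaults) comes from the guide.

```python
import logging
from typing import Callable

logger = logging.getLogger(__name__)

# Assumed defaults for the example; the real defaults live in the workflow.
SAFE_JUDGE_DEFAULTS: dict[str, object] = {"judge_evaluation": None, "needs_human_review": True}


def run_non_critical_judge(judge: Callable[[str], dict[str, object]], text: str) -> dict[str, object]:
    """Run an optional step; on any failure, log the traceback and fall back."""
    try:
        return judge(text)
    except Exception:
        # exc_info=True attaches the full traceback to the log record.
        logger.warning("Final judge failed; using safe defaults", exc_info=True)
        return dict(SAFE_JUDGE_DEFAULTS)


def raise_backend_error(text: str) -> dict[str, object]:
    """Toy judge that always fails, to exercise the fallback path."""
    raise RuntimeError(f"judge backend unavailable for {text!r}")


fallback = run_non_critical_judge(raise_backend_error, "some rewritten text")
```

Returning a copy of the defaults keeps callers from mutating the shared constant.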
+**Error messages** must identify the actual bad value. Use `!r` to make interpolated values unambiguous:
+
+```python
+# Good
+raise ValueError(f"Unsupported strategy: {strategy!r}")
+
+# Bad
+raise ValueError("Invalid strategy")
+```
+
+**No `assert` for validation in production/library code** — `assert` statements are stripped when Python runs with `-O`. Use `if/raise` instead. Pytest assertions in tests are fine.
+
+```python
+# Good
+if not isinstance(config, AnonymizerConfig):
+    raise TypeError(f"Expected AnonymizerConfig, got {type(config)!r}")
+
+# Bad
+assert isinstance(config, AnonymizerConfig)
+```
+
+---
+
+## Column Names
+
+All column names are constants in `engine/constants.py`. Never use string literals for column names.
+
+```python
+# Good
+df[COL_DETECTED_ENTITIES]
+
+# Bad
+df["_detected_entities"]
+```
+
+Internal (intermediate) columns are prefixed with `_`. User-facing output columns use clean names (`final_entities`, `utility_score`). The input text column is always `COL_TEXT` internally and renamed to the user's original column name in `Anonymizer._rename_output_columns()`.
+
+---
+
+## Prompt Construction
+
+**`_jinja(col, key=None)`** from `engine/constants.py` — use for **shared DataFrame column references** in NDD prompt templates. Never format shared column names directly into prompt strings; `_jinja` keeps them grep-able. Local Jinja loop variables (e.g. `entity.value` inside `{% for entity in ... %}`) are scoped to the prompt and don't need `_jinja`.
+
+```python
+# Good
+f"The text is: {_jinja(COL_TEXT)}"
+
+# Bad
+f"The text is: {{{{ {COL_TEXT} }}}}"
+```
+
+**`substitute_placeholders(template, replacements)`** from `engine/prompt_utils.py` — use for dynamic prompt values. The `<<PLACEHOLDER>>` format avoids collisions with Jinja2 syntax. Never use f-strings or `.format()` for prompt templates with dynamic values; single-pass substitution prevents a replacement value from being interpreted as a placeholder.
+
+Prompts live as inline triple-quoted strings in the workflow file that uses them. There is no separate prompt registry.
+
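To make the single-pass point concrete, here is a minimal sketch of what helpers like these could look like. It is not the code in `engine/constants.py` or `engine/prompt_utils.py`: `jinja_ref` and `substitute_once` are hypothetical names, the `{{ col.key }}` form for the `key` argument is a guess, and the `COL_TEXT` value is made up.

```python
import re
from typing import Optional

PLACEHOLDER_PATTERN = re.compile(r"<<([A-Z0-9_]+)>>")


def jinja_ref(col: str, key: Optional[str] = None) -> str:
    """Build a Jinja2 reference such as {{ col }} or {{ col.key }} (assumed form)."""
    target = col if key is None else f"{col}.{key}"
    return "{{ " + target + " }}"


def substitute_once(template: str, replacements: dict[str, str]) -> str:
    """Replace <<NAME>> markers in a single pass; unknown markers are left alone."""
    # re.sub scans the original template only, so text inserted by one
    # replacement is never re-scanned for further placeholders.
    return PLACEHOLDER_PATTERN.sub(lambda m: replacements.get(m.group(1), m.group(0)), template)


COL_TEXT = "_text"  # hypothetical constant value for the example
template = "Domain: <<DOMAIN>>\nText: " + jinja_ref(COL_TEXT)
prompt = substitute_once(template, {"DOMAIN": "medical <<TEXT>>", "TEXT": "never inserted"})
```

The replacement value for `DOMAIN` contains `<<TEXT>>`, yet it survives verbatim in `prompt`. Chained `str.replace` calls would have expanded it on a later pass.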
+---
+
+## Type Annotations
+
+Type annotations are required on all functions, methods, and class attributes, including in tests.
+
+Use `TYPE_CHECKING` blocks for imports needed *only* in type annotations. This prevents circular imports and avoids loading heavy libraries at import time:
+
+```python
+from typing import TYPE_CHECKING
+
+if TYPE_CHECKING:
+    import pandas as pd
+```
+
+If a module uses `pandas` at runtime (calls `pd.DataFrame`, indexes a DataFrame in a function body, etc.), import it at the top level. A name imported only under `TYPE_CHECKING` is not bound at runtime, so referencing it in executed code raises `NameError`. `pandas` is expensive to import, so keep top-level imports of it limited to modules that genuinely need it.
+
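A small demonstration of the runtime behavior described above, using `collections.abc.Sequence` as a lightweight stand-in for `pandas` so the example has no third-party dependency. The function names are made up for this note, and the annotation is quoted here only so the snippet stands alone; in the repository, `from __future__ import annotations` has the same effect.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen by type checkers only; this branch never executes at runtime.
    from collections.abc import Sequence  # stand-in for a heavy import such as pandas


def sum_values(values: "Sequence[int]") -> int:
    """Annotation-only use: fine, because the annotation is never evaluated."""
    return sum(values)


def check_is_sequence(values: object) -> bool:
    """Runtime use: fails, because the name Sequence was never bound."""
    return isinstance(values, Sequence)


total = sum_values([1, 2, 3])

try:
    check_is_sequence([1, 2, 3])
    runtime_error = None
except NameError as exc:
    runtime_error = exc
```

The failure only appears when `check_is_sequence` is actually called, which is why this kind of mistake can slip past import-time smoke tests.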
+---
+
+## Import Style
+
+- **ALWAYS** use absolute imports, never relative imports (enforced by `TID`)
+- Place imports at module level, not inside functions (exception: when a deferred import is unavoidable for performance reasons)
+- Import sorting is handled by `ruff`'s `isort` — imports should be grouped and sorted:
+  1. Standard library imports
+  2. Third-party imports
+  3. First-party imports (`anonymizer`)
+- Use standard import conventions (enforced by `ICN`)
+
+```python
+# Good
+from anonymizer.config.anonymizer_config import AnonymizerConfig
+
+# Bad - relative import (will cause linter errors)
+from .anonymizer_config import AnonymizerConfig
+
+# Good - imports at module level
+from pathlib import Path
+
+def process_file(filename: str) -> None:
+    path = Path(filename)
+
+# Bad - import inside function
+def process_file(filename: str) -> None:
+    from pathlib import Path
+    path = Path(filename)
+```
+
+---
+
+## Code Organization
+
+- When adding new symbols, place public functions and methods before private (`_`-prefixed) ones within a module or class
+- Define helpers at module or class level — avoid nested functions. Nested functions hide logic, make testing harder, and complicate stack traces. The only acceptable use is a closure that genuinely needs to capture local state.
+
+---
+
+## Naming
+
+- Functions and variables: `snake_case`
+- Classes: `PascalCase`
+- Constants: `UPPER_SNAKE_CASE`
+- Function names start with a verb: `run_workflow`, `build_entity_id`, not `entity_id` or `workflow`
+
+---
+
+## Comments
+
+Only add a comment when the WHY is non-obvious — a hidden constraint, a subtle invariant, a workaround for a specific bug. Don't narrate what the code already says:
+
+```python
+# Good — explains a non-obvious invariant
+# uuid5 is deterministic so input/output IDs match for missing-record tracking.
+
+# Bad — narrates what the code does
+# Loop through the records and append to list
+for record in records:
+    results.append(record)
+```
+
+---
+
+## Future Annotations
+
+Every Python file must include `from __future__ import annotations` after the license header. This defers annotation evaluation, enables forward references, and keeps behavior consistent across the codebase.
+
+---
+
+## License Headers
+
+Every Python and Markdown file requires an SPDX header at the top (enforced by `tools/codestyle/copyright_fixer.py --check`, run via `make copyright-check`). Files listed in `.copyrightignore` are exempt.
+
+---
+
+## Docstrings
+
+Google style (`Args:`, `Returns:`, `Raises:`). Public API classes and methods get docstrings; private helpers (`_`-prefixed) only when the logic is non-obvious. Don't restate the signature; explain what the code does and why, not what the type annotations already say.
+
+---
+
+## Design Principles
+
+**DRY**
+
+- Extract shared logic into pure helper functions rather than duplicating across similar call sites
+- Rule of thumb: tolerate duplication until the third occurrence, then extract
+
+**KISS**
+
+- Prefer flat, obvious code over clever abstractions: two similar lines are better than a premature helper
+- When in doubt between DRY and KISS, favor readability over deduplication
+
+**YAGNI**
+
+- Don't add parameters, config, or abstraction layers for hypothetical future use cases
+- Don't generalize until the third caller appears
+
+**SOLID**
+
+- Wrap third-party exceptions at module boundaries — callers depend on canonical error types, not leaked internals
+- Use `Protocol` for contracts between layers
+- One function, one job — separate logic from I/O
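
As an illustration of "use `Protocol` for contracts between layers", here is a minimal sketch. `WorkflowRunner`, `UppercaseRunner`, and `run_all` are hypothetical names invented for this note and do not correspond to classes in the repository.

```python
from typing import Protocol


class WorkflowRunner(Protocol):
    """Hypothetical contract that an interface layer could depend on."""

    def run(self, texts: list[str]) -> list[str]: ...


class UppercaseRunner:
    """Toy engine-side implementation; it does not inherit from the Protocol."""

    def run(self, texts: list[str]) -> list[str]:
        return [text.upper() for text in texts]


def run_all(runner: WorkflowRunner, texts: list[str]) -> list[str]:
    """Depends only on the contract, not on any concrete engine class."""
    return runner.run(texts)


output = run_all(UppercaseRunner(), ["alice", "bob"])
```

Structural typing means the concrete class satisfies the contract by shape alone, so the calling layer never needs to import it.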