Added all the files required to run task generation agentic pipeline by saidul-islam98 · Pull Request #59 · VectorInstitute/automated_capability_evaluation

saidul-islam98 · 2026-02-06T21:35:53Z

PR Type

Feature

Short Description

Added the required files to run the task generation agentic pipeline. The instructions to run the pipeline can be found under:

src/task_generation/Instructions.md

Tests Added

None

kohankhaki

Thanks for the new agentic work. I have one question: is this going to replace Stage 3 in the agentic workflow, or does it introduce an alternative to Stage 3?

From the current repo structure, we have two paths:

- The schema-standard base pipeline (src/run_base_pipeline.py + src/base_stages/*, documented in src/schemas/GENERATION_PIPELINE_SCHEMAS.md).
- A legacy agentic debate path (src/agentic_* entrypoints) that uses older custom JSON structures.

For this PR, since the goal is to make the new agentic Stage 3 interchangeable with the current base Stage 3, the output contract needs to match the standardized schema exactly (Stage-3 layout, metadata linkage, hierarchy fields, ID conventions), so that Stages 4 and 5 can consume it without special handling.

Stage 3 should consume capabilities/<capabilities_tag>/<area_id>/capabilities.json in the standardized format.
To test this Stage-3 implementation, please use schema-standard Stage-2 inputs:

- either generate areas/capabilities via the standard pipeline (stages 0–2), or
- create custom areas/capabilities using the standard dataclasses (Domain, Area, Capability) and save_capabilities.

kohankhaki · 2026-02-11T23:14:34Z

src/task_generation/runner.py

+        )
+
+        # chapter path for output
+        chapter_out_path = (


Output path is non-standard (tasks/<tag>/<book>/<chapter>/tasks.json). Standard schema requires tasks/<task_tag>/<area_id>/<capability_id>/tasks.json. Please align path structure to schema contract.
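
A minimal sketch of building the standard output path, in case it helps. The helper name `stage3_tasks_path`, the `output_root` argument, and the example ID values are hypothetical and not taken from the repo; only the `tasks/<task_tag>/<area_id>/<capability_id>/tasks.json` layout comes from the schema contract mentioned above.

```python
from pathlib import Path


def stage3_tasks_path(
    output_root: Path, task_tag: str, area_id: str, capability_id: str
) -> Path:
    """Return <output_root>/tasks/<task_tag>/<area_id>/<capability_id>/tasks.json.

    Hypothetical helper: the function name and arguments are assumptions,
    only the directory layout follows the review comment.
    """
    parts = (
        ("task_tag", task_tag),
        ("area_id", area_id),
        ("capability_id", capability_id),
    )
    for name, value in parts:
        # Each component must be exactly one directory level.
        if not value or "/" in value:
            raise ValueError(
                f"{name} must be a non-empty single path component, got {value!r}"
            )
    return output_root / "tasks" / task_tag / area_id / capability_id / "tasks.json"
```

Something like this could replace the chapter-based `chapter_out_path` construction, with `area_id` and `capability_id` read from the Stage-2 capability being processed.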

kohankhaki · 2026-02-11T23:15:21Z

src/task_generation/runner.py

+    def make_verifier_agent() -> VerifierAgent:
+        return VerifierAgent(name="Verifier", model_client=verifier_client)
+
+    for chapter_idx, chapter_path in enumerate(chapter_files):


Stage-3 standard flow should iterate over Stage-2 capabilities (area/capability hierarchy), not chapter files directly. This currently bypasses standardized Stage-2 -> Stage-3 contract and breaks immediate Stage-4/5 interoperability.
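
As a rough illustration of the iteration order I have in mind, the sketch below walks the Stage-2 output directory and yields one `capabilities.json` per area. The function name `iter_stage2_capability_files` and the `output_root` argument are hypothetical; the sketch only discovers files and deliberately does not parse them, since loading should go through the standard schema loaders rather than anything assumed here.

```python
from pathlib import Path
from typing import Iterator, Tuple


def iter_stage2_capability_files(
    output_root: Path, capabilities_tag: str
) -> Iterator[Tuple[str, Path]]:
    """Yield (area_id, path) for each Stage-2 capabilities file.

    Assumes the layout capabilities/<capabilities_tag>/<area_id>/capabilities.json
    under a hypothetical output_root directory.
    """
    base = output_root / "capabilities" / capabilities_tag
    # Sort so that the processing order is deterministic across runs.
    for area_dir in sorted(p for p in base.iterdir() if p.is_dir()):
        cap_file = area_dir / "capabilities.json"
        if cap_file.is_file():
            yield area_dir.name, cap_file
```

The outer loop in the runner would then be over these files (and the capabilities inside each), with chapter files treated as source material for a capability rather than as the unit of iteration.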

kohankhaki · 2026-02-11T23:15:45Z

src/task_generation/runner.py

+                dedup_cfg.get("embedding_model", "text-embedding-3-small")
+            )
+            keep_policy = str(dedup_cfg.get("keep_policy", "first"))
+            cache_embeddings = bool(dedup_cfg.get("cache_embeddings", True))


input_stage_tag is set to None. For Stage-3 this must reference the Stage-2 capabilities tag for provenance/resume compatibility. Please pass the actual input tag.
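
One way to guard against this is to fail fast when the tag is missing. The sketch below is only an illustration: `build_stage3_metadata` and the `output_stage_tag` key are hypothetical names, and the real metadata object in the schema may look different. The only field taken from this review is `input_stage_tag`.

```python
from typing import Dict, Optional


def build_stage3_metadata(
    output_stage_tag: str, input_stage_tag: Optional[str]
) -> Dict[str, str]:
    """Build a plain metadata dict and refuse a missing Stage-2 tag.

    Hypothetical helper: the dict shape is an assumption, not the schema's
    actual metadata class.
    """
    if not input_stage_tag:
        raise ValueError(
            "Stage 3 needs the Stage-2 capabilities tag as input_stage_tag"
        )
    return {
        "input_stage_tag": input_stage_tag,
        "output_stage_tag": output_stage_tag,
    }
```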

kohankhaki · 2026-02-11T23:23:03Z

src/task_generation/runner.py

+            )
+            continue
+
+        all_tasks: List[Task] = []


Tasks are being built with placeholder capability/area/domain fields (__placeholder__, *_placeholder).
Schema-compliant Task objects must include real hierarchy values from actual Capability inputs (capability/area/domain identifiers and names), not placeholders.
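
A small check like the following could catch leftover placeholders before tasks are saved. It is a sketch under assumptions: `find_placeholder_fields` is a hypothetical helper, and the field names in the usage example are illustrative rather than the schema's actual attribute names.

```python
from typing import Dict, List


def find_placeholder_fields(fields: Dict[str, str]) -> List[str]:
    """Return the names of hierarchy fields that are empty or still placeholders.

    Matches both "__placeholder__" and "*_placeholder" style values by looking
    for the substring "placeholder" (case-insensitive).
    """
    return sorted(
        name
        for name, value in fields.items()
        if not value or "placeholder" in value.lower()
    )


# Usage with hypothetical field names and values:
bad = find_placeholder_fields(
    {
        "capability_id": "cap_000",
        "area_id": "__placeholder__",
        "domain_id": "domain_placeholder",
    }
)
# bad == ["area_id", "domain_id"]
```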

kohankhaki · 2026-02-11T23:25:32Z

src/task_generation/dedup_utils.py

+        meta["chapter_id"] = meta.get("chapter_id") or chapter_id
+        t.generation_metadata = meta
+
+        t.task_id = f"{prefix}__task_{i:03d}"


Dedup rewrites task_id to <chapter_id>__task_###, which breaks standardized task ID format and scope expectations.
Please keep/assign schema-standard task_### IDs in capability scope after deduplication as well.
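
For example, the surviving tasks of one capability could be renumbered after deduplication roughly as below. This is a sketch, not the required implementation: tasks are represented as plain dicts for brevity, `renumber_tasks` is a hypothetical name, and the starting index of 0 is an assumption that should follow whatever the schema specifies.

```python
from typing import Any, Dict, List


def renumber_tasks(
    tasks: List[Dict[str, Any]], start: int = 0
) -> List[Dict[str, Any]]:
    """Return copies of the tasks with task_id reset to task_### in list order.

    Intended to be called once per capability, on the tasks that survive
    deduplication, so IDs stay contiguous within the capability scope.
    """
    return [
        {**task, "task_id": f"task_{i:03d}"}
        for i, task in enumerate(tasks, start=start)
    ]


# Usage: two tasks left after dedup, carrying chapter-prefixed IDs.
survivors = [{"task_id": "ch01__task_000"}, {"task_id": "ch01__task_002"}]
renumbered = renumber_tasks(survivors)
# [t["task_id"] for t in renumbered] == ["task_000", "task_001"]
```

If the chapter origin is still useful, it can stay in `generation_metadata` (as the `chapter_id` line in the diff already does) instead of being encoded in the ID.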

Added all the files required to run task generation agentic pipeline

8d3810f

saidul-islam98 requested review from afkanpour and kohankhaki February 6, 2026 21:35

kohankhaki requested changes Feb 11, 2026

saidul-islam98 wants to merge 1 commit into main from agentic_task_gen_pipeline_v1
