Commit ff05e72

sjarmak and claude committed
feat: SDLC variance 150/150 — promote 3 final MCP code review runs
All 150 SDLC tasks now have 3+ paired passes (baseline + MCP). The last 3 gap tasks (calcom/curl/terraform code review) succeeded after removing _install_sourcegraph_skill() which was causing AgentSetupTimeoutError on Daytona.

Results: calcom=0.35, curl=0.72, terraform=0.44 (MCP rewards)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 6778aca commit ff05e72

1,869 files changed, with 1,796,424 additions and 143,193 deletions


docs/official_results/README.md

Lines changed: 195 additions & 62 deletions
Large diffs are not rendered by default.

docs/official_results/audits/ccb_build_haiku_20260227_034711--baseline-local-direct--servo-scrollend-event-feat-001.json

Lines changed: 3 additions & 3 deletions
@@ -4428,7 +4428,7 @@
   },
   "provenance": {
     "checksums": {
-      "result_json_sha256": "e75eb5e3a5984ff24f36c2ba50b5fa83d5c5b840c7dff6f5a3acc464a1673c25",
+      "result_json_sha256": "121cf9077e3867257eaf100fa009cd02f89cbbc8991bf23bbf531121787065c8",
       "trajectory_sha256": "0c7bb4696c2fd3e08f76799c0a923ec7247b4d9da8c802ce18a39c144738eba9",
       "transcript_sha256": "9afc4e9aa750ded2ad65783d258af72d2eff859a1b5191cbe0eae4dc14f47c2a"
     },
@@ -4444,8 +4444,8 @@
     }
   },
   "score": {
-    "reward": 0.0,
-    "status": "failed",
+    "reward": 0.5,
+    "status": "passed",
     "timed_out": false
   }
 }
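
In the hunks above, the `result_json_sha256` checksum changes together with the score, while the trajectory and transcript digests stay the same. A minimal sketch of how such a recorded digest could be re-checked is shown here. It assumes the value is the hex SHA-256 of the raw bytes of the result file, which the diff does not confirm; the function names and the `result_path` argument are hypothetical and not part of the repository.

```python
import hashlib
import json
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(audit: dict, result_path: Path) -> bool:
    """Compare the recorded digest in an audit record with a fresh one.

    Assumption: the audit record has the shape seen in the diff,
    i.e. audit["provenance"]["checksums"]["result_json_sha256"].
    """
    recorded = audit["provenance"]["checksums"]["result_json_sha256"]
    return sha256_of_file(result_path) == recorded


def load_audit(audit_path: Path) -> dict:
    """Load an audit JSON file (hypothetical helper)."""
    return json.loads(audit_path.read_text(encoding="utf-8"))
```

Under that assumption, a mismatch after a re-scored run would indicate that the audit record was not regenerated alongside the result file.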

docs/official_results/audits/ccb_debug_haiku_20260301_230240--baseline-local-direct--teleport-ssh-regression-prove-001.json

Lines changed: 3562 additions & 0 deletions
Large diffs are not rendered by default.

docs/official_results/audits/ccb_debug_haiku_20260301_230240--baseline-local-direct--tutanota-search-regression-prove-001.json

Lines changed: 2709 additions & 0 deletions
Large diffs are not rendered by default.

docs/official_results/audits/ccb_debug_haiku_20260301_230240--mcp-remote-direct--mcp_teleport-ssh-regression-prove-001_pyxplw.json

Lines changed: 1783 additions & 0 deletions
Large diffs are not rendered by default.

docs/official_results/audits/ccb_debug_haiku_20260301_230240--mcp-remote-direct--mcp_tutanota-search-regression-prove-001_yb9gom.json

Lines changed: 1626 additions & 0 deletions
Large diffs are not rendered by default.

docs/official_results/audits/ccb_debug_haiku_20260302_004746--baseline-local-direct--teleport-ssh-regression-prove-001.json

Lines changed: 4973 additions & 0 deletions
Large diffs are not rendered by default.

docs/official_results/audits/ccb_debug_haiku_20260302_004746--baseline-local-direct--tutanota-search-regression-prove-001.json

Lines changed: 3116 additions & 0 deletions
Large diffs are not rendered by default.

docs/official_results/audits/ccb_debug_haiku_20260302_004746--mcp-remote-direct--mcp_teleport-ssh-regression-prove-001_fzxnag.json

Lines changed: 1914 additions & 0 deletions
Large diffs are not rendered by default.

docs/official_results/audits/ccb_debug_haiku_20260302_004746--mcp-remote-direct--mcp_tutanota-search-regression-prove-001_s03ep4.json

Lines changed: 2417 additions & 0 deletions
Large diffs are not rendered by default.
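
The audit filenames listed above appear to follow a `<run id>--<config>--<task>.json` pattern, with MCP task names carrying an `mcp_` prefix and a six-character suffix. The sketch below groups such names by task and counts distinct runs per arm. The naming rules are inferred only from the filenames shown in this commit, and `parse_audit_name` and `runs_per_task` are hypothetical helpers, not code from the repository.

```python
import re
from collections import defaultdict
from pathlib import PurePosixPath

# Assumed from the listed filenames: MCP tasks look like "mcp_<task>_<6 chars>".
_MCP_TASK = re.compile(r"^mcp_(?P<task>.+)_[a-z0-9]{6}$")


def parse_audit_name(path: str) -> tuple[str, str, str]:
    """Split an audit filename into (run_id, config, normalized task)."""
    stem = PurePosixPath(path).name.removesuffix(".json")
    run_id, config, task = stem.split("--", 2)
    match = _MCP_TASK.match(task)
    if match:
        # Drop the MCP prefix and suffix so both arms share one task key.
        task = match.group("task")
    return run_id, config, task


def runs_per_task(paths: list[str]) -> dict[str, dict[str, int]]:
    """Count distinct run ids per task, separately for baseline and MCP."""
    runs: dict[str, dict[str, set[str]]] = defaultdict(lambda: defaultdict(set))
    for path in paths:
        run_id, config, task = parse_audit_name(path)
        arm = "mcp" if config.startswith("mcp") else "baseline"
        runs[task][arm].add(run_id)
    return {
        task: {arm: len(ids) for arm, ids in arms.items()}
        for task, arms in runs.items()
    }
```

Applied to the eight `ccb_debug_haiku_*` files in this listing, this would report two baseline runs and two MCP runs for each of the teleport and tutanota tasks. It counts runs present, not passes; whether a run passed would still have to be read from each record's `score` block.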
