[English](./README.md) | [简体中文](./README.zh-CN.md)

<p align="center">
  <img src="./assets/hero-banner.svg" alt="Task Bundle hero banner" width="100%" />
</p>

<p align="center"><strong>Turn AI coding runs into portable, replayable, benchmark-ready task bundles.</strong></p>
<p align="center">The missing middle layer between raw chat logs and heavyweight benchmark platforms.</p>
<p align="center">
  <a href="#quickstart"><strong>Quick Start</strong></a> ·
  <a href="#real-bundles"><strong>Real Output</strong></a> ·
  <a href="./docs/bundle-format.md"><strong>Bundle Format</strong></a> ·
  <a href="./ROADMAP.md"><strong>Roadmap</strong></a> ·
  <a href="./docs/branding.md"><strong>Brand Assets</strong></a>
</p>

[![CI](https://github.com/wimi321/task-bundle/actions/workflows/ci.yml/badge.svg)](https://github.com/wimi321/task-bundle/actions/workflows/ci.yml)
[![GitHub stars](https://img.shields.io/github/stars/wimi321/task-bundle?style=social)](https://github.com/wimi321/task-bundle/stargazers)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](./LICENSE)

Task Bundle is a TypeScript + Node.js CLI for teams building agents, evals, coding benchmarks, and reproducible AI workflows.

Package a task once, inspect it later, compare tools on the same starting point, and generate benchmark-style reports from real artifacts.

Why people star it:

- turn one AI coding run into a clean, shareable directory instead of a screenshot, transcript, or loose patch
- compare Codex, Claude Code, Cursor, or internal agents with real metadata, hashes, and outcome fields
It is intentionally not:

- a benchmark platform
- a token-by-token recorder

<a id="quickstart"></a>

## Quick Start

Run the repo against real example bundles in about a minute:

```bash
npm install
npm run build
npm run dev -- compare ./examples/hello-world-bundle ./examples/hello-world-bundle-claude
```

If you want the shortest possible proof that the project already works, this is it.

<a id="real-bundles"></a>

## See It On Real Bundles

Inspect a bundle:

```text
$ npm run dev -- inspect ./examples/hello-world-bundle
Task Bundle
-----------
Title: Fix greeting punctuation
Tool: codex
Model: gpt-5
Status: success
Score: 0.93
Workspace files: 1
Events: 3
```

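The inspect fields suggest a small, flat manifest shape. The interface below is a minimal sketch inferred from the output above; the names and types are assumptions, not the project's actual schema (that lives in [Bundle Format](./docs/bundle-format.md)):

```typescript
// Hypothetical manifest shape inferred from the inspect output above.
// Field names and types are assumptions, not the real bundle schema.
interface BundleManifest {
  title: string;
  tool: string;                   // e.g. "codex", "claude-code"
  model: string;                  // e.g. "gpt-5"
  status: "success" | "failure";
  score: number;                  // outcome score in [0, 1]
  workspaceFiles: number;         // files captured in the workspace snapshot
  events: number;                 // high-level run events, not token-by-token logs
}

const example: BundleManifest = {
  title: "Fix greeting punctuation",
  tool: "codex",
  model: "gpt-5",
  status: "success",
  score: 0.93,
  workspaceFiles: 1,
  events: 3,
};

console.log(`${example.tool} / ${example.model}: ${example.score}`);
```

A shape like this is what makes bundles diffable and machine-comparable, rather than free-form transcripts.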
Compare two tools on the same task:

```text
$ npm run dev -- compare ./examples/hello-world-bundle ./examples/hello-world-bundle-claude
Task Bundle Comparison
----------------------
Left tool: codex
Right tool: claude-code
Left score: 0.93
Right score: 0.89
Score delta: 0.04
Workspace file delta: 0
Event count delta: -1
```

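The deltas in that output are plain left-minus-right arithmetic over each bundle's summary fields. A sketch of that computation, using illustrative field names rather than the CLI's internal API:

```typescript
// Hypothetical delta computation mirroring the compare output above;
// the field names are illustrative, not the CLI's internal API.
interface RunSummary {
  score: number;
  workspaceFiles: number;
  events: number;
}

function compareRuns(left: RunSummary, right: RunSummary) {
  return {
    // Round to two places so float noise (0.93 - 0.89) prints as 0.04.
    scoreDelta: Number((left.score - right.score).toFixed(2)),
    workspaceFileDelta: left.workspaceFiles - right.workspaceFiles,
    eventCountDelta: left.events - right.events,
  };
}

const codexRun = { score: 0.93, workspaceFiles: 1, events: 3 };
const claudeRun = { score: 0.89, workspaceFiles: 1, events: 4 };
const delta = compareRuns(codexRun, claudeRun);
// Matches the sample output: score delta 0.04, file delta 0, event delta -1.
```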
Generate a benchmark-style summary from a directory of runs:

```text
$ npm run dev -- report ./examples --out ./dist/benchmark-report.md
Bundles: 2
Average score: 0.91

Ranking
1. Fix greeting punctuation | codex / gpt-5 | success | score 0.93
2. Fix greeting punctuation | claude-code / claude-sonnet-4 | success | score 0.89
```

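The report's numbers are straightforward aggregates: a mean over bundle scores and a best-first sort for the ranking. A minimal sketch under the same assumed field names, not the CLI's actual report code:

```typescript
// Hypothetical aggregation mirroring the report output above.
interface ScoredBundle {
  title: string;
  tool: string;
  score: number;
}

function summarize(bundles: ScoredBundle[]) {
  const mean = bundles.reduce((sum, b) => sum + b.score, 0) / bundles.length;
  // Rank best-first; copy before sorting so the input array stays untouched.
  const ranking = [...bundles].sort((a, b) => b.score - a.score);
  return { averageScore: Number(mean.toFixed(2)), ranking };
}

const runs: ScoredBundle[] = [
  { title: "Fix greeting punctuation", tool: "claude-code", score: 0.89 },
  { title: "Fix greeting punctuation", tool: "codex", score: 0.93 },
];
const summary = summarize(runs);
// Average of 0.93 and 0.89 is 0.91, as in the sample report.
```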
## Why It Matters

Most AI coding work disappears into screenshots, transcripts, or one-off patches.