Commit eddefaf

docs: update README
1 parent 5243f20 commit eddefaf

1 file changed

Lines changed: 74 additions & 18 deletions

README.md

@@ -1,24 +1,42 @@
 # PySymBench
-Infrastructure for **model comparison and evaluation in symbolic execution workflows**.

-This project is a **local web application** designed to compare symbolic execution results of an uploaded trained model (in `.onnx` format) on a selected dataset with a **baseline symbolic execution approach (non-AI)**.
+Infrastructure for **AI model comparison and evaluation in symbolic execution workflows**.

-The system uses **PySymGym tools** to run symbolic execution on the dataset and evaluate the results. After execution completes, the results are sent to the **email address you provide**.
+PySymBench is a **local web application** for evaluating ONNX models against a non-AI baseline symbolic execution strategy. Experiments run inside Docker using [PySymGym](https://github.com/PySymGym/PySymGym) tools on a fixed dataset; results are emailed back to the user and (when published) saved to a leaderboard.
+
+Three target languages are supported for the dataset: **C#**, **Java**, and **C++**.

 ## Features

-- **Run Experiment** — upload an ONNX model, select test methods from the dataset, and compare it against the baseline strategy. Results (coverage, errors, timing) are delivered to your inbox.
-- **Model Ranking** — a public leaderboard of all published experiments, sorted by mean coverage. Shows per-experiment metrics: mean/median coverage, total tests, errors, and runtime.
-- **Publish Experiment** — submit a model to the ranking leaderboard. The experiment runs in Docker, computes metrics, and saves the result to the database. Supports cancellation while in progress.
+- **Run Experiment** — upload an ONNX model, choose a target language, select methods from the dataset, and compare the model against the baseline strategy. Coverage, errors and timing are emailed to you. Each running task can be cancelled via a one-click link in the confirmation email.
+- **Model Ranking** — a leaderboard of all completed experiments per language (with an aggregated view across languages), sorted by mean coverage. Per-experiment metrics include mean/median coverage, total tests, errors, runtime, and coverage percentage.
+- **Pairwise Comparison** — pick any two experiments from the ranking and produce side-by-side comparison artifacts (PDFs) downloadable individually or as a single zip.
+- **Model Interface docs** — page that describes the ONNX input/output specification required to plug a model into PySymGym.
+
+### Routes

 The frontend is a multi-page React SPA using `react-router-dom`:

 | Route | Page |
 |---|---|
 | `/` | Home — navigation hub |
 | `/experiment` | Run Experiment form |
-| `/ranking` | Model Ranking leaderboard |
-| `/ranking/publish` | Publish Experiment form |
+| `/ranking` | Model Ranking leaderboard + pairwise comparison |
+| `/interface` | Model Interface specification |
+
+### Backend API
+
+| Method | Path | Purpose |
+|---|---|---|
+| `POST` | `/api/upload` | Submit a new experiment (multipart: ONNX file, `email`, `language`, `experiment`) |
+| `GET` | `/api/status/{task_uid}` | Celery task state |
+| `POST` | `/api/cancel/{task_uid}` | Cancel a running experiment |
+| `GET` | `/api/cancel/{task_uid}?token=...` | One-click cancellation link sent by email |
+| `GET` | `/api/ranking?language=csharp\|java\|cpp\|all` | Leaderboard entries |
+| `POST` | `/api/compare` | Start a pairwise comparison between two experiment IDs |
+| `GET` | `/api/compare/{uid}/status` | Comparison task state and result file list |
+| `GET` | `/api/compare/{uid}/file/{name}` | Stream a single comparison artifact |
+| `GET` | `/api/compare/{uid}/files.zip` | Download all comparison PDFs as a zip |

 # Installation

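The `GET` endpoints listed in the Backend API table can be addressed from a script. The following is a minimal sketch and is not taken from the repository: it only builds URLs for the documented paths, assuming the API is served at `http://localhost:8000`. The helper names (`status_url`, `ranking_url`, `compare_file_url`) are hypothetical.

```python
from urllib.parse import quote, urlencode

# Assumption: the FastAPI app is reachable at this address.
API_BASE = "http://localhost:8000"

# Values of the `language` query parameter listed in the API table.
RANKING_LANGUAGES = ("csharp", "java", "cpp", "all")


def status_url(task_uid: str, base: str = API_BASE) -> str:
    """Build the URL for GET /api/status/{task_uid}."""
    return f"{base.rstrip('/')}/api/status/{quote(task_uid, safe='')}"


def ranking_url(language: str = "all", base: str = API_BASE) -> str:
    """Build the URL for GET /api/ranking?language=..."""
    if language not in RANKING_LANGUAGES:
        raise ValueError(f"unsupported language: {language!r}")
    return f"{base.rstrip('/')}/api/ranking?{urlencode({'language': language})}"


def compare_file_url(uid: str, name: str, base: str = API_BASE) -> str:
    """Build the URL for GET /api/compare/{uid}/file/{name}."""
    # Percent-encode both path segments so file names with spaces stay valid.
    return (
        f"{base.rstrip('/')}/api/compare/"
        f"{quote(uid, safe='')}/file/{quote(name, safe='')}"
    )
```

The resulting strings can be passed to any HTTP client. The response bodies are not described in this diff, so the sketch stops at URL construction.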
@@ -35,7 +53,7 @@ EMAIL=your_email@gmail.com
 APP_PASSWORD=your_app_password
 ```

-`EMAIL` — your Gmail address
+`EMAIL` — your Gmail address
 `APP_PASSWORD` — your Gmail **App Password** (not your regular account password)

 ---
@@ -48,7 +66,7 @@ The ranking leaderboard stores experiment results in a PostgreSQL database. Add
 DB_URL=postgresql://user:password@localhost:5432/pysymbench
 ```

-The required table is created automatically on server startup. You can run a local PostgreSQL instance via Docker:
+The required tables are created automatically on server startup. You can run a local PostgreSQL instance via Docker:

 ```
 docker run --name postgres-pysymbench -e POSTGRES_USER=user -e POSTGRES_PASSWORD=password \
@@ -57,9 +75,9 @@ docker run --name postgres-pysymbench -e POSTGRES_USER=user -e POSTGRES_PASSWORD

 ---

-## Object Storage (MinIO) — optional
+## Object Storage (MinIO)

-When publishing experiments to the ranking, the ONNX model and result artifacts can be stored in MinIO. Add the following to your `.env` file:
+Experiments store their ONNX model and result artifacts in MinIO; the pairwise comparison feature also reads artifacts from there. MinIO must be reachable — if it is not configured or unavailable, the task fails and the user is notified by email. Add the following to your `.env` file:

 ```
 MINIO_ENDPOINT=localhost:9000
@@ -69,7 +87,7 @@ MINIO_SECURE=false
 MINIO_BUCKET=pysymbench
 ```

-If not configured, artifact upload is skipped and only metrics are saved to the database. You can run a local MinIO instance via Docker:
+You can run a local MinIO instance via Docker:

 ```
 docker run --name minio -p 9000:9000 -p 9001:9001 \
@@ -91,6 +109,17 @@ All services that connect to Redis — the FastAPI app and every Celery worker

 ---

+## URLs for email links
+
+Cancellation links sent by email are absolute, so the backend needs to know its own public URL and the URL of the frontend. Defaults match a local setup; override them in `.env` if the app is reachable elsewhere:
+
+```
+BASE_URL=http://localhost:8000 # base URL of the FastAPI app
+FRONTEND_URL=http://localhost:5173 # base URL of the React frontend
+```
+
+---
+
 ## Backend Setup

 1. Install **Python 3.14** and **Docker**, then install the project dependencies:
@@ -111,10 +140,11 @@ python -m backend.launch_service.app_setup
 docker run --name redis-for-celery -p 6379:6379 -d redis
 ```

-4. Start the **Celery worker** and the **application server**:
+4. Start the **Celery worker** and the **application server** (in separate terminals):

 ```
-celery -A backend.utils.task worker --loglevel=info && uvicorn backend.main:app
+celery -A backend.utils.task worker --loglevel=info
+uvicorn backend.main:app
 ```

 ---
@@ -128,7 +158,6 @@ celery -A backend.utils.task worker --loglevel=info && uvicorn backend.main:app
 ```
 cd frontend
 npm install
-npm install react-router-dom @types/react-router-dom
 ```

 3. Start the frontend development server:
@@ -148,7 +177,34 @@ npm run build
 | Package | Purpose |
 |---|---|
 | `react-router-dom` | Client-side routing between pages |
-| `@types/react-router-dom` | TypeScript types for react-router-dom |
-| `antd` | UI component library (forms, tables, buttons) |
+| `antd` | UI component library (forms, tables, buttons, modals) |
 | `tailwindcss` | Utility-first CSS framework |
 | `vite` | Build tool and dev server |
+
+---
+
+## Development
+
+### Python
+
+```
+ruff check . # Lint
+ruff check . --fix # Auto-fix
+ruff format . # Format
+pytest -v # Run tests
+```
+
+### Frontend
+
+```
+cd frontend
+npm run lint:fix # ESLint auto-fix
+npm run format # Prettier format
+npm run format:check # Check formatting without writing
+```
+
+## CI/CD
+
+GitHub Actions runs on push/PR:
+- **`linting.yml`** — ruff check + format, ESLint + Prettier
+- **`build_and_test.yml`** — builds Docker image, runs `pytest -v`
