Commit dffb8a4

README updated
1 parent b71ecde commit dffb8a4

4 files changed

Lines changed: 100 additions & 3 deletions

README.md

Lines changed: 2 additions & 2 deletions
@@ -42,14 +42,14 @@ The package doesn't have the dataset, it is stored on our [HuggingFace page](htt
 
 ## Latest News 📣
 
+* [2026/03] Fully functional CLI commands for inference and evaluation. See this [guide](./llmsql/_cli/README.md).
+
 * [2026/03] Added support for API inference, for now only for OpenAI-compatable APIs, see [`inference_api()` function](./llmsql/inference/inference_api.py#inference_api)
 
 * [2026/03] The page now contains first version of [leaderboard](https://llmsql.github.io/llmsql-benchmark/#:~:text=%F0%9F%93%8A%20Leaderboard%20%E2%80%94%20Execution%20Accuracy%20%28EX)!
 
 * [2026/02] The new LLMSQL 2.0 version is out now! See the [dataset](https://huggingface.co/datasets/llmsql-bench/llmsql-2.0). The support is already added with the `version` parameter to each `inference` function.
 
-* [2025/12] Evaluation class converted to function see [new `evaluate(...)` function](./llmsql/evaluation/evaluate.py#evaluate)
-
 
 
 ## Usage Recommendations

examples/test_output.jsonl

Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
+{"question_id": 15, "completion": "SELECT \"Launched\" FROM \"Table\" WHERE \"Laid down\" = \"September 1, 1964\";"}
+{"question_id": 48, "completion": "SELECT \"Tone (Latvian notation: /~/ - level, /^/ - broken)\" FROM \"Table\" WHERE \"Translation\" = \"Gen.Sing. plague\";"}
+{"question_id": 4, "completion": "SELECT COUNT(\"School/Club Team\") FROM \"Table\" WHERE \"Player\" = \"Jalen Rose\";"}
+{"question_id": 21, "completion": "SELECT \"US Air Force equivalent\" FROM \"Table\" WHERE \"Commonwealth equivalent\" = \"Pilot Officer\";"}
+{"question_id": 31, "completion": "SELECT COUNT(\"Total W–L\") FROM \"Table\" WHERE \"Doubles W–L\" = \"11–11\";"}
+{"question_id": 42, "completion": "SELECT \"BTs retail price (regulated)\" FROM \"Table\" WHERE \"Tariff code\" = \"ff0 PRS\";"}
+{"question_id": 7, "completion": "SELECT \"Date\" FROM \"Table\" WHERE \"Circuit\" = \"Misano\";"}
+{"question_id": 28, "completion": "SELECT \"Rank Each wrestlers total number of days as champion are ranked highest to lowest; wrestlers with the same number mean that they are tied for that certain rank.\" FROM \"Table\" WHERE \"Wrestler\" = \"Go Shiozaki\";"}
+{"question_id": 36, "completion": "SELECT \"Frequency\" FROM \"Table\" WHERE \"Market/Rank\" = \"Burlington - Plattsburgh , Vermont - New York /143\";"}
+{"question_id": 23, "completion": "SELECT \"Rank in Spanish\" FROM \"Table\" WHERE \"Rank in English\" = \"Major\";"}
+{"question_id": 11, "completion": "SELECT COUNT(DISTINCT \"Nationality\") FROM \"Table\" WHERE \"NHL team\" = \"New Jersey Devils\";"}
+{"question_id": 47, "completion": "SELECT \"BTs retail price (regulated)\" FROM \"Table\" WHERE \"Tariff code\" = \"g10\";"}
+{"question_id": 12, "completion": "SELECT \"Pick\" FROM \"Table\" WHERE \"Player\" = \"Dorain Anneck\";"}
+{"question_id": 16, "completion": "SELECT \"#\" FROM \"Table\" WHERE \"Commissioned\" = \"December 18, 1965\";"}
+{"question_id": 27, "completion": "SELECT \"Combined days\" FROM \"Table\" WHERE \"Wrestler\" = \"Go Shiozaki\";"}
+{"question_id": 32, "completion": "SELECT COUNT(\"Singles W–L\") FROM \"Table\" WHERE \"Doubles W–L\" = \"11–14\";"}
+{"question_id": 22, "completion": "SELECT \"Commonwealth equivalent\" FROM \"Table\" WHERE \"US Air Force equivalent\" = \"Major General\";"}
+{"question_id": 43, "completion": "SELECT \"Approx premium\" FROM \"Table\" WHERE \"Tariff code\" = \"g9\";"}
+{"question_id": 49, "completion": "SELECT MIN(\"Radius (R ☉ )\") FROM \"Table\";"}
+{"question_id": 34, "completion": "SELECT MAX(\"Ties played\") FROM \"Table\" WHERE \"Player\" = \"Josip Palada Category:Articles with hCards\";"}
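
Note on the file format: each line of `examples/test_output.jsonl` is a standalone JSON object with a `question_id` and a `completion` holding the predicted SQL. A minimal sketch of reading such a file follows. It assumes only those two keys, which are the ones visible in the added lines; the helper name `load_predictions` is hypothetical and not part of the `llmsql` package.

```python
import json
from typing import Iterable


def load_predictions(lines: Iterable[str]) -> dict[int, str]:
    """Map question_id -> completion, skipping blank lines.

    Hypothetical helper for illustration; not part of llmsql.
    """
    predictions: dict[int, str] = {}
    for line_number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # Assumption: every record carries exactly these two keys,
        # as in the lines added by this commit.
        missing = {"question_id", "completion"} - set(record)
        if missing:
            raise ValueError(f"line {line_number}: missing keys {sorted(missing)}")
        predictions[int(record["question_id"])] = record["completion"]
    return predictions


# One record copied from the diff above (question_id 7).
sample = [
    '{"question_id": 7, "completion": "SELECT \\"Date\\" FROM \\"Table\\" WHERE \\"Circuit\\" = \\"Misano\\";"}',
]
predictions = load_predictions(sample)
print(predictions[7])
```

For a real file, the same function can be called with an open file handle, since iterating a text file yields its lines.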

llmsql/_cli/README.md

Lines changed: 73 additions & 0 deletions
@@ -0,0 +1,73 @@
+# LLMSQL CLI Manual
+
+The `llmsql` CLI provides two main workflows:
+
+- `inference`: generate SQL predictions with a selected backend.
+- `evaluate`: score predictions on the LLMSQL benchmark.
+
+## Command Structure
+
+```bash
+llmsql <command> [options]
+```
+
+Available commands:
+
+- `llmsql inference transformers ...`
+- `llmsql inference vllm ...`
+- `llmsql inference api ...`
+- `llmsql evaluate ...`
+
+## Inference Commands
+
+### 1) Transformers backend
+
+```bash
+llmsql inference transformers \
+--model-or-model-name-or-path Qwen/Qwen2.5-1.5B-Instruct \
+--output-file outputs/preds_transformers.jsonl
+```
+
+This command calls [`inference_transformers()`](../inference/inference_transformers.py).
+
+### 2) vLLM backend
+
+```bash
+llmsql inference vllm \
+--model-name Qwen/Qwen2.5-1.5B-Instruct \
+--output-file outputs/preds_vllm.jsonl
+```
+
+This command calls [`inference_vllm()`](../inference/inference_vllm.py).
+
+### 3) OpenAI-compatible API backend
+
+```bash
+llmsql inference api \
+--model-name gpt-5-mini \
+--base-url https://api.openai.com/v1 \
+--output-file outputs/preds_api.jsonl
+```
+
+This command calls [`inference_api()`](../inference/inference_api.py).
+
+## Evaluation Command
+
+```bash
+llmsql evaluate --outputs outputs/preds_transformers.jsonl
+```
+
+This command calls [`evaluate()`](../evaluation/evaluate.py).
+
+## Help
+
+Use built-in help to see all options:
+
+```bash
+llmsql --help
+llmsql inference --help
+llmsql inference transformers --help
+llmsql inference vllm --help
+llmsql inference api --help
+llmsql evaluate --help
+```
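
Note on the command layout: the manual describes a two-level structure (`llmsql <command>`, with `inference` further split by backend). One way such a layout could be expressed is with nested `argparse` subparsers, sketched below. This is an illustration under stated assumptions, not the implementation in `llmsql/_cli`: the flag names are taken from the manual's examples, while marking them as required, the `dest` names, and the `build_parser` function are all hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the commands listed in the manual."""
    parser = argparse.ArgumentParser(prog="llmsql")
    commands = parser.add_subparsers(dest="command", required=True)

    # `llmsql inference <backend> ...`
    inference = commands.add_parser("inference", help="generate SQL predictions")
    backends = inference.add_subparsers(dest="backend", required=True)

    transformers = backends.add_parser("transformers")
    transformers.add_argument("--model-or-model-name-or-path", required=True)
    transformers.add_argument("--output-file", required=True)

    vllm = backends.add_parser("vllm")
    vllm.add_argument("--model-name", required=True)
    vllm.add_argument("--output-file", required=True)

    api = backends.add_parser("api")
    api.add_argument("--model-name", required=True)
    api.add_argument("--base-url", required=True)
    api.add_argument("--output-file", required=True)

    # `llmsql evaluate ...`
    evaluate = commands.add_parser("evaluate", help="score predictions")
    evaluate.add_argument("--outputs", required=True)
    return parser


# Parse the vLLM example from the manual instead of reading sys.argv.
args = build_parser().parse_args(
    [
        "inference", "vllm",
        "--model-name", "Qwen/Qwen2.5-1.5B-Instruct",
        "--output-file", "outputs/preds_vllm.jsonl",
    ]
)
print(args.command, args.backend, args.model_name, args.output_file)
```

With this shape, `--help` works at every level (`llmsql --help`, `llmsql inference --help`, and so on) without extra code, which matches the help commands listed in the manual.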

llmsql/evaluation/evaluate.py

Lines changed: 5 additions & 1 deletion
@@ -14,7 +14,11 @@
 
 from rich.progress import track
 
-from llmsql.config.config import DEFAULT_WORKDIR_PATH, DEFAULT_LLMSQL_VERSION, get_repo_id
+from llmsql.config.config import (
+    DEFAULT_LLMSQL_VERSION,
+    DEFAULT_WORKDIR_PATH,
+    get_repo_id,
+)
 from llmsql.utils.evaluation_utils import (
     connect_sqlite,
     download_benchmark_file,
