SculptAI · Ki-Seki · Jun 1, 2026 · May 30, 2026
diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
@@ -8,6 +8,7 @@ repos:
       - id: check-case-conflict
       - id: check-toml
       - id: check-yaml
+        args: [--unsafe]
       - id: check-ast
       - id: debug-statements
       - id: check-docstring-first

diff --git a/.vscode/settings.json b/.vscode/settings.json
@@ -5,5 +5,8 @@
         "tests"
     ],
     "python.testing.unittestEnabled": false,
-    "python.testing.pytestEnabled": true
+    "python.testing.pytestEnabled": true,
+    "chat.tools.terminal.autoApprove": {
+        "make": true
+    }
 }
diff --git a/README.md b/README.md
@@ -14,96 +14,80 @@
 
 </p>
 
-## Installation
+**Guided Infilling Modeling Toolkit** — structured text generation and information extraction using language models.
 
-Install GIMKit using pip:
+Write a template with typed placeholders. The LLM fills them in. Get structured, named results back.
 
-```bash
-pip install gimkit
-```
+```python
+from gimkit import guide as g
 
-For vLLM support, install with the optional dependency:
+query = f"""Extract from: "Hi, I'm John Smith, reach me at john@gmail.com"
 
-```bash
-pip install gimkit[vllm]
-```
+Name: {g.person_name(name="name")}
+Email: {g.e_mail(name="email")}"""
 
-## Quick Start
+result = model(query, use_gim_prompt=True)  # `model` is created as in Quick Start
+result.tags["name"].content   # → "John Smith"
+result.tags["email"].content  # → "john@gmail.com"
+```
 
-Here's a simple example using the OpenAI backend:
+## Installation
 
-```python
-from openai import OpenAI
-from gimkit import from_openai, guide as g
+```bash
+pip install gimkit
+```
 
-# Initialize the client and model
-client = OpenAI()  # Uses OPENAI_API_KEY environment variable
-model = from_openai(client, model_name="gpt-4")
+For vLLM support:
 
-# Create a query with masked tags
-result = model(f"Hello, {g(desc='a single word')}!", use_gim_prompt=True)
-print(result)  # Output: Hello, world!
+```bash
+pip install gimkit[vllm]
 ```
 
-## Usage
-
-### Creating Masked Tags: Use the `guide` helper (imported as `g`) to create masked tags
+## What Can You Do With GIMKit?
 
-```python
-from gimkit import guide as g
+GIMKit is a **general-purpose information extraction framework**. Write a natural-language template with embedded tags, and the model extracts structured data from any text.
 
-# Basic tag with description
-tag = g(name="greeting", desc="A friendly greeting")
+| Use Case | Description |
+|----------|---------|
+| **Contact extraction** | Parse names, emails, phones from free-form text |
+| **Named entity recognition** | Extract orgs, people, locations, dates |
+| **Text classification** | Categorize text, assign sentiment labels |
+| **Event extraction** | Pull what/where/when/impact from event descriptions |
+| **Relation extraction** | Find entities and the relationships between them |
+| **Resume parsing** | Extract name, title, education, experience |
+| **Review analysis** | Parse product, price, rating, pros/cons |
 
-# Specialized tags
-name_tag = g.person_name(name="user_name")
-email_tag = g.e_mail(name="email")
-phone_tag = g.phone_number(name="phone")
-word_tag = g.single_word(name="word")
+See the [Classic IE Use Cases](https://sculptai.github.io/GIMKit/use-cases/classic/), [Privacy and PII Use Cases](https://sculptai.github.io/GIMKit/use-cases/privacy-pii/), and [Other Use Cases](https://sculptai.github.io/GIMKit/use-cases/others/) pages for full examples.
 
-# Selection from choices
-choice_tag = g.select(name="color", choices=["red", "green", "blue"])
+## Why GIMKit?
 
-# Tag with regex constraint
-custom_tag = g(name="code", desc="A 4-digit code", regex=r"\d{4}")
-```
+- **Template-driven** — describe what you want in natural language, not label lists
+- **Format control** — regex constraints, enumerated choices, type-safe tags
+- **Named access** — results are keyed by field name, not token positions
+- **Small-model friendly** — works with compact open-source models (4B+)
+- **Multiple backends** — OpenAI, vLLM (server and offline)
 
-### Building Queries: Combine masked tags with text to build queries
+## Quick Start
 
 ```python
-from gimkit import from_openai, guide as g
 from openai import OpenAI
+from gimkit import from_openai, guide as g
 
 client = OpenAI()
 model = from_openai(client, model_name="gpt-4")
 
+# Simple extraction
+result = model(f"Hello, {g(desc='a single word')}!", use_gim_prompt=True)
+print(result)  # e.g. Hello, world!
+
+# Structured form
 query = f"""
 Name: {g.person_name(name="name")}
 Email: {g.e_mail(name="email")}
 Favorite color: {g.select(name="color", choices=["red", "green", "blue"])}
 """
-
-result = model(query, use_gim_prompt=True)
-print(result)
-```
-
-### Accessing Results: Access filled tags from the result
-
-```python
 result = model(query, use_gim_prompt=True)
-
-# Iterate over all tags
-for tag in result.tags:
-    print(f"{tag.name}: {tag.content}")
-
-# Access by name
 print(result.tags["name"].content)
-
-# Modify tag content
-result.tags["email"].content = "REDACTED"
+print(result.tags["email"].content)
+print(result.tags["color"].content)
 ```
-
-## Design Philosophy
-
-- Stable over feature
-- Small open-source model first
diff --git a/docs/api.zh.md b/docs/api.zh.md
@@ -0,0 +1,35 @@
+# API 参考
+
+本页由源代码中的 docstring 通过 `mkdocstrings` 生成。页面上的说明文字使用中文，具体的对象文档仍来自代码中的原始注释。
+
+## 包
+
+::: gimkit
+
+## 核心模块
+
+::: gimkit.guides
+
+::: gimkit.schemas
+
+::: gimkit.contexts
+
+::: gimkit.dsls
+
+::: gimkit.prompts
+
+::: gimkit.log
+
+::: gimkit.exceptions
+
+## 模型后端
+
+::: gimkit.models.base
+
+::: gimkit.models.openai
+
+::: gimkit.models.vllm
+
+::: gimkit.models.vllm_offline
+
+::: gimkit.models.utils
diff --git a/docs/index.md b/docs/index.md
@@ -1,6 +1,6 @@
 # GIMKit
 
-**Guided Infilling Modeling Toolkit** — precise structured text generation using language models.
+**Guided Infilling Modeling Toolkit** — structured text generation and information extraction using language models.
 
 GIMKit lets you define placeholders (masked tags) in text and have a language model fill them in. It gives you fine-grained control over model outputs through a typed tag system with optional regex constraints.
 
@@ -10,15 +10,29 @@ GIMKit lets you define placeholders (masked tags) in text and have a language mo
 
 ---
 
+## What Can You Do With GIMKit?
+
+GIMKit is a **general-purpose information extraction framework**. Write a natural-language template with embedded typed placeholders, and the model extracts structured data from any unstructured text.
+
+| Use Case | Description |
+|----------|-------------|
+| **Contact extraction** | Parse names, emails, phone numbers from free-form text |
+| **Named entity recognition** | Extract organizations, people, locations, dates |
+| **Text classification** | Categorize text into labels, assign sentiment |
+| **Event extraction** | Pull structured event info (what/where/when/impact) |
+| **Relation extraction** | Find entities and the relationships between them |
+| **Resume / CV parsing** | Extract candidate name, title, education, experience |
+| **Product review analysis** | Parse product, price, rating, pros and cons |
+| **Privacy & PII protection** | Extract, classify, redact, and filter PII |
+
+See the [Classic IE Use Cases](use-cases/classic.md), [Privacy and PII Use Cases](use-cases/privacy-pii.md), and [Other Use Cases](use-cases/others.md) pages for full code examples.
+
+---
+
 ## Features
 
 - **Masked tag system** — embed typed placeholders directly in f-strings.
 - **Regex constraints** — restrict model output to specific patterns.
 - **Named access** — retrieve results by tag name or index.
 - **Multiple backends** — OpenAI, vLLM (server and offline).
 - **Small-model friendly** — designed to work well with compact open-source models.
-
-## Design Philosophy
-
-- **Stable over feature** — reliability and correctness are prioritized above new features.
-- **Small open-source model first** — designed to work well with small, freely available language models.
diff --git a/docs/index.zh.md b/docs/index.zh.md
@@ -0,0 +1,38 @@
+# GIMKit
+
+**Guided Infilling Modeling Toolkit** — 基于语言模型的结构化文本生成与信息抽取工具。
+
+GIMKit 允许你在文本中定义占位符（masked tags），由语言模型来填充。通过类型化的标签系统和可选的正则约束，实现对模型输出的精细控制。
+
+[![PyPI Version](https://img.shields.io/pypi/v/gimkit?label=pypi%20package)](https://pypi.org/project/gimkit)
+[![Python Versions](https://img.shields.io/pypi/pyversions/gimkit.svg)](https://pypi.org/project/gimkit)
+[![Platform](https://img.shields.io/badge/Platform-Linux%20%7C%20macOS%20%7C%20Windows-lightgrey)](https://pypi.org/project/gimkit)
+
+---
+
+## GIMKit 能做什么？
+
+GIMKit 是一个**通用信息抽取框架**。用自然语言写一个模板，嵌入类型化的占位符，模型就能从任意非结构化文本中提取结构化数据。
+
+| 应用场景 | 说明 |
+|----------|------|
+| **联系人提取** | 从自由文本中解析姓名、邮箱、电话 |
+| **命名实体识别** | 提取组织、人物、地点、日期 |
+| **文本分类** | 对文本进行分类、情感标注 |
+| **事件抽取** | 提取结构化事件信息（何事/何地/何时/影响） |
+| **关系抽取** | 发现实体及其之间的关系 |
+| **简历解析** | 提取候选人姓名、职位、学历、经验 |
+| **评论分析** | 解析产品名、价格、评分、优缺点 |
+| **隐私与 PII 保护** | 提取、分类、脱敏和过滤个人信息 |
+
+完整代码示例见 [经典信息抽取案例](use-cases/classic.zh.md)、[隐私与 PII 案例](use-cases/privacy-pii.zh.md) 和 [其他应用案例](use-cases/others.zh.md) 页面。
+
+---
+
+## 特性
+
+- **标签系统** — 直接在 f-string 中嵌入类型化占位符。
+- **正则约束** — 将模型输出限制为特定模式。
+- **按名访问** — 通过标签名或索引获取结果。
+- **多后端支持** — OpenAI、vLLM（服务端和离线模式）。
+- **小模型友好** — 专为小型开源模型设计。
diff --git a/docs/installation.zh.md b/docs/installation.zh.md
@@ -0,0 +1,25 @@
+# 安装
+
+## 标准安装
+
+使用 pip 安装 GIMKit：
+
+```bash
+pip install gimkit
+```
+
+## 支持 vLLM
+
+安装时附带可选的 `vllm` 依赖以启用 vLLM 后端：
+
+```bash
+pip install gimkit[vllm]
+```
+
+!!! note
+    vLLM 仅支持 Linux。在 Windows 和 macOS 上请省略 `[vllm]` 选项。
+
+## 系统要求
+
+- Python 3.10 或更高版本
+- Linux、macOS 或 Windows
diff --git a/docs/models/index.md b/docs/models/index.md
@@ -0,0 +1,50 @@
+# Model Usage Overview
+
+This page compares supported clients and explains when to use each mode.
+
+## Client Comparison
+
+| Client | Constructor | Best for |
+|---|---|---|
+| OpenAI | `from_openai(client, model_name=...)` | Hosted OpenAI-compatible APIs |
+| vLLM (Server) | `from_vllm(client, model_name=...)` | OpenAI-compatible vLLM HTTP server |
+| vLLM (Offline) | `from_vllm_offline(llm)` | Local offline inference with `vllm.LLM` |
+
+## Support Matrix
+
+| Capability | OpenAI | vLLM (Server) | vLLM (Offline) |
+|---|---|---|---|
+| `use_gim_prompt=True` | Recommended | Only for non-GIM models | Only for non-GIM models |
+| `output_type=None` | Fallback when JSON is unsupported | Available but not recommended | Available but not recommended |
+| `output_type="cfg"` | Not available | Recommended | Recommended |
+| `output_type="json"` | Yes | Yes | Yes |
+
+## Initialization Differences
+
+- OpenAI and vLLM server mode both take an OpenAI-compatible client object.
+- vLLM offline mode takes a `vllm.LLM` instance, not an OpenAI client.
+- For vLLM server mode, create the client with `base_url` pointing to your server.
+
+## Prompt Usage Recommendation
+
+- Most local workflows use GIM-trained models, for which `use_gim_prompt=False` is preferred.
+- For non-GIM-trained models, enable `use_gim_prompt=True`.
+- For OpenAI paths, prefer `use_gim_prompt=True`.
+
+## Output Type Guide
+
+### OpenAI
+
+- Prefer `output_type="json"`.
+- If your OpenAI provider does not support JSON constraints, use `output_type=None`.
+
+### vLLM (Server / Offline)
+
+- Prefer `output_type="cfg"` for both GIM-trained and non-GIM models.
+- `output_type="json"` is available when JSON output is specifically needed.
+
+## Common Optional Flags
+
+- `include_grammar=True`: include the grammar text in the query input.
+- `backend`: choose the Outlines backend implementation.
+- `**inference_kwargs`: pass generation parameters to the underlying backend.
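
Reviewer note on `docs/models/index.md`: the Support Matrix and the prompt and output-type recommendations above can be restated as a small lookup, which may help when checking the tables for consistency. The sketch below is only an illustration. `recommended_options` and the backend labels `"openai"`, `"vllm"`, and `"vllm_offline"` are hypothetical names chosen for this example and are not part of GIMKit; the function does not import or call the library.

```python
from typing import Optional, TypedDict


class CallOptions(TypedDict):
    use_gim_prompt: bool
    output_type: Optional[str]


def recommended_options(
    backend: str, gim_trained: bool = False, json_supported: bool = True
) -> CallOptions:
    """Hypothetical helper (not part of GIMKit) restating the guide's advice."""
    if backend == "openai":
        # OpenAI path: GIM prompt on; JSON preferred, None as the fallback
        # when the provider does not support JSON constraints.
        return {
            "use_gim_prompt": True,
            "output_type": "json" if json_supported else None,
        }
    if backend in ("vllm", "vllm_offline"):
        # vLLM paths: CFG preferred for all models; the GIM prompt is only
        # enabled for models that were not GIM-trained.
        return {"use_gim_prompt": not gim_trained, "output_type": "cfg"}
    raise ValueError(f"unknown backend: {backend!r}")


print(recommended_options("openai"))
print(recommended_options("vllm", gim_trained=True))
```

Assuming the model call accepts `use_gim_prompt` and `output_type` as keyword arguments, as the Support Matrix suggests, the returned dict could be passed along as `model(query, **recommended_options("vllm", gim_trained=True))`.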