Commit ddb0be5

author
SentienceDEV
committed
add specs
1 parent 5b2c68b commit ddb0be5

4 files changed

+687
-0
lines changed

spec/README.md

Lines changed: 72 additions & 0 deletions
@@ -0,0 +1,72 @@
# Sentience API Specification

This directory contains the **single source of truth** for the API contract between the Chrome extension and SDKs.

## Files

- **`snapshot.schema.json`** - JSON Schema for snapshot response validation
- **`SNAPSHOT_V1.md`** - Human-readable snapshot API contract
- **`sdk-types.md`** - SDK-level type definitions (ActionResult, WaitResult, TraceStep)

## Purpose

These specifications ensure:
1. **Consistency**: Both Python and TypeScript SDKs implement the same contract
2. **Validation**: SDKs can validate extension responses
3. **Type Safety**: Strong typing in both languages
4. **Documentation**: Clear reference for developers

## Usage

### For SDK Developers

1. **Read** `SNAPSHOT_V1.md` for human-readable contract
2. **Use** `snapshot.schema.json` for JSON Schema validation
3. **Reference** `sdk-types.md` for SDK-level types

### For Extension Developers

1. **Ensure** extension output matches `snapshot.schema.json`
2. **Update** schema when adding new fields
3. **Version** schema for breaking changes

## Versioning

- **v1.0.0**: Initial stable version (Day 1)
- Future versions: Increment major version for breaking changes
- SDKs should validate version and handle compatibility
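
Since breaking changes increment the major version, one way an SDK might handle the compatibility check is to compare major versions only. The sketch below is illustrative: `SUPPORTED_MAJOR`, `parse_version`, and `is_compatible` are hypothetical names, and where the version string comes from is left to the SDK (no version field is shown in the snapshot response here).

```python
SUPPORTED_MAJOR = 1  # assumption: this SDK build targets the v1 contract


def parse_version(version: str) -> tuple[int, int, int]:
    """Split a 'MAJOR.MINOR.PATCH' string (optionally prefixed with 'v') into integers."""
    major, minor, patch = version.lstrip("v").split(".")
    return int(major), int(minor), int(patch)


def is_compatible(version: str, supported_major: int = SUPPORTED_MAJOR) -> bool:
    """Treat any version that shares the supported major number as compatible."""
    return parse_version(version)[0] == supported_major
```

With this rule, `1.0.0` and `1.4.2` are both accepted by a v1 SDK, while `2.0.0` is rejected.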
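
## Validation

Both SDKs should validate extension responses:

**Python**:
```python
import json
import jsonschema

# Load the schema file directly; it is JSON, not an importable Python module
with open("spec/snapshot.schema.json", encoding="utf-8") as f:
    schema = json.load(f)

# Raises jsonschema.ValidationError when snapshot_data violates the contract
jsonschema.validate(snapshot_data, schema)
```

**TypeScript**:
```typescript
import Ajv from 'ajv';
// Importing JSON requires "resolveJsonModule": true in tsconfig.json
import schema from './spec/snapshot.schema.json';

const ajv = new Ajv();
const validate = ajv.compile(schema);

// validate() returns a boolean; failure details are exposed on validate.errors
if (!validate(snapshot_data)) {
  throw new Error(`Invalid snapshot: ${ajv.errorsText(validate.errors)}`);
}
```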

## Testing

- Validate against real extension output
- Test with edge cases (empty pages, many elements, errors)
- Verify type coercion and defaults

---

**Last Updated**: Day 1 Implementation
**Status**: ✅ Stable

spec/SNAPSHOT_V1.md

Lines changed: 208 additions & 0 deletions
@@ -0,0 +1,208 @@
# Sentience Snapshot API Contract v1

**Version**: 1.0.0
**Last Updated**: [Current Date]
**Status**: Stable

This document defines the **single source of truth** for the snapshot data structure returned by `window.sentience.snapshot()`. Both Python and TypeScript SDKs must implement this contract exactly.

## Overview

The snapshot API returns a structured representation of the current page state, including:
- All interactive elements with semantic roles
- Element positions (bounding boxes)
- Importance scores (AI-optimized ranking)
- Visual cues (primary actions, colors, clickability)
- Optional screenshot
18+
## Response Structure
19+
20+
### Top-Level Object
21+
22+
```typescript
23+
{
24+
status: "success" | "error",
25+
timestamp?: string, // ISO 8601
26+
url: string,
27+
viewport?: { width: number, height: number },
28+
elements: Element[],
29+
screenshot?: string, // Base64 data URL
30+
screenshot_format?: "png" | "jpeg",
31+
error?: string, // If status is "error"
32+
requires_license?: boolean // If license required
33+
}
34+
```

### Element Object

```typescript
{
  id: number,               // REQUIRED: Unique identifier (registry index)
  role: string,             // REQUIRED: Semantic role
  text: string | null,      // Text content, aria-label, or placeholder
  importance: number,       // REQUIRED: Importance score (-300 to ~1800)
  bbox: BBox,               // REQUIRED: Bounding box
  visual_cues: VisualCues,  // REQUIRED: Visual analysis
  in_viewport?: boolean,    // Optional (default true): Is element visible in viewport
  is_occluded?: boolean,    // Optional (default false): Is element covered by overlay
  z_index?: number          // Optional (default 0): CSS z-index (0 if auto)
}
```

### BBox (Bounding Box)

```typescript
{
  x: number,      // Left edge in pixels
  y: number,      // Top edge in pixels
  width: number,  // Width in pixels
  height: number  // Height in pixels
}
```

### VisualCues

```typescript
{
  is_primary: boolean,                   // Visually prominent primary action
  background_color_name: string | null,  // Named color from palette
  is_clickable: boolean                  // Has pointer cursor or actionable role
}
```

## Field Details

### `id` (required)
- **Type**: `integer`
- **Description**: Unique element identifier, corresponds to index in `window.sentience_registry`
- **Usage**: Used for actions like `click(id)`
- **Stability**: May change between page loads (not persistent)

### `role` (required)
- **Type**: `string`
- **Values**: `"button"`, `"link"`, `"textbox"`, `"searchbox"`, `"checkbox"`, `"radio"`, `"combobox"`, `"image"`, `"generic"`
- **Description**: Semantic role inferred from HTML tag, ARIA attributes, and context
- **Usage**: Primary filter for query engine

### `text` (optional)
- **Type**: `string | null`
- **Description**: Text content extracted from element:
  - `aria-label` if present
  - `value` or `placeholder` for inputs
  - `alt` for images
  - `innerText` for other elements (truncated to 100 chars)
- **Usage**: Text matching in query engine
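
The extraction itself happens in the extension, but the listed sources can be illustrated with a small Python sketch. It assumes the sources are tried in the order listed and the first non-empty one wins, which the contract does not state explicitly. The function name `extract_text` and the dict keys `tag` and `innerText` are hypothetical.

```python
from typing import Optional

MAX_INNER_TEXT = 100  # truncation limit stated above for innerText


def extract_text(attrs: dict) -> Optional[str]:
    """Pick the text for an element described by a plain dict of attributes.

    Assumption: sources are tried in the listed order; first non-empty wins.
    """
    if attrs.get("aria-label"):
        return attrs["aria-label"]
    if attrs.get("tag") in ("input", "textarea"):
        return attrs.get("value") or attrs.get("placeholder") or None
    if attrs.get("tag") == "img":
        return attrs.get("alt") or None
    # Fallback for other elements: visible text, truncated to the limit
    inner = (attrs.get("innerText") or "").strip()
    return inner[:MAX_INNER_TEXT] or None
```

An element with no usable source yields `None`, which matches the `string | null` type of the field.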

### `importance` (required)
- **Type**: `integer`
- **Range**: -300 to ~1800
- **Description**: AI-optimized importance score calculated from:
  - Role priority (inputs: 1000, buttons: 500, links: 100)
  - Area score (larger elements score higher, capped at 200)
  - Visual prominence (+200 for primary actions)
  - Viewport/occlusion penalties (-500 off-screen, -800 occluded)
- **Usage**: Ranking and filtering elements
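
To make the arithmetic concrete, here is a minimal sketch that simply sums the components listed above. It is not the extension's actual scoring code. Which roles count as "inputs" and a priority of 0 for unlisted roles are assumptions, and the area score is passed in directly because its derivation from pixel area is not specified here.

```python
# Assumption: "inputs" means textbox and searchbox; unlisted roles get 0
ROLE_PRIORITY = {"textbox": 1000, "searchbox": 1000, "button": 500, "link": 100}
AREA_SCORE_CAP = 200
PRIMARY_BONUS = 200
OFFSCREEN_PENALTY = 500
OCCLUDED_PENALTY = 800


def importance_score(
    role: str,
    area_score: int,
    is_primary: bool = False,
    in_viewport: bool = True,
    is_occluded: bool = False,
) -> int:
    """Sum the listed components into a single integer score."""
    score = ROLE_PRIORITY.get(role, 0)
    score += min(area_score, AREA_SCORE_CAP)  # larger elements score higher, capped
    if is_primary:
        score += PRIMARY_BONUS
    if not in_viewport:
        score -= OFFSCREEN_PENALTY
    if is_occluded:
        score -= OCCLUDED_PENALTY
    return score
```

Under these assumptions, a visible primary button with an area score of 150 comes out at 500 + 150 + 200 = 850.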

### `bbox` (required)
- **Type**: `BBox` object
- **Description**: Element position and size in viewport coordinates
- **Coordinates**: Relative to viewport (0,0) at top-left
- **Usage**: Spatial queries, visual grounding, click coordinates
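
As an illustration of the spatial use cases, the sketch below derives a click point and a simple hit test from a `bbox` dict. Clicking the center of the box is a common convention assumed here, not something the contract prescribes. `bbox_center` and `bbox_contains` are hypothetical helper names.

```python
def bbox_center(bbox: dict) -> tuple[float, float]:
    """Return the center point of a bbox in viewport coordinates."""
    return (bbox["x"] + bbox["width"] / 2, bbox["y"] + bbox["height"] / 2)


def bbox_contains(bbox: dict, px: float, py: float) -> bool:
    """Check whether a viewport point lies inside the bbox (edges included)."""
    return (
        bbox["x"] <= px <= bbox["x"] + bbox["width"]
        and bbox["y"] <= py <= bbox["y"] + bbox["height"]
    )
```

For a box at `x=100, y=200` with `width=120, height=40`, the center is `(160.0, 220.0)`.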

### `visual_cues` (required)
- **Type**: `VisualCues` object
- **Description**: Visual analysis results
- **Fields**:
  - `is_primary`: True if element is visually prominent primary action
  - `background_color_name`: Nearest named color (32-color palette) or null
  - `is_clickable`: True if element has pointer cursor or actionable role

### `in_viewport` (optional)
- **Type**: `boolean`
- **Description**: True if element is visible in current viewport
- **Default**: `true` (if not present, assume visible)

### `is_occluded` (optional)
- **Type**: `boolean`
- **Description**: True if element is covered by another element
- **Default**: `false` (if not present, assume not occluded)

### `z_index` (optional)
- **Type**: `integer`
- **Description**: CSS z-index value (0 if "auto" or not set)
- **Default**: `0`
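
The required/optional split above can be applied when an SDK parses an element. The following is a minimal standard-library sketch, not the SDK's implementation (which would more likely use Pydantic). `normalize_element` is a hypothetical name, and defaulting a missing `text` to `None` is an assumption, since no default is stated for that field.

```python
REQUIRED_FIELDS = ("id", "role", "importance", "bbox", "visual_cues")

# Defaults from the field descriptions; the `text` default is an assumption
OPTIONAL_DEFAULTS = {
    "text": None,
    "in_viewport": True,
    "is_occluded": False,
    "z_index": 0,
}


def normalize_element(raw: dict) -> dict:
    """Reject elements missing required fields and fill in optional defaults."""
    missing = [name for name in REQUIRED_FIELDS if name not in raw]
    if missing:
        raise ValueError(f"element is missing required fields: {missing}")
    # Values present in `raw` override the defaults; field names are preserved
    return {**OPTIONAL_DEFAULTS, **raw}
```

Merging the defaults first and the raw element second keeps every field the extension sent and only fills the gaps.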

## Element Sorting

Elements in the `elements` array are sorted by:
1. **Primary sort**: `importance` (descending) - most important first
2. **Secondary sort**: `bbox.y` (ascending) - top-to-bottom reading order (if limit applied)
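
The ordering can be expressed as a single composite sort key. This sketch assumes the secondary key acts as a tie-breaker among elements of equal importance and applies it unconditionally; the "if limit applied" condition is not modeled. `sort_elements` is a hypothetical helper name.

```python
def sort_elements(elements: list[dict]) -> list[dict]:
    """Order by importance (descending), then by bbox.y (ascending)."""
    return sorted(elements, key=lambda el: (-el["importance"], el["bbox"]["y"]))
```

Negating `importance` yields descending order for the first key while `bbox.y` stays ascending for the second.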

## Example Response

```json
{
  "status": "success",
  "timestamp": "2025-01-20T10:30:00Z",
  "url": "https://example.com",
  "viewport": {
    "width": 1280,
    "height": 800
  },
  "elements": [
    {
      "id": 42,
      "role": "button",
      "text": "Sign In",
      "importance": 850,
      "bbox": {
        "x": 100,
        "y": 200,
        "width": 120,
        "height": 40
      },
      "visual_cues": {
        "is_primary": true,
        "background_color_name": "blue",
        "is_clickable": true
      },
      "in_viewport": true,
      "is_occluded": false,
      "z_index": 0
    }
  ]
}
```

## Error Response

```json
{
  "status": "error",
  "error": "Headless mode requires a valid license key...",
  "requires_license": true
}
```
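
An SDK will typically want to turn an error response into an exception before touching `elements`. The sketch below shows one possible shape. `SnapshotError` and `ensure_success` are hypothetical names, not part of the contract.

```python
class SnapshotError(Exception):
    """Raised when the extension reports status == "error"."""

    def __init__(self, message: str, requires_license: bool = False):
        super().__init__(message)
        self.requires_license = requires_license


def ensure_success(response: dict) -> dict:
    """Return the response unchanged, or raise SnapshotError on an error status."""
    if response.get("status") == "error":
        raise SnapshotError(
            response.get("error", "unknown error"),
            requires_license=bool(response.get("requires_license", False)),
        )
    return response
```

Carrying `requires_license` on the exception lets callers distinguish licensing problems from other failures without parsing the message text.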

## SDK Implementation Requirements

Both Python and TypeScript SDKs must:

1. **Validate** snapshot response against this schema
2. **Parse** all required fields correctly
3. **Handle** optional fields gracefully (defaults)
4. **Type-check** all fields (Pydantic for Python, TypeScript types for TS)
5. **Preserve** field names exactly (no renaming)

## Versioning

- **v1.0.0**: Initial stable version
- Future versions will increment major version for breaking changes
- SDKs should validate version and handle compatibility

## Related Documents

- `snapshot.schema.json` - JSON Schema validation
- Extension implementation: `sentience-chrome/injected_api.js`
- WASM implementation: `sentience-chrome/src/lib.rs`
