Merge pull request #57 from SentienceAPI/refactor_phase1.3_1.4

rcholic · web-flow · commit 127f610f1ee0 · 2025-12-26T08:38:44.000-08:00
Phase 1.3 - 1.4: updated readme, SDK manual
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -0,0 +1,118 @@
+# Changelog
+
+All notable changes to the Sentience Python SDK will be documented in this file.
+
+The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
+and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+
+## [0.12.0] - 2025-12-26
+
+### Added
+
+#### Agent Tracing & Debugging
+- **New Module: `sentience.tracing`** - Built-in tracing infrastructure for debugging and analyzing agent behavior
+  - `Tracer` class for recording agent execution
+  - `TraceSink` abstract base class for custom trace storage
+  - `JsonlTraceSink` for saving traces to JSONL files
+  - `TraceEvent` dataclass for structured trace events
+  - Trace events: `step_start`, `snapshot`, `llm_query`, `action`, `step_end`, `error`
+- **New Module: `sentience.agent_config`** - Centralized agent configuration
+  - `AgentConfig` dataclass with defaults for snapshot limits, LLM settings, screenshot options
+- **New Module: `sentience.utils`** - Snapshot digest utilities
+  - `compute_snapshot_digests()` - Generate SHA-256 fingerprints for loop detection
+  - `canonical_snapshot_strict()` - Digest including element text
+  - `canonical_snapshot_loose()` - Digest excluding text (layout only)
+  - `sha256_digest()` - Hash computation helper
+- **New Module: `sentience.formatting`** - LLM prompt formatting
+  - `format_snapshot_for_llm()` - Convert snapshots to LLM-friendly text format
+- **Schema File: `sentience/schemas/trace_v1.json`** - JSON Schema for trace events, bundled with package
+
+#### Enhanced SentienceAgent
+- Added optional `tracer` parameter to `SentienceAgent.__init__()` for execution tracking
+- Added optional `config` parameter to `SentienceAgent.__init__()` for advanced configuration
+- Automatic tracing throughout the `act()` method when a tracer is provided
+- All tracing is **opt-in** - backward compatible with existing code
+
+### Changed
+- Bumped version from `0.11.0` to `0.12.0`
+- Updated `__init__.py` to export the new public names: `AgentConfig`, `Tracer`, `TraceSink`, `JsonlTraceSink`, `TraceEvent`, and the utility functions
+- Added `MANIFEST.in` to include JSON schema files in package distribution
+
+### Fixed
+- Fixed linting errors across multiple files:
+  - `sentience/cli.py` - Removed unused variable `code` (F841)
+  - `sentience/inspector.py` - Removed unused imports (F401)
+  - `tests/test_inspector.py` - Removed unused `pytest` import (F401)
+  - `tests/test_recorder.py` - Removed unused imports (F401)
+  - `tests/test_smart_selector.py` - Removed unused `pytest` import (F401)
+  - `tests/test_stealth.py` - Added `noqa` comments for intentional violations (E402, C901, F541)
+  - `tests/test_tracing.py` - Removed unused `TraceSink` import (F401)
+
+### Documentation
+- Updated `README.md` with comprehensive "Advanced Features" section covering tracing and utilities
+- Updated `docs/SDK_MANUAL.md` to v0.12.0 with new "Agent Tracing & Debugging" section
+- Added examples for:
+  - Basic tracing setup
+  - AgentConfig usage
+  - Snapshot digests for loop detection
+  - LLM prompt formatting
+  - Custom trace sinks
+
+### Testing
+- Added comprehensive test suites for new modules:
+  - `tests/test_tracing.py` - 10 tests for tracing infrastructure
+  - `tests/test_utils.py` - 22 tests for digest utilities
+  - `tests/test_formatting.py` - 9 tests for LLM formatting
+  - `tests/test_agent_config.py` - 9 tests for configuration
+- All 50 new tests passing ✅
+
+### Migration Guide
+
+#### For Existing Users
+No breaking changes! All new features are opt-in:
+
+```python
+# Old code continues to work exactly the same
+agent = SentienceAgent(browser, llm)
+agent.act("Click the button")
+
+# New optional features
+tracer = Tracer(run_id="run-123", sink=JsonlTraceSink("trace.jsonl"))
+config = AgentConfig(snapshot_limit=100, temperature=0.5)
+agent = SentienceAgent(browser, llm, tracer=tracer, config=config)
+agent.act("Click the button")  # Now traced!
+```
+
+#### Importing New Modules
+
+```python
+# Tracing
+from sentience.tracing import Tracer, JsonlTraceSink, TraceEvent, TraceSink
+
+# Configuration
+from sentience.agent_config import AgentConfig
+
+# Utilities
+from sentience.utils import (
+    compute_snapshot_digests,
+    canonical_snapshot_strict,
+    canonical_snapshot_loose,
+    sha256_digest
+)
+
+# Formatting
+from sentience.formatting import format_snapshot_for_llm
+```
+
+### Notes
+- This release focuses on developer experience and debugging capabilities
+- No changes to browser automation APIs
+- No changes to snapshot APIs
+- No changes to query/action APIs
+- Fully backward compatible with v0.11.0
+
+---
+
+## [0.11.0] - Previous Release
+
+(Previous changelog entries would go here)
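
The changelog above lists strict and loose snapshot digests for loop detection. A minimal, self-contained sketch of that idea follows. It does not import the SDK: the element fields (`role`, `text`, `bbox`) and the names `strict_digest`, `loose_digest`, and `is_looping` are assumptions made for illustration, not the `sentience.utils` implementation.

```python
import hashlib
import json


def _digest(rows):
    """Hash a list of JSON-serializable rows in a stable (sorted) order."""
    payload = json.dumps(sorted(rows), separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def strict_digest(elements):
    """Fingerprint that changes when layout or text changes."""
    return _digest([[e["role"], e["text"], *e["bbox"]] for e in elements])


def loose_digest(elements):
    """Fingerprint that ignores text and reflects layout only."""
    return _digest([[e["role"], *e["bbox"]] for e in elements])


def is_looping(digests, window=3):
    """True when the last `window` digests are all identical."""
    recent = digests[-window:]
    return len(recent) == window and len(set(recent)) == 1


# Hypothetical elements: same position and role, different label text.
page_a = [{"role": "button", "text": "Sign In", "bbox": [100, 50, 80, 24]}]
page_b = [{"role": "button", "text": "Log In", "bbox": [100, 50, 80, 24]}]
```

Under these assumptions the two pages share a loose digest but differ in their strict digest, so an agent could treat a run of identical strict digests as a sign that its actions are not changing the page.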
diff --git a/README.md b/README.md
@@ -456,6 +456,86 @@ cd sentience-chrome
 - Check visibility: `element.in_viewport and not element.is_occluded`
 - Scroll to element: `browser.page.evaluate(f"window.sentience_registry[{element.id}].scrollIntoView()")`
 
+## Advanced Features (v0.12.0+)
+
+### Agent Tracing & Debugging
+
+The SDK now includes built-in tracing infrastructure for debugging and analyzing agent behavior:
+
+```python
+from sentience import SentienceBrowser, SentienceAgent
+from sentience.llm_provider import OpenAIProvider
+from sentience.tracing import Tracer, JsonlTraceSink
+from sentience.agent_config import AgentConfig
+
+# Create tracer to record agent execution
+tracer = Tracer(
+    run_id="my-agent-run-123",
+    sink=JsonlTraceSink("trace.jsonl")
+)
+
+# Configure agent behavior
+config = AgentConfig(
+    snapshot_limit=50,
+    temperature=0.0,
+    max_retries=1,
+    capture_screenshots=True
+)
+
+browser = SentienceBrowser()
+llm = OpenAIProvider(api_key="your-key", model="gpt-4o")
+
+# Pass tracer and config to agent
+agent = SentienceAgent(browser, llm, tracer=tracer, config=config)
+
+with browser:
+    browser.page.goto("https://example.com")
+
+    # All actions are automatically traced
+    agent.act("Click the sign in button")
+    agent.act("Type 'user@example.com' into email field")
+
+# Trace events saved to trace.jsonl
+# Events: step_start, snapshot, llm_query, action, step_end, error
+```
+
+**Trace Events Captured:**
+- `step_start` - Agent begins executing a goal
+- `snapshot` - Page state captured
+- `llm_query` - LLM decision made (includes token counts, model, and the response truncated to 200 characters)
+- `action` - Action executed (click, type, press)
+- `step_end` - Step finished without raising (includes a `success` flag, duration, and action)
+- `error` - Error occurred during execution
+
+**Use Cases:**
+- Debug why an agent failed or got stuck
+- Analyze token usage and costs
+- Replay agent sessions
+- Train custom models from successful runs
+- Monitor production agents
+
+### Snapshot Utilities
+
+New utility functions for working with snapshots:
+
+```python
+from sentience import snapshot
+from sentience.utils import compute_snapshot_digests, canonical_snapshot_strict
+from sentience.formatting import format_snapshot_for_llm
+
+snap = snapshot(browser)
+
+# Compute snapshot fingerprints (detect page changes)
+digests = compute_snapshot_digests(snap.elements)
+print(f"Strict digest: {digests['strict']}")  # Changes when text changes
+print(f"Loose digest: {digests['loose']}")   # Only changes when layout changes
+
+# Format snapshot for LLM prompts
+llm_context = format_snapshot_for_llm(snap, limit=50)
+print(llm_context)
+# Output: [1] <button> "Sign In" {PRIMARY,CLICKABLE} @ (100,50) (Imp:10)
+```
+
 ## Documentation
 
 - **📖 [Amazon Shopping Guide](../docs/AMAZON_SHOPPING_GUIDE.md)** - Complete tutorial with real-world example
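
The README section above writes trace events to `trace.jsonl` and lists token-usage analysis as a use case. The sketch below illustrates that flow without the SDK. `JsonlSink`, `MiniTracer`, and `summarize` are hypothetical stand-ins, and the record layout (`run_id`, `seq`, `type`, `step_id`, `data`) is an assumption; only the `emit(event_type, data, step_id=...)` call shape and the `prompt_tokens` / `completion_tokens` fields are taken from this diff. The actual on-disk format is defined by `sentience/schemas/trace_v1.json`, which is not shown here.

```python
import io
import json
from collections import Counter


class JsonlSink:
    """Writes one JSON object per line to any text stream (hypothetical)."""

    def __init__(self, stream):
        self.stream = stream

    def write(self, event):
        self.stream.write(json.dumps(event) + "\n")


class MiniTracer:
    """Stand-in tracer mirroring the emit(type, data, step_id=...) call shape."""

    def __init__(self, run_id, sink):
        self.run_id = run_id
        self.sink = sink
        self.seq = 0

    def emit(self, event_type, data, step_id=None):
        self.seq += 1
        self.sink.write(
            {"run_id": self.run_id, "seq": self.seq, "type": event_type,
             "step_id": step_id, "data": data}
        )


def summarize(lines):
    """Count events by type and total the token fields of llm_query events."""
    counts, tokens = Counter(), 0
    for line in lines:
        event = json.loads(line)
        counts[event["type"]] += 1
        if event["type"] == "llm_query":
            data = event["data"]
            tokens += data["prompt_tokens"] + data["completion_tokens"]
    return counts, tokens


# An in-memory stream stands in for trace.jsonl.
buffer = io.StringIO()
tracer = MiniTracer("run-123", JsonlSink(buffer))
tracer.emit("snapshot", {"element_count": 12}, step_id="step-1")
tracer.emit("llm_query", {"prompt_tokens": 300, "completion_tokens": 20}, step_id="step-1")
tracer.emit("action", {"action": "click", "success": True}, step_id="step-1")
counts, tokens = summarize(buffer.getvalue().splitlines())
```

Because each event is one line, a trace can be summarized by streaming the file line by line rather than loading it whole, which suits long-running agents.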
diff --git a/sentience/agent.py b/sentience/agent.py
@@ -5,7 +5,7 @@
 
 import re
 import time
-from typing import Any, Dict, List, Optional, Union
+from typing import TYPE_CHECKING, Any, Dict, List, Optional, Union
 
 from .actions import click, press, type_text
 from .base_agent import BaseAgent
@@ -23,6 +23,10 @@
 )
 from .snapshot import snapshot
 
+if TYPE_CHECKING:
+    from .agent_config import AgentConfig
+    from .tracing import Tracer
+
 
 class SentienceAgent(BaseAgent):
     """
@@ -54,6 +58,8 @@ def __init__(
         llm: LLMProvider,
         default_snapshot_limit: int = 50,
         verbose: bool = True,
+        tracer: Optional["Tracer"] = None,
+        config: Optional["AgentConfig"] = None,
     ):
         """
         Initialize Sentience Agent
@@ -63,11 +69,15 @@ def __init__(
             llm: LLM provider (OpenAIProvider, AnthropicProvider, etc.)
             default_snapshot_limit: Default maximum elements to include in context (default: 50)
             verbose: Print execution logs (default: True)
+            tracer: Optional Tracer instance for execution tracking (default: None)
+            config: Optional AgentConfig for advanced configuration (default: None)
         """
         self.browser = browser
         self.llm = llm
         self.default_snapshot_limit = default_snapshot_limit
         self.verbose = verbose
+        self.tracer = tracer
+        self.config = config
 
         # Execution history
         self.history: list[dict[str, Any]] = []
@@ -80,6 +90,9 @@ def __init__(
             "by_action": [],
         }
 
+        # Step counter for tracing
+        self._step_count = 0
+
     def act(
         self, goal: str, max_retries: int = 2, snapshot_options: SnapshotOptions | None = None
     ) -> AgentActionResult:
@@ -107,6 +120,21 @@ def act(
             print(f"🤖 Agent Goal: {goal}")
             print(f"{'='*70}")
 
+        # Generate step ID for tracing
+        self._step_count += 1
+        step_id = f"step-{self._step_count}"
+
+        # Emit step_start trace event if tracer is enabled
+        if self.tracer:
+            pre_url = self.browser.page.url if self.browser.page else None
+            self.tracer.emit_step_start(
+                step_id=step_id,
+                step_index=self._step_count,
+                goal=goal,
+                attempt=0,
+                pre_url=pre_url,
+            )
+
         for attempt in range(max_retries + 1):
             try:
                 # 1. OBSERVE: Get refined semantic snapshot
@@ -135,6 +163,18 @@ def act(
                 if snap.status != "success":
                     raise RuntimeError(f"Snapshot failed: {snap.error}")
 
+                # Emit snapshot trace event if tracer is enabled
+                if self.tracer:
+                    self.tracer.emit(
+                        "snapshot",
+                        {
+                            "url": snap.url,
+                            "element_count": len(snap.elements),
+                            "timestamp": snap.timestamp,
+                        },
+                        step_id=step_id,
+                    )
+
                 # Apply element filtering based on goal
                 filtered_elements = self.filter_elements(snap, goal)
 
@@ -156,6 +196,19 @@ def act(
                 # 3. THINK: Query LLM for next action
                 llm_response = self._query_llm(context, goal)
 
+                # Emit LLM query trace event if tracer is enabled
+                if self.tracer:
+                    self.tracer.emit(
+                        "llm_query",
+                        {
+                            "prompt_tokens": llm_response.prompt_tokens,
+                            "completion_tokens": llm_response.completion_tokens,
+                            "model": llm_response.model,
+                            "response": llm_response.content[:200],  # Truncate for brevity
+                        },
+                        step_id=step_id,
+                    )
+
                 if self.verbose:
                     print(f"🧠 LLM Decision: {llm_response.content}")
 
@@ -186,6 +239,22 @@ def act(
                     message=result_dict.get("message"),
                 )
 
+                # Emit action execution trace event if tracer is enabled
+                if self.tracer:
+                    post_url = self.browser.page.url if self.browser.page else None
+                    self.tracer.emit(
+                        "action",
+                        {
+                            "action": result.action,
+                            "element_id": result.element_id,
+                            "success": result.success,
+                            "outcome": result.outcome,
+                            "duration_ms": duration_ms,
+                            "post_url": post_url,
+                        },
+                        step_id=step_id,
+                    )
+
                 # 5. RECORD: Track history
                 self.history.append(
                     {
@@ -202,9 +271,25 @@ def act(
                     status = "✅" if result.success else "❌"
                     print(f"{status} Completed in {duration_ms}ms")
 
+                # Emit step completion trace event if tracer is enabled
+                if self.tracer:
+                    self.tracer.emit(
+                        "step_end",
+                        {
+                            "success": result.success,
+                            "duration_ms": duration_ms,
+                            "action": result.action,
+                        },
+                        step_id=step_id,
+                    )
+
                 return result
 
             except Exception as e:
+                # Emit error trace event if tracer is enabled
+                if self.tracer:
+                    self.tracer.emit_error(step_id=step_id, error=str(e), attempt=attempt)
+
                 if attempt < max_retries:
                     if self.verbose:
                         print(f"⚠️  Retry {attempt + 1}/{max_retries}: {e}")