@rcholic rcholic commented Jan 9, 2026

Snapshot Elements Are Formatted For LLM Agent Reasoning

| Element ID | Role | Name | Importance | doc_y | ordinal | dominant group | href |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 783 | textbox | | 1014 | 6 | - | 0 | |
| 16 | link | Hacker News | 307 | 0 | 7 | 1 | ycombinator |
| 454 | link | Show HN: InfiniteGPU, An op... | 192 | 3 | 230 | 1 | github |
| 550 | link | Show HN: ElixirBrowser – An... | 189 | 4 | 282 | 1 | github |
| 289 | link | Show HN: Miditui – A termin... | 188 | 2 | 141 | 1 | github |
| 313 | link | Show HN: EuConform – Offlin... | 184 | 2 | 154 | 1 | github |
| 622 | link | Show HN: Roleplay-first cha... | 184 | 5 | 320 | 1 | abliteration |
| 526 | link | Show HN: I built a Postgres... | 183 | 4 | 269 | 1 | postgresgui |
| 646 | link | Show HN: I vibecoded an ARM... | 183 | 5 | 333 | 1 | github |
| 742 | link | Show HN: SMTP Tunnel – A SO... | 181 | 6 | 385 | 1 | github |
| 430 | link | Show HN: Repogen – a static... | 178 | 3 | 217 | 1 | github |
| 406 | link | Show HN: macOS menu bar app... | 177 | 3 | 205 | 1 | github |
| 217 | link | Show HN: Similarity = cosin... | 177 | 2 | 104 | 1 | github |
| 478 | link | Show HN : A game to documen... | 175 | 4 | 243 | 1 | jeevan |
| 241 | link | Show HN: I made a memory ga... | 175 | 2 | 116 | 1 | specr |
| 718 | link | Show HN: A geofence-based s... | 175 | 5 | 372 | 1 | localvideoapp |
| 169 | link | Show HN: Why Nvidia's PhysX... | 175 | 1 | 78 | 1 | github |
| 598 | link | Show HN: I visualized the e... | 174 | 4 | 308 | 1 | bikemap |
| 502 | link | Show HN: I built a tool to ... | 174 | 4 | 256 | 1 | tryflux |
| 49* (top) | link | Show HN: 15 Years of StarCr... | 173 | 0 | 15 | 1 | migdal |
| 121 | link | Show HN: Agent-of-empires: ... | 173 | 1 | 53 | 1 | github |
| 670 | link | Show HN: I built a "Do not ... | 168 | 5 | 346 | 1 | apoorv |
| 574 | link | Show HN: DeepDream for Vide... | 168 | 4 | 295 | 1 | github |
| 73 | link | Show HN: Yuanzai World – LL... | 165 | 1 | 28 | 1 | yuanzai |
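
As a rough illustration of how ranked snapshot rows like the ones above could be produced for an LLM prompt, here is a minimal Python sketch. The `Element` dataclass, the `shorten` and `format_snapshot` helpers, the pipe-delimited layout, and the 27-character name cutoff (chosen because the names above appear to be cut at that length) are all assumptions made for this example. It is not the SentienceAPI SDK's actual API.

```python
from dataclasses import dataclass


@dataclass
class Element:
    # Hypothetical record; the fields mirror some of the snapshot columns above.
    element_id: int
    role: str
    name: str
    importance: int
    href: str = ""


def shorten(text: str, limit: int = 27) -> str:
    """Cut long names and append '...' so each row stays small."""
    return text if len(text) <= limit else text[:limit] + "..."


def format_snapshot(elements: list[Element], top_n: int = 50) -> str:
    """Render up to top_n elements, highest importance first, one per line."""
    ranked = sorted(elements, key=lambda e: e.importance, reverse=True)[:top_n]
    lines = ["ID|role|name|importance|href"]
    for e in ranked:
        lines.append(
            f"{e.element_id}|{e.role}|{shorten(e.name)}|{e.importance}|{e.href}"
        )
    return "\n".join(lines)


elements = [
    Element(16, "link", "Hacker News", 307, "ycombinator"),
    Element(783, "textbox", "", 1014),
    Element(
        49,
        "link",
        "Show HN: 15 Years of StarCraft II Balance Changes Visualized Interactively",
        173,
        "migdal",
    ),
]
snapshot_text = format_snapshot(elements)
print(snapshot_text)
```

Sorting by importance before cutting to `top_n` means that whatever budget the prompt allows is spent on the elements the ranking considers most relevant, rather than on whatever happens to come first in the DOM.
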
INFO     [state_injector] ✅ Sentience snapshot: 50 elements, URL: https://news.ycombinator.com/show
INFO     [Agent] 🧠 Sentience: Injected 50 semantic elements (1557 chars) into LLM context
INFO     [service] ✅ Vision is DISABLED: use_vision=False, screenshots=0, sample_images=0, read_state_images=0
INFO     [service] ✅ Sentience detected - reducing DOM size to 5000 chars
INFO     [prompts] 📊 DOM state: 10412 chars (~2603 tokens) truncated to 5000 chars (~1250 tokens)
INFO     [prompts] 📊 DOM state: 10412 chars (~2603 tokens) truncated to 5000 chars (~1250 tokens)
INFO     [prompts] 📊 Token breakdown (chars): agent_history=680 (~170 tokens), browser_state=5414 (~1353 tokens), agent_state=228 (~57 tokens), read_state=0 (~0 tokens), total=6418 (~1604 tokens)
INFO     [prompts] ✅ AgentMessagePrompt: Vision DISABLED - use_vision=False, screenshots=0, sample_images=0, read_state_images=0
INFO     [Agent]   🧠 Memory: I have successfully navigated to the "Show HN" page. The goal is to find the number 1 post. Looking at the browser state, the post ranked 1 is clearly visible. The title is "Show HN: 15 Years of StarCraft II Balance Changes Visualized Interactively". I will extract this title and use the 'done' action to complete the task.
INFO     [Agent]   ▶️   done: text: The number 1 post on Show HN is: Show HN: 15 Years of StarCraft II Balance Changes Visualized Interactively, success: True

📄 Final Result:
The number 1 post on Show HN is: Show HN: 15 Years of StarCraft II Balance Changes Visualized Interactively

INFO     [Agent] ✅ Task completed successfully
INFO     [BrowserSession] 📢 on_BrowserStopEvent - Calling reset() (force=True, keep_alive=None)
INFO     [BrowserSession] [SessionManager] Cleared all owned data (targets, sessions, mappings)
INFO     [BrowserSession] ✅ Browser session reset complete
INFO     [BrowserSession] ✅ Browser session reset complete
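
The character and token figures in the log above are consistent with a simple heuristic of about 4 characters per token, rounded down (10,412 chars → 2,603 tokens; 5,000 chars → 1,250 tokens; 5,414 chars → 1,353 tokens). A minimal sketch of that estimate and of a hard character cap follows. The names `estimate_tokens` and `truncate_dom` are hypothetical and the 4:1 ratio is inferred from the log output, so this should not be read as browser-use's actual implementation or as a real tokenizer.

```python
CHARS_PER_TOKEN = 4  # assumption inferred from the log figures above


def estimate_tokens(num_chars: int) -> int:
    """Approximate a token count from a character count (rounded down)."""
    return num_chars // CHARS_PER_TOKEN


def truncate_dom(dom_text: str, limit: int = 5000) -> str:
    """Keep at most `limit` characters of the serialized DOM state."""
    return dom_text[:limit]


# Stand-in for a 10,412-character DOM state like the one reported in the log.
dom_text = "x" * 10412
truncated = truncate_dom(dom_text)

saved = estimate_tokens(len(dom_text)) - estimate_tokens(len(truncated))
print(estimate_tokens(len(dom_text)), estimate_tokens(len(truncated)), saved)
# prints: 2603 1250 1353
```
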

Token Usage Comparison:

Before (i.e. Use Vision Model & Full DOM)

TL;DR

DOM limit: 40,000 chars
DOM state: 10,195 chars (~2,548 tokens), not truncated
Total per step: ~12,667 chars (~3,166 tokens)

Overall Token Usage Summary

| Metric | Value |
| --- | --- |
| Total Prompt Tokens | 35,740 |
| Total Prompt Cost | $0.006997 |
| Total Prompt Cached Tokens | 839 |
| Total Prompt Cached Cost | $0.0000168 |
| Total Completion Tokens | 1,311 |
| Total Completion Cost | $0.002622 |
| Total Tokens | 37,051 |
| Total Cost | $0.009636 |
| Entry Count | 7 |

Per-Model Breakdown (bu-1-0)

| Metric | Value |
| --- | --- |
| Model | bu-1-0 |
| Prompt Tokens | 35,740 |
| Completion Tokens | 1,311 |
| Total Tokens | 37,051 |
| Cost | $0.009619 |
| Invocations | 7 |
| Average Tokens per Invocation | 5,293.0 |

Cost Breakdown

| Category | Tokens | Cost |
| --- | --- | --- |
| Prompt (uncached) | 34,901 | $0.006997 |
| Prompt (cached) | 839 | $0.0000168 |
| Completion | 1,311 | $0.002622 |
| Total | 37,051 | $0.009636 |

After (i.e. Use SentienceAPI SDK with ranked DOM)

TL;DR

DOM limit: 5,000 chars
DOM state: 5,000 chars (~1,250 tokens), truncated from 10,412 chars (~2,603 tokens), saving ~1,353 tokens
Total per step: ~6,418 chars (~1,604 tokens), about a 49% reduction from ~12,667 chars

Overall Token Usage Summary

| Metric | Value |
| --- | --- |
| Total Prompt Tokens | 13,251 |
| Total Prompt Cost | $0.0025026 |
| Total Prompt Cached Tokens | 820 |
| Total Prompt Cached Cost | $0.0000164 |
| Total Completion Tokens | 892 |
| Total Completion Cost | $0.001784 |
| Total Tokens | 14,143 |
| Total Cost | $0.004303 |
| Entry Count | 4 |

Per-Model Breakdown (bu-1-0)

| Metric | Value |
| --- | --- |
| Model | bu-1-0 |
| Prompt Tokens | 13,251 |
| Completion Tokens | 892 |
| Total Tokens | 14,143 |
| Cost | $0.004287 |
| Invocations | 4 |
| Average Tokens per Invocation | 3,535.75 |

Cost Breakdown

| Category | Tokens | Cost |
| --- | --- | --- |
| Prompt (uncached) | 12,431 | $0.0025026 |
| Prompt (cached) | 820 | $0.0000164 |
| Completion | 892 | $0.001784 |
| Total | 14,143 | $0.004303 |
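
To make the before/after comparison explicit, the two runs can be checked and compared with a few lines of arithmetic. The sketch below uses only the figures reported in the summaries above. The dictionary layout and the helper names `totals` and `reduction_pct` are illustrative choices for this example, and `Decimal` is used so the small cost values add up without floating-point noise.

```python
from decimal import Decimal

# Figures copied from the "Before" and "After" summaries above.
before = {
    "prompt": 35740, "completion": 1311, "invocations": 7,
    "costs": [Decimal("0.006997"), Decimal("0.0000168"), Decimal("0.002622")],
}
after = {
    "prompt": 13251, "completion": 892, "invocations": 4,
    "costs": [Decimal("0.0025026"), Decimal("0.0000164"), Decimal("0.001784")],
}


def totals(run):
    """Return (total tokens, total cost) for one run."""
    tokens = run["prompt"] + run["completion"]
    cost = sum(run["costs"], Decimal("0"))
    return tokens, cost


def reduction_pct(old, new):
    """Percentage by which `new` is smaller than `old`."""
    return float((1 - Decimal(new) / Decimal(old)) * 100)


tokens_before, cost_before = totals(before)
tokens_after, cost_after = totals(after)

print(f"tokens: {tokens_before} -> {tokens_after} "
      f"({reduction_pct(tokens_before, tokens_after):.1f}% fewer)")
print(f"cost:   ${cost_before} -> ${cost_after} "
      f"({reduction_pct(cost_before, cost_after):.1f}% less)")
print(f"avg tokens per invocation: "
      f"{tokens_before / before['invocations']:.2f} -> "
      f"{tokens_after / after['invocations']:.2f}")
```

Part of the overall saving comes from the smaller prompt per step and part from the run finishing in 4 invocations instead of 7, so the whole-run reduction is larger than the per-invocation one.
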

Overlay During Agent Runtime

[Screenshot: overlay during agent runtime, captured 2026-01-08 at 8:52:03 PM]

Improvements with SentienceAPI SDK

  1. Vision disabled — no screenshots/images sent to LLM
  2. DOM truncation working — 10,412 chars → 5,000 chars (saves ~1,353 tokens per step)
  3. Task completed successfully — agent found the correct top post
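  4. Per-step prompt size is roughly halved (~12,667 → ~6,418 chars), and the agent still returned the correct top post on this task. Over the whole run, total tokens fell from 37,051 to 14,143 (about 62% fewer) and cost fell from $0.009636 to $0.004303 (about 55% less), with 4 model invocations instead of 7. The agent uses Sentience semantic geometry instead of the full DOM plus screenshots, which is more efficient and cost-effective.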

@rcholic rcholic changed the title initial sentienceapi experiment SentienceAPI Snapshot Experiment with browser-use Agent Jan 9, 2026