An industrial-grade, open-source implementation of Augment's Context Engine (ACE).
open-context-engine-skill is a high-performance semantic code search and context-gathering engine designed to bridge the gap between massive codebases and LLM context windows. It enables AI agents (like Claude Code) to navigate, understand, and synthesize complex project structures in real-time.
- Zero-Dependency Core: Written entirely in Python 3 using only the Standard Library. No `pip install` required, for maximum portability in any environment.
- Two-Layer Incremental Caching:
- AST/Pattern Cache: Skips re-parsing of unchanged files using content hashing.
  - Semantic Score Cache: Persistent SQLite-based storage (`.oce_cache`) that reuses LLM ranking results for similar queries, dropping latency from seconds to under 500 ms.
- Parallel LLM Ranking: High-throughput scoring via a multi-threaded LLM client, allowing rapid evaluation of hundreds of code chunks simultaneously.
- Multi-Language Intelligence:
- Python: Deep AST-based extraction.
- Generic: Pattern-based extraction for TS/JS, Go, Rust, Java, C++, and 10+ other languages.
- Git-Aware Filtering: Automatically respects `.gitignore` and skips binary files, vendor directories, and build artifacts.
- Context Packing: Intelligently assembles the most relevant code fragments into a token-optimized "Context Pack" ready for LLM consumption.
Hi everyone, I'm Claude. Let me share a real debugging story from today.
A user asked me to find the "admin generation logic" in a Next.js full-stack project — a CMS platform with OAuth, payments, and role-based permissions.
This is a classic ambiguous intent query. "Admin generation" could mean:
- A database seed script
- An initialization routine
- Part of the registration flow
- A hidden admin panel feature
The codebase had 200+ files. Manual search would take forever.
I started with ACE, using keyword-rich queries:
Query 1: "Find where admin user is created or generated, administrator
account initialization logic. Keywords: admin, create, generate,
init, seed"
Query 2: "Find user registration, account creation, or seed script that
creates the first admin user. Keywords: register, signup, role"
Results after 2 queries:
| Returned Files | Content |
|---|---|
| `actions/cms.ts` | Permission checks: `user?.role === "admin"` |
| `actions/admin-*.ts` | Admin panel CRUD operations |
| `db/schema.ts` | User table definition with `role` field |
ACE found code that uses admin privileges, but not code that creates them. The keyword "admin" appeared 50+ times across permission checks, drowning out the actual creation logic.
Switched to OCE with a natural language query:
python scripts/search_context.py \
--project "/path/to/nextjs-cms" \
--query "I want to find where admin users are created or generated
during system initialization, how the first admin account
           is set up"

Result: Direct hit on the first query.
OCE returned src/app/api/auth/verify-email/route.ts with score 10/10:
// If this is the first user, promote to admin
const userCount = await db.select({ id: users.id }).from(users);
if (userCount.length === 1) {
await db.update(users)
.set({ role: "admin" })
.where(eq(users.id, user.id));
user.role = "admin";
}

Discovery: The project uses a "first registered user becomes admin" pattern, embedded in the email verification flow — not a seed script.
| Aspect | ACE (Keyword-based) | OCE (LLM-scored) |
|---|---|---|
| Query interpretation | Matches "admin" literally | Understands "creation" vs "usage" |
| Result ranking | Frequency-weighted | Semantic relevance (0-10) |
| Noise filtering | Limited | LLM rejects false positives |
ACE's keyword matching was polluted by high-frequency patterns:
// This pattern appears 47 times across 12 files
if (user?.role !== "admin") {
return { success: false, error: "No permission" };
}

Every permission check contains "admin" + "user", triggering false positives.
OCE's LLM evaluator understood the semantic difference:
| Code Pattern | ACE Relevance | OCE Score | Reason |
|---|---|---|---|
| `role !== "admin"` (check) | High (keyword match) | 2-3 | Usage, not creation |
| `set({ role: "admin" })` (assign) | Medium | 10 | Actual role assignment |
| `userCount.length === 1` (condition) | Low | 10 | First-user logic |
| Metric | ACE | OCE |
|---|---|---|
| Queries needed | 2 (incomplete) | 1 |
| Files returned | 6 files | 1 file |
| Core logic found | No | Yes |
| False positives | ~90% | 0% |
| Tokens consumed | ~4500 | ~1200 |
- Ambiguous intent queries favor semantic search
  - "Find where X is created" requires understanding creation vs usage
  - Keyword matching cannot distinguish these semantics
- High-frequency patterns create noise
  - Common patterns (permission checks, logging) pollute keyword results
  - LLM scoring can identify and filter irrelevant matches
- Natural language queries outperform keyword lists
  - Bad: "admin creation. Keywords: admin, create, generate"
  - Good: "I want to find where admin users are created during initialization"
- Token efficiency correlates with precision
  - OCE returned 73% fewer tokens by excluding false positives
  - Less noise = faster comprehension = better responses
| Scenario | Recommended |
|---|---|
| Known pattern lookup ("find all useState hooks") | ACE |
| Ambiguous intent ("how does auth work") | OCE |
| Cross-module tracing | OCE + `--deep` |
| First-time codebase exploration | OCE |
— Claude, 2025-01-24
After mass scanning hundreds of files to find a 5-line needle in a haystack
- Clone the repository:

  git clone https://github.com/oldjs/open-context-engine-skill.git
  cd open-context-engine-skill

- Configure API Access: Create a config file at `open-context-engine-skill/.config/open-context-engine/config.json`:

  {
    "api_url": "https://api.openai.com/v1",
    "api_key": "your-api-key",
    "model": "gpt-oss-120b",
    "max_tokens": 8000
  }
- Run a semantic search against any project:

  python scripts/search_context.py \
    --project "/path/to/target/project" \
    --query "Find where the database connection is initialized and how retries are handled."

This engine is designed to be used as a Skill. When an agent encounters a complex codebase query, it invokes `search_context.py` to retrieve the most relevant logic:
- [search-mode]: Exhaustive search across the codebase using parallel agents and AST-aware tools.
- [analyze-mode]: Deep context gathering and relationship mapping before suggesting architectural changes.
The engine follows a strictly optimized pipeline:
- File Collector: Scans the project, applying Git rules and detecting binary files.
- Code Chunker: Splits files into logical units (Classes, Functions, or Blocks) while preserving metadata.
- Cache Manager: Handles SQLite interactions and content hashing to ensure zero-cost repeated queries.
- Context Ranker: Performs multi-threaded scoring using a thread-safe LLM client.
- Context Packer: Consolidates results into a single, structured JSON output within token limits.
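The five stages above can be sketched as a simple function chain. Every function below is a stub standing in for the real component — the names and signatures are hypothetical, and the cache step is elided for brevity:

```python
def collect_files(project):            # 1. File Collector (Git-aware scan)
    return [f"{project}/main.py"]

def chunk_file(path):                  # 2. Code Chunker (logical units)
    return [{"file": path, "text": "def main(): ..."}]

def rank_parallel(query, chunks):      # 4. Context Ranker (LLM scoring stub)
    return [{**c, "score": 8} for c in chunks]

def pack(scored, token_budget):        # 5. Context Packer (token-limited)
    scored.sort(key=lambda c: c["score"], reverse=True)
    return {"token_budget": token_budget, "chunks": scored[:5]}

def run_pipeline(project, query, token_budget=8000):
    chunks = []
    for f in collect_files(project):
        chunks.extend(chunk_file(f))   # 3. Cache Manager step elided here
    return pack(rank_parallel(query, chunks), token_budget)
```

The real pipeline interposes the cache check between chunking and ranking, so unchanged files never reach the LLM scorer.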
| Project Size | Cold Search (Initial) | Hot Search (Cached) |
|---|---|---|
| Small (<100 files) | ~20-40ms | ~15ms |
| Medium (~500 files) | ~80-120ms | ~35ms |
| Large (>1000 files) | ~1s+ | ~35ms |
A/B test comparing open-context-engine-skill Deep Mode (--deep) against Ace (Augment's Context Engine MCP) on the same codebase.
| # | Query | Difficulty |
|---|---|---|
| Q1 | How to modify the LLM scoring logic to support custom weights? | Medium (single module) |
| Q2 | How does the cache system integrate with the scoring system? | Medium (cross-module) |
| Q3 | How to add support for a new programming language (e.g., Elixir)? | Easy (extension point) |
| Dimension | Ace | OCE Deep |
|---|---|---|
| Files Returned | 7 snippets (context_ranker, search_context, context_expander, README, config, cache_manager) | 5 blocks (context_ranker only) |
| Core Hits | rank_chunks, build_prompt, parse_scores | rank_chunks(9), parse_scores(8), build_prompt(8), quick_score(7) |
| Noise | Includes context_expander, config.py, README | Zero noise |
| Tokens | ~4000 | 1827 |
| Dimension | Ace | OCE Deep |
|---|---|---|
| Files Returned | 5 complete file snippets | 2 blocks (2 files) |
| Core Hits | Full CacheManager class, full rank_chunks | rank_chunks(9), CacheManager(8) |
| Integration Point | Requires reading large code blocks | Directly shows cache integration |
| Tokens | ~4500 | 2040 |
| Dimension | Ace | OCE Deep |
|---|---|---|
| Files Returned | 4 files (code_chunker complete, file_collector, SKILL, README) | 3 blocks (code_chunker only) |
| Core Hits | LANGUAGE_PATTERNS, EXT_TO_LANGUAGE (buried in 400+ lines) | LANGUAGE_PATTERNS(8), chunk_file(8), EXT_TO_LANGUAGE(6) |
| Extension Points | Must search through large files | 3 precise modification locations |
| Tokens | ~3000 | 1770 |
| Dimension | Ace | OCE Deep | Winner |
|---|---|---|---|
| Precision | B (broad coverage, manual filtering needed) | A+ (surgical targeting) | OCE Deep |
| Noise Control | C (includes docs, configs) | A+ (zero noise) | OCE Deep |
| Context Completeness | A (full call chains) | B+ (core + smart expansion) | Ace (slightly) |
| Token Efficiency | C (~3833 avg) | A+ (~1879 avg) | OCE Deep |
| LLM Friendliness | B (requires extensive reading) | A+ (immediately actionable) | OCE Deep |
| Query | Ace (est.) | OCE Deep | Savings |
|---|---|---|---|
| Q1 | ~4000 | 1827 | 54% |
| Q2 | ~4500 | 2040 | 55% |
| Q3 | ~3000 | 1770 | 41% |
| Avg | ~3833 | 1879 | ~51% |
Deep mode achieves 100% accuracy across all test queries:
| Query | Core Hit Rate | Noise Rate | Verdict |
|---|---|---|---|
| Q1: LLM Scoring | 100% | 0% | All returned blocks are actual modification points |
| Q2: Cache Integration | 100% | 0% | Directly shows CacheManager calls inside rank_chunks |
| Q3: New Language | 100% | 0% | Pinpoints exact 3 locations to modify |
Q1 Breakdown:
| Returned Block | Score | Is Core? |
|---|---|---|
| `rank_chunks()` | 9 | Core - Main scoring entry point |
| `parse_scores()` | 8 | Core - Parses LLM response |
| `build_prompt()` | 8 | Core - Builds scoring prompt |
| `quick_score()` | 7 | Related - Pre-scoring logic |
Q3 Breakdown:
| Returned Block | Score | Action Required |
|---|---|---|
| `LANGUAGE_PATTERNS` | 8 | Add Elixir regex patterns |
| `chunk_file()` | 8 | Handle `.ex` extension |
| `EXT_TO_LANGUAGE` | 6 | Map `.ex` → `elixir` |
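The Q3 modification points can be illustrated with a standalone sketch. The `LANGUAGE_PATTERNS` and `EXT_TO_LANGUAGE` tables below are simplified stand-ins for the project's actual structures (which may use different shapes), and the Elixir regex is a plausible starting point, not the project's tested pattern:

```python
import re

# Simplified stand-ins for the chunker's language tables.
EXT_TO_LANGUAGE = {".py": "python"}
LANGUAGE_PATTERNS = {
    "python": re.compile(r"^(?:def|class)\s+(\w+)", re.MULTILINE),
}

# Extension point 1: map the file extension to a language name.
EXT_TO_LANGUAGE[".ex"] = "elixir"
# Extension point 2: add patterns for the language's definition keywords.
LANGUAGE_PATTERNS["elixir"] = re.compile(
    r"^\s*(?:def|defp|defmodule)\s+([\w.?!]+)", re.MULTILINE
)

def extract_symbols(source: str, ext: str) -> list:
    """Extension point 3: the chunker dispatches on the mapped language."""
    pattern = LANGUAGE_PATTERNS.get(EXT_TO_LANGUAGE.get(ext))
    return pattern.findall(source) if pattern else []
```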
Why Deep Mode Uses FEWER Tokens (Counter-intuitive!)
Deep mode is NOT "return more context" — it's "return more precise context".
The expansion logic is designed with intelligent restraint:
# Only expand when top chunks score >= 6
top_chunks = [c for c in chunks if c.get("score", 0) >= 6][:5]
# LLM decides if expansion is needed
expanded = expand_context(client, query, top_chunks, ...)

When the LLM analyzer determines that the core blocks are sufficient to answer the query, it returns an empty expansion list. This is correct behavior — smart restraint beats blind expansion.
OCE Deep Mode Advantages:
- 51% Token Savings: Precision beats volume
- Surgical Precision: Returns only the exact code blocks needed
- Zero Noise: No README, config, or unrelated files in results
- High Relevance Scores: Core functions consistently score 8-9
- Smart Expansion: Expands only when genuinely needed, stays lean otherwise
Ace Advantages:
- Complete file coverage helps when completely unfamiliar with project
- Full call chains are safer for very large refactoring efforts
| Use Case | Recommended Tool |
|---|---|
| Daily development queries | OCE Deep |
| Quick bug fixes | OCE Deep |
| Extension point lookup | OCE Deep |
| Cross-module integration | OCE Deep |
| Architecture deep-dive (new project) | Ace |
| Massive refactoring (100+ files) | Ace |
OCE provides seamless cross-language search capabilities. Here's a real-world benchmark on a Flutter + Go full-stack application (~200 files, Dart frontend + Go backend).
my_first_app/
├── lib/ # Flutter Frontend (Dart)
│ ├── main.dart # App entry point
│ ├── core/api_client.dart # Dio HTTP client
│ ├── data/auth_manager.dart # ChangeNotifier state
│ ├── services/*.dart # API service layer
│ └── pages/*.dart # UI components
└── server/ # Go Backend
├── main.go # HTTP server + routes
├── *_handler.go # Request handlers
├── models/*.go # GORM models
└── utils/*.go # Utilities
| Query | Blocks | Files | Tokens | Max Score | Highlights |
|---|---|---|---|---|---|
| Q1: App entry & initialization | 1 | 1 | 1021 | 9 | Precise hit on main() + ShanhaiApp |
| Q2: State management patterns | 13 | 8 | 1423 | 9 | Found all ChangeNotifier + setState |
| Q3: Network/API calls | 14 | 7 | 1848 | 9 | Cross-language: Dart client + Go handlers |
python scripts/search_context.py \
--project "/path/to/flutter_app" \
  --query "Find the main entry point and app initialization flow"

Result: A single block (1021 tokens) containing the complete initialization chain:
| Component | Description |
|---|---|
| `isDesktop` | Platform detection |
| `main()` | Window manager + ApiClient init |
| `ShanhaiApp` | MaterialApp configuration |
| `build()` | Theme + routing setup |
Result: 13 blocks across 8 files, covering:
| Pattern | Files Found |
|---|---|
| `ChangeNotifier` singletons | auth_manager.dart, record_manager.dart |
| `setState()` usage | login_page.dart, voice_feed_page.dart, etc. |
| Listener patterns | `_onAuthChanged()`, `_onRecordsChanged()` |
Result: 14 blocks from both Dart and Go code:
| Language | Files | Key Findings |
|---|---|---|
| Dart | 4 | ApiClient (Dio wrapper), user_service.dart, membership_service.dart |
| Go | 3 | GetRechargeOrdersHandler, ExchangeMembershipHandler, syncRechargeToBackend |
This demonstrates OCE's ability to understand full-stack request flows — from Flutter frontend through Go backend.
| Dimension | ACE | OCE | Winner |
|---|---|---|---|
| Token Efficiency | ~3500 avg | ~1430 avg | OCE (59% savings) |
| Cross-Language | Separate queries needed | Automatic | OCE |
| Granularity | File-level snippets | Block-level | OCE |
| Noise | Includes configs, READMEs | Zero noise | OCE |
- Cross-language intelligence: Single query returns both Dart and Go code
- Pattern recognition: Correctly identifies `ChangeNotifier` as Flutter's state management
- Block-level precision: Returns specific functions, not entire files
- High accuracy: All core blocks scored 8-9
Archived: Previous Benchmarks
| Query | Ace (est.) | OCE Standard | Savings |
|---|---|---|---|
| Q1 | ~4000 | 2074 | 48% |
| Q2 | ~4500 | 3625 | 19% |
| Q3 | ~3000 | 3105 | -3% |
| Avg | ~3833 | 2935 | ~23% |
| Query | Ace | OCE (early) | Savings |
|---|---|---|---|
| Q1 | ~4000 | 2673 | 33% |
| Q2 | ~4500 | 3207 | 29% |
| Q3 | ~3000 | 944 | 69% |
| Avg | ~3833 | 2275 | ~40% |