UI5
diff --git a/plugins/ui5-guidelines/.env.example b/plugins/ui5-guidelines/.env.example
Lines changed: 17 additions & 0 deletions
diff --git a/plugins/ui5-guidelines/.gitignore b/plugins/ui5-guidelines/.gitignore
Lines changed: 6 additions & 0 deletions
diff --git a/plugins/ui5-guidelines/README.md b/plugins/ui5-guidelines/README.md
Lines changed: 65 additions & 19 deletions
@@ -0,0 +1,17 @@
+# Integration Test Configuration
+
+# Required for integration tests
+ANTHROPIC_API_KEY=sk-ant-...
+
+# Optional: Override default model (default: claude-sonnet-4-20250514)
+CLAUDE_MODEL=claude-sonnet-4-20250514
+
+# Optional: Override default model for specific providers
+ANTHROPIC_API_MODEL=claude-sonnet-4-20250514
+CLAUDE_CODE_MODEL=
+
+# Optional: Set max cost budget (in USD, default: no limit)
+MAX_COST_BUDGET=1.00
+
+# Optional: Set test timeout (in ms, default: 30000)
+TEST_TIMEOUT=30000
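
Review note: the variables in `.env.example` imply a small amount of parsing and defaulting in the test harness. The following is a minimal sketch of how that might look, assuming the defaults stated in the comments above. `loadTestConfig`, `parseOptionalNumber`, and the `TestConfig` shape are hypothetical names for illustration; the plugin's actual loader is not shown in this diff.

```typescript
// Hypothetical config shape; field names are illustrative, not the plugin's real API.
interface TestConfig {
  apiKey: string;
  model: string;
  maxCostBudgetUsd: number | null; // null means "no limit"
  timeoutMs: number;
}

type Env = Record<string, string | undefined>;

// Returns undefined when the variable is unset or blank, so callers can apply a default.
function parseOptionalNumber(raw: string | undefined, name: string): number | undefined {
  if (raw === undefined || raw.trim() === "") return undefined;
  const value = Number(raw);
  if (!Number.isFinite(value) || value < 0) {
    throw new Error(`${name} must be a non-negative number, got "${raw}"`);
  }
  return value;
}

function loadTestConfig(env: Env): TestConfig {
  const apiKey = env.ANTHROPIC_API_KEY;
  if (!apiKey) {
    throw new Error("ANTHROPIC_API_KEY is required for integration tests");
  }
  return {
    apiKey,
    // Defaults mirror the comments in .env.example.
    model: env.CLAUDE_MODEL || "claude-sonnet-4-20250514",
    maxCostBudgetUsd: parseOptionalNumber(env.MAX_COST_BUDGET, "MAX_COST_BUDGET") ?? null,
    timeoutMs: parseOptionalNumber(env.TEST_TIMEOUT, "TEST_TIMEOUT") ?? 30000,
  };
}
```

In a Node.js test setup this would typically be called as `loadTestConfig(process.env)`. Failing fast on a missing key or a malformed number keeps a misconfigured run from spending API budget before the error surfaces.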
@@ -7,6 +7,12 @@
 *.metrics.json
 test/coverage/
 test/.tmp/
+.benchmarks/
+
+# Environment files
+.env
+.env.local
+.env.*.local
 
 # TypeScript build output
 dist/
 
@@ -6,6 +6,8 @@ Version-aware skill covering modern UI5 coding standards and architectural patte
 
 **Status**: ✅ Production Ready | ✅ Unit tests passing | ⚠️ Integration tests available
 
+**Status**: ✅ Production Ready | ✅ 70 unit tests passing | ⚠️ Integration tests available (11 enhancements pending)
+
 ---
 
 ## Features
@@ -255,47 +257,91 @@ npm run metrics:optimize     # Optimization tips
 
 ## Testing
 
-For contributors and developers: This branch includes a comprehensive test suite.
+The UI5 Guidelines plugin has a **three-level testing approach**. See [TESTING.md](TESTING.md) for complete documentation.
+
+### Test Levels
+
+**Level 1: Unit Tests** (Structure & Performance) ✅  
+- 15 structure tests, 8 performance tests
+- Validates plugin configuration and token budgets
+- Fast, deterministic, no API calls
+
+**Level 2: Proxy Tests** (Triggering Simulation) ⚠️  
+- 47 triggering tests with simulated keyword matching
+- **Important**: These do NOT test real Claude behavior
+- Use for development feedback and keyword coverage
+
+**Level 3: Integration Tests** (Live API) 🔬  
+- 47 test cases per provider (Anthropic API, Claude Code CLI)
+- Tests actual Claude model behavior
+- Multi-provider support with cost tracking
+- **Status**: 6 critical bugs fixed, 11 enhancements pending
 
 ### Quick Test
 
 ```bash
 cd plugins/ui5-guidelines
 npm install
+npm run build
+
+# Run unit tests (Level 1 & 2) - Free, fast
 npm test
+
+# Run integration tests (Level 3) - Requires API key
+export ANTHROPIC_API_KEY="sk-ant-..."
+npm run test:integration:api          # Anthropic API (~$0.40-0.80)
+npm run test:integration:claude       # Claude Code CLI (free)
+npm run test:integration:cross        # Cross-provider consistency
 ```
 
-**Expected output:**
+**Expected output (unit tests):**
 ```
-✅ Structure: 12/12 passing (100%)
-✅ Triggering: 46/46 passing (100%)
-✅ Performance: 7/7 passing
+✅ Structure: 16/16 passing (100%)
+⚠️  Triggering: 46/47 passing (97.8% - simulation only)
+✅ Performance: 7/7 passing (100%)
 ```
 
-### Test Suites
-
-- **Structure Tests:** Validate plugin file organization and completeness
-- **Triggering Tests:** Ensure skills activate correctly (46 test cases)
-- **Performance Tests:** Verify context budget efficiency
-
 ### Run Specific Tests
 
 ```bash
-npm run test:structure      # Structure validation
-npm run test:triggering     # Triggering accuracy
-npm run test:performance    # Context budget
+# Unit tests (fast, no cost)
+npm run test:structure      # Plugin structure validation
+npm run test:triggering     # Keyword coverage (simulation)
+npm run test:performance    # Context budget checks
+
+# Integration tests (slow; the Anthropic API run costs money)
+npm run test:integration           # All providers
+npm run test:integration:api       # Anthropic API only
+npm run test:integration:claude    # Claude Code CLI only
+
+# Watch mode (development)
+npm run test:watch  # Auto-rerun on changes
 ```
 
+### Understanding Test Results
+
+**⚠️ Important**: Proxy test results (97.8%) show keyword coverage, NOT real Claude behavior.
+
+For real-world accuracy, see integration test results:
+- Target: >90% accuracy with real Claude API
+- Cost: ~$0.40-0.80 per full test run
+- Run: on a daily schedule or before releases
+
 ### View Metrics
 
 ```bash
-npm run metrics            # All-time metrics
-npm run metrics:week       # Last 7 days
-npm run metrics:month      # Last 30 days
-npm run metrics:optimize   # Optimization tips
+npm run metrics              # Last 7 days
+npm run metrics:week         # Last 7 days
+npm run metrics:month        # Last 30 days
+npm run metrics:optimize     # Optimization tips
 ```
 
-**See [TESTING.md](TESTING.md) for detailed testing documentation.**
+### Documentation
+
+- **[TESTING.md](TESTING.md)** - Complete testing guide
+- **[TESTING_LIMITATIONS.md](TESTING_LIMITATIONS.md)** - Why proxy tests ≠ real tests
+- **[TESTING_ROADMAP.md](TESTING_ROADMAP.md)** - Future enhancements
+- **[PLAN.md](PLAN.md)** - Testing framework implementation plan
 
 ---
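
Review note: the README's warning that proxy results measure keyword coverage rather than real Claude behavior can be illustrated with a small sketch. Everything below is hypothetical: `simulateTrigger`, `passRate`, the keyword list, and the sample prompts are invented for illustration, and the plugin's real matching logic is not part of this diff.

```typescript
// Hypothetical keyword list; the plugin's real trigger keywords are not shown in this diff.
const KEYWORDS = ["ui5", "sapui5", "openui5"];

interface TriggerCase {
  prompt: string;
  shouldTrigger: boolean;
}

// Proxy check: case-insensitive substring match. No model is involved.
function simulateTrigger(prompt: string, keywords: string[]): boolean {
  const text = prompt.toLowerCase();
  return keywords.some((kw) => text.includes(kw));
}

// Percentage of cases where the simulated outcome matches the expected one.
function passRate(cases: TriggerCase[], keywords: string[]): number {
  if (cases.length === 0) return 0;
  const passed = cases.filter(
    (c) => simulateTrigger(c.prompt, keywords) === c.shouldTrigger
  ).length;
  return (passed / cases.length) * 100;
}

const sampleCases: TriggerCase[] = [
  { prompt: "How do I bind a model in a UI5 controller?", shouldTrigger: true },
  { prompt: "Write a Python script to parse a CSV file", shouldTrigger: false },
  // Arguably relevant, but contains none of the keywords, so the proxy misses it.
  { prompt: "Create a Fiori list report view", shouldTrigger: true },
];

const proxyPassRate = passRate(sampleCases, KEYWORDS);
```

With these three sample cases the proxy gets two right and misses the third, so `proxyPassRate` is about 66.7. The number reflects only how well the keyword list covers the prompts; whether a live model would activate the skill for the same prompts is what the Level 3 integration tests measure.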