Add intelligent audit logging and oversized error logging systems #20
Conversation
src/config/index.js (outdated)

```diff
-const SUPPORTED_MODEL_PROVIDERS = new Set(["databricks", "azure-anthropic", "ollama", "openrouter", "azure-openai", "openai", "llamacpp", "lmstudio", "bedrock", "zai", "vertex"]);
+const SUPPORTED_MODEL_PROVIDERS = new Set(["databricks", "azure-anthropic", "ollama", "openrouter", "azure-openai", "openai", "llamacpp", "lmstudio", "bedrock"]);
```
Can you take the latest pull? We have recently added support for z.ai and vertex.
Same for embedding models
> Can you take the latest pull? We have recently added support for z.ai and vertex

Fixed
All issues resolved.
I added another commit to the PR since it fits perfectly here. With these changes in your repo I'd be up to date and would continue checking, enhancing, and fixing Lynkr.
Logs all LLM requests/responses, including model, hostname, IP, timestamps, and token counts. The first occurrence of a piece of content is logged in full with its SHA256 hash; subsequent occurrences reference the hash. Full content is stored in a separate dictionary file for lookup, keeping log files small.

- LLM_AUDIT_ENABLED defaults to false (opt-in, inactive by default)
- Deduplication enabled by default when logging is active
- Includes reader utility and test suite

New files:
- src/logger/deduplicator.js
- src/logger/audit-logger.js
- scripts/audit-log-reader.js
- scripts/test-deduplication.js

Modified:
- src/config/index.js: LLM_AUDIT_ENABLED defaults to false

Test results: 48% space savings
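As a sketch of the deduplication idea described above (illustrative only; the actual src/logger/deduplicator.js may differ in names and shape):

```js
const crypto = require("crypto");
const fs = require("fs");

// Minimal sketch: the first time a piece of content is seen, it is written
// in full to a JSONL dictionary file keyed by its SHA256 hash; later
// occurrences are logged only as a hash reference. Class and method names
// here are illustrative, not the PR's exact API.
class DedupSketch {
  constructor(dictionaryPath) {
    this.dictionaryPath = dictionaryPath;
    this.seen = new Set();
  }

  storeContent(content) {
    const hash = crypto.createHash("sha256").update(content).digest("hex");
    if (!this.seen.has(hash)) {
      this.seen.add(hash);
      // The dictionary file keeps the full content for later lookup.
      fs.appendFileSync(
        this.dictionaryPath,
        JSON.stringify({ hash, content }) + "\n"
      );
      return { hash, content }; // first occurrence: log in full
    }
    return { hashRef: hash }; // repeat: log only the reference
  }
}

module.exports = { DedupSketch };
```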
Force-pushed from 3bf3362 to 9afe2f3
Implements intelligent content cleaning that removes wasteful empty "User:" entries from LLM audit log content before storage in the deduplication dictionary. Empty "User:" entries appear in conversation logs, serve no purpose, and bloat the log dictionary. The root cause of the empty "User:" entries is still outstanding and should be investigated to reduce tokens wasted in requests.

Changes:
- Add _sanitizeContent() method to ContentDeduplicator that detects and removes empty "User:" entries while preserving non-empty ones
- Integrate sanitization into storeContent() to clean content before hashing
- Add LLM_AUDIT_DEDUP_SANITIZE config option (default: true)
- Pass the sanitize option through the config chain to the deduplicator
- Add comprehensive unit tests for sanitization behavior

Benefits:
- Reduces stored content size by ~20% for affected entries
- Saves 10-15 tokens per empty "User:" entry removed
- Cleaner, more semantically meaningful dictionary content
- Can be disabled via LLM_AUDIT_DEDUP_SANITIZE=false if needed

Tests: All 6 deduplication tests pass, including 2 new sanitization tests
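A minimal sketch of the cleanup, assuming conversation content is stored as newline-separated turns (the real _sanitizeContent() may detect entries differently):

```js
// Drop turns that are exactly "User:" with nothing after the colon,
// while preserving non-empty "User: ..." entries.
function sanitizeContent(content) {
  return content
    .split("\n")
    .filter((line) => line.trim() !== "User:")
    .join("\n");
}

// Example: the empty "User:" turn is removed, the others are kept.
console.log(sanitizeContent("User: hello\nUser:\nAssistant: hi"));
// -> "User: hello\nAssistant: hi"
```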
This commit implements 6 major phases of audit logging improvements:
**Phase 1: Hash-Before-Truncate Strategy**
- Hash original content BEFORE truncation to preserve full content tracking
- Add hashAndTruncate() and hashAndTruncateSystemReminder() functions
- Add storeContentWithHash() method to deduplicator
- Ensures dictionary stores hashes of original (not truncated) content
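A sketch of the hash-before-truncate idea (the function name is taken from the list above; the signature is an assumption):

```js
const crypto = require("crypto");

// The hash is computed over the ORIGINAL content, so the dictionary key
// still identifies the full text even though the log line only carries a
// truncated copy.
function hashAndTruncate(content, maxLength) {
  const hash = crypto.createHash("sha256").update(content).digest("hex");
  const truncated =
    content.length > maxLength
      ? content.slice(0, maxLength) +
        `... [truncated, ${content.length} chars total]`
      : content;
  return { hash, truncated };
}
```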
**Phase 2: Session-Level Hash Tracking**
- Add sessionContentCache to track hashes seen in current session
- First occurrence in session logs full (truncated) content
- Subsequent occurrences log only hash references
- Add isFirstTimeInSession() and markSeenInSession() methods
- Significantly reduces log file size within a session
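A sketch of the session cache (method names mirror the list above; storage details are assumptions):

```js
// Hashes already seen in the current session; repeats within the session
// are logged as references only.
const sessionContentCache = new Set();

function isFirstTimeInSession(hash) {
  return !sessionContentCache.has(hash);
}

function markSeenInSession(hash) {
  sessionContentCache.add(hash);
}

function buildLogEntry(hash, truncatedContent) {
  if (isFirstTimeInSession(hash)) {
    markSeenInSession(hash);
    return { hash, content: truncatedContent }; // first occurrence in session
  }
  return { hashRef: hash }; // subsequent occurrence
}
```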
**Phase 3: Hash Annotation Lines**
- Add logHashAnnotation() function to output annotation lines
- Annotations include hash values and lookup instructions
- Format: {"_annotation": true, "systemPromptHash": "...", "lookup": "..."}
- Makes it easy to find and decode hash references
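For illustration, a sketch of an annotation emitter in the format shown above (the lookup hint text is an assumption):

```js
// Emits a JSONL annotation line that pairs a hash with a hint on how to
// resolve it against the dictionary file.
function logHashAnnotation(systemPromptHash, dictionaryPath) {
  console.log(
    JSON.stringify({
      _annotation: true,
      systemPromptHash,
      lookup: `grep '${systemPromptHash}' ${dictionaryPath}`,
    })
  );
}
```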
**Phase 4: Aggressive Truncation Limits**
- Change maxContentLength from single value to object with type-specific limits
- systemPrompt: 2000 chars (down from 5000)
- userMessages: 3000 chars
- response: 3000 chars
- Add environment variables: LLM_AUDIT_MAX_SYSTEM_LENGTH, etc.
- Expected 60-70% log size reduction
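A sketch of the resulting limits object, using the env var names and defaults above (the surrounding config shape is assumed):

```js
// Type-specific truncation limits, overridable via environment variables.
const maxContentLength = {
  systemPrompt: parseInt(process.env.LLM_AUDIT_MAX_SYSTEM_LENGTH || "2000", 10),
  userMessages: parseInt(process.env.LLM_AUDIT_MAX_USER_LENGTH || "3000", 10),
  response: parseInt(process.env.LLM_AUDIT_MAX_RESPONSE_LENGTH || "3000", 10),
};
```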
**Phase 5: Enhanced Loop Detection Logging**
- Add more structured logging when loop detected (count === 3)
- At termination (count > 3), log FULL context for debugging:
- myPrompt: Full conversation sent to LLM
- systemPrompt: Full system prompt
- llmResponse: Full LLM response
- repeatedToolCalls: The actual repeated tool calls
- toolCallHistory: Full history of all tool calls
- Add correlationId, action, totalSteps metadata
- Critical for debugging why loops occur
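A sketch of the two-stage loop logging (field names follow the list above; the logger and context object are assumptions):

```js
// Structured warning when the repeat count reaches 3; full-context error
// at termination (count > 3) so the loop can be analyzed afterwards.
function logLoopDetection(logger, ctx) {
  if (ctx.count === 3) {
    logger.warn(
      {
        correlationId: ctx.correlationId,
        action: "loop_detected",
        totalSteps: ctx.totalSteps,
        repeatedToolCalls: ctx.repeatedToolCalls,
      },
      "Repeated tool calls detected"
    );
  } else if (ctx.count > 3) {
    logger.error(
      {
        correlationId: ctx.correlationId,
        action: "loop_terminated",
        totalSteps: ctx.totalSteps,
        myPrompt: ctx.myPrompt, // full conversation sent to the LLM
        systemPrompt: ctx.systemPrompt, // full system prompt
        llmResponse: ctx.llmResponse, // full LLM response
        repeatedToolCalls: ctx.repeatedToolCalls,
        toolCallHistory: ctx.toolCallHistory, // full tool call history
      },
      "Loop terminated; full context captured"
    );
  }
}
```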
**Phase 6: Dictionary Cleanup Script**
- Create scripts/compact-dictionary.js
- Removes redundant UPDATE entries from dictionary
- Keeps only latest metadata with full content per hash
- Supports --dry-run, --backup, --no-backup options
- Reports statistics on size reduction
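The core of the compaction pass might look like this (a sketch; the real scripts/compact-dictionary.js adds the --dry-run and --backup handling, and the JSONL entry shape is assumed):

```js
const fs = require("fs");

// Read the JSONL dictionary, keep only the latest entry per hash, and
// rewrite the file in place.
function compactDictionary(dictionaryPath) {
  const lines = fs
    .readFileSync(dictionaryPath, "utf8")
    .split("\n")
    .filter(Boolean);
  const latestByHash = new Map();
  for (const line of lines) {
    const entry = JSON.parse(line);
    latestByHash.set(entry.hash, line); // later entries overwrite earlier ones
  }
  fs.writeFileSync(dictionaryPath, [...latestByHash.values()].join("\n") + "\n");
  console.log(`Compacted ${lines.length} entries down to ${latestByHash.size}`);
}
```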
Configuration changes:
- Add LLM_AUDIT_MAX_SYSTEM_LENGTH (default: 2000)
- Add LLM_AUDIT_MAX_USER_LENGTH (default: 3000)
- Add LLM_AUDIT_MAX_RESPONSE_LENGTH (default: 3000)
- Add LLM_AUDIT_ANNOTATIONS (default: true)
- Add LLM_AUDIT_DEDUP_SESSION_CACHE (default: true)
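In src/config/index.js the boolean flags might be read along these lines (a sketch; the envBool helper is an assumption):

```js
// Parse a boolean env var: unset falls back to the default; otherwise
// only "true" (case-insensitive) enables the flag.
function envBool(name, defaultValue) {
  const raw = process.env[name];
  return raw === undefined ? defaultValue : raw.toLowerCase() === "true";
}

const auditConfig = {
  annotations: envBool("LLM_AUDIT_ANNOTATIONS", true),
  dedupSessionCache: envBool("LLM_AUDIT_DEDUP_SESSION_CACHE", true),
};
```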
Expected Impact:
- Log file size: 60-70% reduction
- Readability: Significantly improved
- Debugging: Much easier with hash annotations
- Loop visibility: Full context captured for analysis
Force-pushed from 9afe2f3 to 1fa9cab
- Add custom Pino stream to capture errors with fields > 200 characters
- Store oversized errors in session-based log files (JSONL format)
- One file per session; all oversized errors from the same session append to a single file
- Errors captured at WARN and ERROR levels only
- Full error context preserved without truncation
- Configuration options for threshold, max files, and log directory
- Session ID added to logger context in session middleware
- Graceful degradation if capture fails (main logging continues)

Files changed:
- src/logger/oversized-error-stream.js: Core stream implementation
- src/logger/index.js: Multistream setup for dual logging
- src/config/index.js: Configuration for oversized error logging
- src/api/middleware/session.js: Add sessionId to logger context

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
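A minimal sketch of the capture stream and the multistream wiring (Pino's default level numbers: WARN = 40, ERROR = 50; the file naming and threshold constant are assumptions based on the description above):

```js
const fs = require("fs");
const path = require("path");
const pino = require("pino");

const OVERSIZED_THRESHOLD = 200; // characters, per the description above

// Custom Pino stream: diverts WARN/ERROR entries that contain a top-level
// string field longer than the threshold into a session-based JSONL file.
function makeOversizedErrorStream(logDir) {
  return {
    write(chunk) {
      try {
        const entry = JSON.parse(chunk);
        if (entry.level < 40) return; // WARN and ERROR only
        const oversized = Object.values(entry).some(
          (v) => typeof v === "string" && v.length > OVERSIZED_THRESHOLD
        );
        if (!oversized) return;
        // One file per session; entries append without truncation.
        const file = path.join(
          logDir,
          `oversized-${entry.sessionId || "unknown"}.jsonl`
        );
        fs.appendFileSync(file, chunk);
      } catch {
        // Graceful degradation: main logging continues if capture fails.
      }
    },
  };
}

// Dual logging via multistream: stdout plus the oversized-error capture.
const logger = pino(
  { level: "info" },
  pino.multistream([
    { level: "info", stream: process.stdout },
    { level: "warn", stream: makeOversizedErrorStream("./logs") },
  ])
);

module.exports = logger;
```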
I struggled with analyzing requests/responses to and from the LLM, so I created this logger tool (disabled by default). It helps me find bugs involving incorrect characters, broken JSON, and similar issues. I'd appreciate it if this went into the repo for simple use in the future.
If changes are requested I'm happy to add them (but having the PR accepted first would also be great ;))
Summary
Adds two complementary logging systems for better debugging and monitoring:
Four-commit series:
1. Add intelligent audit logging with SHA256-based deduplication (opt-in via LLM_AUDIT_ENABLED)
2. Sanitize empty "User:" entries before storage in the deduplication dictionary
3. Six phases of audit logging improvements (hash-before-truncate, session cache, annotations, truncation limits, loop logging, dictionary compaction)
4. Add oversized error logging via a custom Pino stream

Audit Logging Features
- Logs all LLM requests/responses with model, hostname, IP, timestamps, and token counts
- SHA256-based deduplication: first occurrence logged in full, repeats as hash references
- Full content stored in a separate dictionary file for lookup, keeping log files small
- Reader utility (scripts/audit-log-reader.js) and dictionary compaction script included

Oversized Error Logging Features
- Custom Pino stream captures errors with fields > 200 characters
- Session-based JSONL files, one per session
- WARN and ERROR levels only; full error context preserved without truncation
- Graceful degradation if capture fails (main logging continues)

Configuration

Audit Logging:
- LLM_AUDIT_ENABLED (default: false)
- LLM_AUDIT_DEDUP_SANITIZE (default: true)
- LLM_AUDIT_MAX_SYSTEM_LENGTH (default: 2000)
- LLM_AUDIT_MAX_USER_LENGTH (default: 3000)
- LLM_AUDIT_MAX_RESPONSE_LENGTH (default: 3000)
- LLM_AUDIT_ANNOTATIONS (default: true)
- LLM_AUDIT_DEDUP_SESSION_CACHE (default: true)

Oversized Error Logging:
- Options for the size threshold, maximum number of files, and log directory
Testing
All tests pass. Audit logging shows 48% space savings with the reference format. Oversized error logging was tested with various error scenarios.