

@MichaelAnders MichaelAnders commented Jan 23, 2026

I struggled with analyzing requests and responses to and from the LLM, so I created this logger tool (disabled by default). It helps me find bugs involving incorrect characters, broken JSON, and other issues. I'd appreciate it if this went into the repo for simple use in the future.

If changes are requested I'm happy to add them (but having the PR accepted first would also be great ;))

Summary

Adds two complementary logging systems for better debugging and monitoring:

  1. Audit logging system - Logs LLM requests/responses with intelligent deduplication (60-70% size reduction)
  2. Oversized error logging - Captures errors with large fields to separate session-based files

Four-commit series:

  1. Core deduplication infrastructure - Logs all LLM requests/responses (model, hostname, IP, timestamps, token counts). First occurrence logged with SHA256 hash, subsequent occurrences reference hash. Full content in separate dictionary file. LLM_AUDIT_ENABLED defaults to false (opt-in, inactive by default)
  2. Content sanitization - Removes empty "User:" entries, 20% additional space savings
  3. Advanced features - Hash-before-truncate, session tracking, hash annotations, aggressive truncation, loop detection logging, dictionary cleanup utility
  4. Oversized error logging - Custom Pino stream captures errors with fields > 200 characters to session-based JSONL files. Preserves full error context without truncation. Enabled by default, configurable

Audit Logging Features

  • Content-addressable deduplication (50-70% log size reduction)
  • Logs req/res with model, hostname, IP, timestamps, token counts
  • First entry shown in log with SHA256 hash, cleartext in separate file for reference
  • Backward compatible
  • Opt-in (LLM_AUDIT_ENABLED defaults to false)
  • Comprehensive tooling (reader, tests, cleanup script)
  • Performance optimized (LRU cache, async writes)
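
A minimal sketch of the deduplication scheme described above, with hypothetical names (the actual implementation lives in src/logger/deduplicator.js and additionally uses an LRU cache and async writes):

```js
const crypto = require("node:crypto");

// Sketch: first occurrence stores full content in the dictionary,
// later occurrences log only the SHA256 hash reference.
class ContentDeduplicator {
  constructor() {
    this.dictionary = new Map(); // hash -> full content
  }

  storeContent(content) {
    const hash = crypto.createHash("sha256").update(content).digest("hex");
    if (this.dictionary.has(hash)) {
      return { hash, firstOccurrence: false }; // log just the hash
    }
    this.dictionary.set(hash, content); // real code persists to the dictionary file
    return { hash, firstOccurrence: true, content };
  }
}
```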

Oversized Error Logging Features

  • Captures errors with fields > 200 characters (configurable threshold)
  • Session-based JSONL log files (one file per session)
  • Captures WARN and ERROR levels only
  • Full error context preserved without truncation
  • Graceful degradation if capture fails
  • Enabled by default, disable with OVERSIZED_ERROR_LOGGING_ENABLED=false
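
As a rough sketch (assumed field shapes and paths, not the actual src/logger/oversized-error-stream.js): a custom Pino destination only needs a write(line) method and can be attached via pino.multistream:

```js
const fs = require("node:fs");
const pino = require("pino");

const THRESHOLD = Number(process.env.OVERSIZED_ERROR_THRESHOLD ?? 200);
const LOG_DIR = process.env.OVERSIZED_ERROR_LOG_DIR ?? "logs/oversized-errors";
fs.mkdirSync(LOG_DIR, { recursive: true });
const logFile = `${LOG_DIR}/session.jsonl`; // real code derives this from the session ID

// Any object with write(line) works as a Pino destination.
const oversizedStream = {
  write(line) {
    try {
      const entry = JSON.parse(line);
      const hasOversizedField = Object.values(entry).some(
        (v) => typeof v === "string" && v.length > THRESHOLD
      );
      if (hasOversizedField) fs.appendFileSync(logFile, line);
    } catch {
      // graceful degradation: never break the main log pipeline
    }
  },
};

const logger = pino(
  { level: "info" },
  pino.multistream([
    { stream: process.stdout },                // normal logging
    { level: "warn", stream: oversizedStream } // WARN and ERROR only
  ])
);
```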

Configuration

Audit Logging:

  • LLM_AUDIT_ENABLED - Enable/disable (default: false, opt-in)
  • LLM_AUDIT_DEDUP_* - Deduplication settings
  • LLM_AUDIT_MAX_* - Truncation limits

Oversized Error Logging:

  • OVERSIZED_ERROR_LOGGING_ENABLED - Enable/disable (default: true)
  • OVERSIZED_ERROR_THRESHOLD - Field size threshold (default: 200 chars)
  • OVERSIZED_ERROR_LOG_DIR - Log directory (default: logs/oversized-errors)
  • OVERSIZED_ERROR_MAX_FILES - Max files to keep (default: 100)
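
For illustration only, these variables might be consumed along these lines in src/config/index.js (a sketch using the defaults listed above, not the actual file):

```js
// Sketch: env vars mapped to the config defaults named in this PR.
const config = {
  llmAudit: {
    enabled: process.env.LLM_AUDIT_ENABLED === "true", // opt-in, default false
  },
  oversizedErrors: {
    enabled: process.env.OVERSIZED_ERROR_LOGGING_ENABLED !== "false", // default true
    threshold: Number(process.env.OVERSIZED_ERROR_THRESHOLD ?? 200),
    logDir: process.env.OVERSIZED_ERROR_LOG_DIR ?? "logs/oversized-errors",
    maxFiles: Number(process.env.OVERSIZED_ERROR_MAX_FILES ?? 100),
  },
};
```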

Testing

All tests pass. Audit logging shows 48% space savings on the reference format. Oversized error logging was tested with various error scenarios.


```diff
- const SUPPORTED_MODEL_PROVIDERS = new Set(["databricks", "azure-anthropic", "ollama", "openrouter", "azure-openai", "openai", "llamacpp", "lmstudio", "bedrock", "zai", "vertex"]);
+ const SUPPORTED_MODEL_PROVIDERS = new Set(["databricks", "azure-anthropic", "ollama", "openrouter", "azure-openai", "openai", "llamacpp", "lmstudio", "bedrock"]);
```
Collaborator

Can you take the latest pull? We have recently added support for z.ai and vertex.

Collaborator

Same for embedding models

Contributor Author

> Can you take the latest pull? We have recently added support for z.ai and vertex.

Fixed

Contributor Author

All issues resolved.

I added another commit to the PR, as it fits perfectly here. With these changes in your repo I'd be up to date and would continue checking, enhancing, and fixing Lynkr.

**Commit 1: Core deduplication infrastructure**

Logs all LLM requests/responses including model, hostname, IP, timestamps, and
token counts. The first occurrence of content is logged in full with its SHA256
hash; subsequent occurrences reference the hash. Full content is stored in a
separate dictionary file for lookup, which keeps log files small.

- LLM_AUDIT_ENABLED defaults to false (opt-in, inactive by default)
- Deduplication enabled by default when logging is active
- Includes reader utility and test suite

New files:
- src/logger/deduplicator.js
- src/logger/audit-logger.js
- scripts/audit-log-reader.js
- scripts/test-deduplication.js

Modified:
- src/config/index.js: LLM_AUDIT_ENABLED defaults to false

Test results: 48% space savings

**Commit 2: Content sanitization**

Implements intelligent content cleaning that removes wasteful empty "User:"
entries from LLM audit log content before it is stored in the deduplication
dictionary. Empty "User:" entries appear in conversation logs, serve no
purpose, and bloat the log dictionary.

The root cause of the empty "User:" entries is still outstanding; it should be
investigated to reduce the tokens wasted in requests.

Changes:
- Add _sanitizeContent() method to ContentDeduplicator that detects and
  removes empty "User:" entries while preserving non-empty ones
- Integrate sanitization into storeContent() to clean content before hashing
- Add LLM_AUDIT_DEDUP_SANITIZE config option (default: true)
- Pass sanitize option through config chain to deduplicator
- Add comprehensive unit tests for sanitization behavior

Benefits:
- Reduces stored content size by ~20% for affected entries
- Saves 10-15 tokens per empty User: entry removed
- Cleaner, more semantically meaningful dictionary content
- Can be disabled via LLM_AUDIT_DEDUP_SANITIZE=false if needed

Tests: All 6 deduplication tests pass, including 2 new sanitization tests
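
A simplified sketch of what the sanitization step might look like (the line format is an assumption; the real _sanitizeContent() is a method on ContentDeduplicator):

```js
// Sketch: drop "User:" lines that carry no content, keep non-empty ones.
function sanitizeContent(content) {
  return content
    .split("\n")
    .filter((line) => !/^User:\s*$/.test(line))
    .join("\n");
}

sanitizeContent("System: hi\nUser:\nUser: real question");
// => "System: hi\nUser: real question"
```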

**Commit 3: Advanced features**

This commit implements six major phases of audit logging improvements:

**Phase 1: Hash-Before-Truncate Strategy**
- Hash original content BEFORE truncation to preserve full content tracking
- Add hashAndTruncate() and hashAndTruncateSystemReminder() functions
- Add storeContentWithHash() method to deduplicator
- Ensures dictionary stores hashes of original (not truncated) content
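
The idea in miniature (hypothetical sketch, not the actual hashAndTruncate()):

```js
const crypto = require("node:crypto");

// Hash the ORIGINAL content, then truncate for the log line, so the
// hash always identifies the full, untruncated content in the dictionary.
function hashAndTruncate(content, maxLength) {
  const hash = crypto.createHash("sha256").update(content).digest("hex");
  const truncated =
    content.length > maxLength
      ? `${content.slice(0, maxLength)}... [truncated, sha256:${hash}]`
      : content;
  return { hash, truncated };
}
```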

**Phase 2: Session-Level Hash Tracking**
- Add sessionContentCache to track hashes seen in current session
- First occurrence in session logs full (truncated) content
- Subsequent occurrences log only hash references
- Add isFirstTimeInSession() and markSeenInSession() methods
- Significantly reduces log file size within a session
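
Sketched with a plain Set (the real methods live on the deduplicator):

```js
// Sketch: a per-session set of hashes decides full-content vs hash-only logging.
const sessionContentCache = new Set();

const isFirstTimeInSession = (hash) => !sessionContentCache.has(hash);
const markSeenInSession = (hash) => sessionContentCache.add(hash);

function buildLogEntry(hash, truncatedContent) {
  if (isFirstTimeInSession(hash)) {
    markSeenInSession(hash);
    return { hash, content: truncatedContent }; // first time: full (truncated) content
  }
  return { hash }; // afterwards: hash reference only
}
```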

**Phase 3: Hash Annotation Lines**
- Add logHashAnnotation() function to output annotation lines
- Annotations include hash values and lookup instructions
- Format: {"_annotation": true, "systemPromptHash": "...", "lookup": "..."}
- Makes it easy to find and decode hash references

**Phase 4: Aggressive Truncation Limits**
- Change maxContentLength from single value to object with type-specific limits
- systemPrompt: 2000 chars (down from 5000)
- userMessages: 3000 chars
- response: 3000 chars
- Add environment variables: LLM_AUDIT_MAX_SYSTEM_LENGTH, etc.
- Expected 60-70% log size reduction
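
The per-type limits could be wired up roughly like this (sketch using the env vars and defaults above):

```js
// maxContentLength as an object with type-specific limits (was a single number).
const maxContentLength = {
  systemPrompt: Number(process.env.LLM_AUDIT_MAX_SYSTEM_LENGTH ?? 2000),
  userMessages: Number(process.env.LLM_AUDIT_MAX_USER_LENGTH ?? 3000),
  response: Number(process.env.LLM_AUDIT_MAX_RESPONSE_LENGTH ?? 3000),
};
```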

**Phase 5: Enhanced Loop Detection Logging**
- Add more structured logging when loop detected (count === 3)
- At termination (count > 3), log FULL context for debugging:
  - myPrompt: Full conversation sent to LLM
  - systemPrompt: Full system prompt
  - llmResponse: Full LLM response
  - repeatedToolCalls: The actual repeated tool calls
  - toolCallHistory: Full history of all tool calls
- Add correlationId, action, totalSteps metadata
- Critical for debugging why loops occur
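
In outline (hypothetical shape; the field names come from the list above):

```js
// Sketch: structured warning at detection, full context at termination.
function logLoopEvent(logger, ctx) {
  if (ctx.repeatCount === 3) {
    logger.warn({
      correlationId: ctx.correlationId,
      action: "loop_detected",
      totalSteps: ctx.totalSteps,
      repeatedToolCalls: ctx.repeatedToolCalls,
    });
  } else if (ctx.repeatCount > 3) {
    logger.error({
      correlationId: ctx.correlationId,
      action: "loop_terminated",
      totalSteps: ctx.totalSteps,
      myPrompt: ctx.myPrompt,               // full conversation sent to the LLM
      systemPrompt: ctx.systemPrompt,       // full system prompt
      llmResponse: ctx.llmResponse,         // full LLM response
      repeatedToolCalls: ctx.repeatedToolCalls,
      toolCallHistory: ctx.toolCallHistory, // full history of all tool calls
    });
  }
}
```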

**Phase 6: Dictionary Cleanup Script**
- Create scripts/compact-dictionary.js
- Removes redundant UPDATE entries from dictionary
- Keeps only latest metadata with full content per hash
- Supports --dry-run, --backup, --no-backup options
- Reports statistics on size reduction
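
Roughly what the compaction amounts to (a sketch that assumes a JSONL dictionary with one { hash, ... } entry per line):

```js
const fs = require("node:fs");

// Sketch: keep only the latest entry per hash, dropping redundant UPDATE records.
function compactDictionary(path, { dryRun = false } = {}) {
  const lines = fs.readFileSync(path, "utf8").split("\n").filter(Boolean);
  const latestByHash = new Map();
  for (const line of lines) {
    latestByHash.set(JSON.parse(line).hash, line); // later entries win
  }
  const compacted = [...latestByHash.values()].join("\n") + "\n";
  console.log(`entries: ${lines.length} -> ${latestByHash.size}`);
  if (!dryRun) fs.writeFileSync(path, compacted); // real script also supports --backup
}
```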

Configuration changes:
- Add LLM_AUDIT_MAX_SYSTEM_LENGTH (default: 2000)
- Add LLM_AUDIT_MAX_USER_LENGTH (default: 3000)
- Add LLM_AUDIT_MAX_RESPONSE_LENGTH (default: 3000)
- Add LLM_AUDIT_ANNOTATIONS (default: true)
- Add LLM_AUDIT_DEDUP_SESSION_CACHE (default: true)

Expected Impact:
- Log file size: 60-70% reduction
- Readability: Significantly improved
- Debugging: Much easier with hash annotations
- Loop visibility: Full context captured for analysis

**Commit 4: Oversized error logging**

- Add custom Pino stream to capture errors with fields > 200 characters
- Store oversized errors in session-based log files (JSONL format)
- One file per session, all oversized errors from same session append to single file
- Errors captured at WARN and ERROR levels only
- Full error context preserved without truncation
- Configuration options for threshold, max files, and log directory
- Session ID added to logger context in session middleware
- Graceful degradation if capture fails (main logging continues)

Files changed:
- src/logger/oversized-error-stream.js: Core stream implementation
- src/logger/index.js: Multistream setup for dual logging
- src/config/index.js: Configuration for oversized error logging
- src/api/middleware/session.js: Add sessionId to logger context

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@MichaelAnders MichaelAnders changed the title Add intelligent audit logging system with deduplication Add intelligent audit logging and oversized error logging systems Jan 24, 2026