ELSA - Easy Log Search Archival
Cold layer archival search for SOC teams. Built on the same philosophy as DES: S3 as the only source of truth, fully stateless compute nodes, zero mandatory external databases.
"Let it go" — once data reaches the archive, it stays there. Immutable, verifiable, searchable.
Every SOC team faces the same tension: logs must be kept for years, but storage systems designed for search were never designed for retention — and storage systems designed for retention were never designed for search.
The typical architecture that emerges:
- Wazuh or Splunk for real-time alerting — fast, expensive, short retention
- Elasticsearch for hot search — fast, very expensive, short retention
- Cold storage (S3, tape, NFS) for compliance — cheap, but unsearchable
When a SOC analyst needs to investigate an incident from 6 months ago — a compromised IP address, a lateral movement trail, a suspicious username — they face a wall: the hot systems have already purged the data, and the cold archive has no index. The answer to "what did 185.220.101.42 do in February?" requires hours of manual log retrieval, decompression, and grep.
ELSA solves this. It is the missing layer: cheap S3 storage with a searchable index, compliance-grade immutability, and a query model built for SOC workflows.
ELSA IS:
- An archival log storage system with entity-centric search (IP, username, hostname, session ID)
- A multi-stream query system — SOC analysts work across log streams simultaneously, and ELSA is designed for this
- A compliance-grade archive (S3 Object Lock / WORM, GDPR tombstone deletion, audit trail)
- A natural extension of logRotate semantics — data flows in when hot systems are done with it
- A complement to Wazuh/Elastic, not a competitor
ELSA IS NOT:
- A real-time alerting or correlation engine (that's Wazuh's job)
- A full-text search engine (that's Elasticsearch's job)
- A replacement for your hot storage layer
- A streaming analytics platform
T+0 → T+72h            T+72h → T+30d          T+30d → ∞
┌────────────────────┐ ┌────────────────────┐ ┌────────────────────┐
│ WAZUH + Elastic    │ │ Wazuh hot storage  │ │ ELSA               │
│                    │ │ Fast DBs           │ │                    │
│ Real-time alerts   │ │ Fast search        │ │ Archive + Search   │
│ Correlation rules  │ │ Last 30 days       │ │ Compliance/WORM    │
│ Active response    │ │                    │ │ Entity lookup      │
└────────────────────┘ └────────────────────┘ └────────────────────┘
                                                        ↑
                                            logRotate feeds data here
ELSA inherits the core principle from DES: S3 is the only source of truth. Every other component is either a cache or a compute node — destroyable and reconstructible at any time.
The implication: Redis (the metadata cache) can be wiped and fully rebuilt from S3 manifests in minutes. No data lives exclusively in Redis. No PostgreSQL is required.
CONTINUOUS WRITE (no locking, no conflicts)
Ingestors → micro-splits → s3://staging/{stream}/{today}/
NIGHTLY COMPACTION (single writer, zero race conditions)
Nightly Job → merge → index → promote to archive → rebuild Redis
QUERY (stateless, any node can serve any query)
Redis metastore → identify candidate splits → S3 Range-GET → results
A core design requirement: SOC analysts pivot across streams simultaneously. An entity timeline for IP 185.220.101.42 must return events from audit_logs, firewall_logs, and app_logs in a single request, merged and sorted by timestamp.
ELSA implements this through a cluster-level Redis namespace that maintains a cross-stream entity index, and a query planner that fans out to all relevant streams in parallel, merging results before returning to the caller. This is a first-class design concern, not an afterthought.
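The fan-out and merge step can be sketched as follows. This is a minimal illustration, not ELSA's query planner: it is written in Python for brevity, `fetch(stream, entity)` is a hypothetical per-stream lookup, and events are assumed to carry an integer `ts` field and to arrive already sorted within each stream.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def _tag(stream, events):
    """Yield each event annotated with the stream it came from."""
    for event in events:
        yield {**event, "stream": stream}


def entity_timeline(entity, streams, fetch):
    """Query every stream in parallel, then merge the results by timestamp.

    `fetch(stream, entity)` is a hypothetical lookup that must return
    events already sorted by their integer `ts` field.
    """
    if not streams:
        return []
    with ThreadPoolExecutor(max_workers=len(streams)) as pool:
        per_stream = list(pool.map(lambda s: _tag(s, fetch(s, entity)), streams))
    # heapq.merge performs a k-way merge of the already-sorted inputs.
    return list(heapq.merge(*per_stream, key=lambda e: e["ts"]))


# Illustrative in-memory data standing in for three log streams.
FAKE_STREAMS = {
    "audit_logs": [{"ts": 100, "msg": "login failed"}, {"ts": 400, "msg": "login ok"}],
    "firewall_logs": [{"ts": 50, "msg": "deny tcp/22"}, {"ts": 300, "msg": "allow tcp/443"}],
    "app_logs": [{"ts": 200, "msg": "GET /admin"}],
}


def fake_fetch(stream, entity):
    return FAKE_STREAMS[stream]


timeline = entity_timeline("185.220.101.42", list(FAKE_STREAMS), fake_fetch)
for event in timeline:
    print(event["ts"], event["stream"], event["msg"])
```

Because each stream returns sorted results, the merge costs O(n log k) for n events across k streams and never needs a full re-sort.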
All ingestion endpoints require authentication. Unauthenticated writes to staging are a data poisoning vector — data ingested today becomes WORM-locked archive tomorrow.
- Syslog over TLS: mutual TLS with per-source certificates
- HTTP POST: Bearer token (HMAC, same model as DES)
- Kafka: SASL/SCRAM per consumer group
- Source identity is embedded in the split metadata and audit trail
- Staging: mutable — ingestors can correct errors before compaction
- Archive: immutable — S3 Object Lock applied immediately after promotion
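For the HTTP POST path listed above, an HMAC-based bearer token check might look like the sketch below. The token layout (`source_id.signature`) and the function names are assumptions made for illustration; the actual DES token model is not described here. Python is used only to keep the example short.

```python
import hashlib
import hmac
from typing import Optional


def issue_token(secret: bytes, source_id: str) -> str:
    """Build a token of the (assumed) form '<source_id>.<hex HMAC-SHA256>'."""
    sig = hmac.new(secret, source_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{source_id}.{sig}"


def verify_token(secret: bytes, token: str) -> Optional[str]:
    """Return the authenticated source_id, or None if the token is invalid."""
    source_id, sep, sig = token.rpartition(".")
    if not sep or not source_id:
        return None
    expected = hmac.new(secret, source_id.encode("utf-8"), hashlib.sha256).hexdigest()
    # compare_digest runs in constant time, which avoids timing side channels.
    if hmac.compare_digest(sig.encode("utf-8"), expected.encode("utf-8")):
        return source_id
    return None


secret = b"example-shared-secret"          # illustrative value only
token = issue_token(secret, "fw-edge-01")  # hypothetical source name
print(verify_token(secret, token))
```

The returned `source_id` is what would then be embedded in split metadata and the audit trail.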
The conflict between GDPR erasure requests and WORM immutability is resolved through tombstone-based deletion. Physical deletion via repack is only possible when Object Lock is in GOVERNANCE mode.
When a repack operation physically removes tombstoned records, it creates new splits with new hashes — which would break the cryptographic chain for all subsequent splits. ELSA resolves this with repack anchor entries in the audit trail:
REPACK_ANCHOR entry:
old_split_id → hash of original split
new_split_id → hash of repacked split
tombstone_ids → list of removed doc_ids
anchor_hash → sha256(previous_chain_hash + old_split_hash + new_split_hash)
Auditors verify integrity using the repack-aware chain verifier, which treats anchor points as valid chain continuations. The chain is unbroken — it has documented mutations.
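A repack-aware verifier could be sketched along these lines. The anchor formula follows the REPACK_ANCHOR definition above; everything else is an assumption made for illustration: hashes are concatenated as hex strings, ordinary entries are assumed to chain as sha256(previous + split_hash), and the entry field names are hypothetical.

```python
import hashlib


def sha256_hex(*parts: str) -> str:
    """Hash the concatenation of hex strings (assumed encoding)."""
    return hashlib.sha256("".join(parts).encode("ascii")).hexdigest()


def anchor_hash(previous_chain_hash, old_split_hash, new_split_hash):
    # sha256(previous_chain_hash + old_split_hash + new_split_hash)
    return sha256_hex(previous_chain_hash, old_split_hash, new_split_hash)


def verify_chain(entries, genesis_hash):
    """Return True if every stored hash matches its recomputed value."""
    chain = genesis_hash
    for entry in entries:
        if entry["type"] == "REPACK_ANCHOR":
            expected = anchor_hash(chain, entry["old_split_hash"], entry["new_split_hash"])
            stored = entry["anchor_hash"]
        else:
            # Assumption: ordinary entries chain as sha256(previous + split_hash).
            expected = sha256_hex(chain, entry["split_hash"])
            stored = entry["chain_hash"]
        if stored != expected:
            return False
        chain = stored  # the anchor becomes the new head of the chain
    return True


genesis = "0" * 64
h_a = sha256_hex("61")   # stand-ins for real split hashes
h_a2 = sha256_hex("62")  # hash of the repacked split
h_c = sha256_hex("63")

e1 = {"type": "SPLIT", "split_hash": h_a, "chain_hash": sha256_hex(genesis, h_a)}
e2 = {"type": "REPACK_ANCHOR", "old_split_hash": h_a, "new_split_hash": h_a2,
      "anchor_hash": anchor_hash(e1["chain_hash"], h_a, h_a2)}
e3 = {"type": "SPLIT", "split_hash": h_c, "chain_hash": sha256_hex(e2["anchor_hash"], h_c)}

print(verify_chain([e1, e2, e3], genesis))
```

Entries written after the repack chain from the anchor hash, so the verifier can walk the full history without treating the repack as a break.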
The s3:BypassGovernanceRetention IAM permission required for GDPR repack is treated as a privileged, break-glass operation:
- Held by a dedicated IAM role, not by any service account used in normal operations
- Requires MFA authentication before assumption
- Every use is recorded in CloudTrail and raises an alert to the CISO
- The permission is never embedded in application configuration — it is retrieved from OpenBao at repack time with a short-lived token
QUERY: src_ip = 1.2.3.4, time: last 3 weeks, streams: all
LAYER 0: Redis cluster entity index (0 S3 GETs)
→ "which streams × weekly segments contain this IP?" → cross-stream map
LAYER 1: Bloom filter per split (1 Range-GET per split, hotcache section)
→ probabilistic elimination, ~1% FPR
LAYER 2: Inverted index posting list (1 Range-GET per qualifying split)
→ exact list of doc_ids → fetch records → merge across streams → return
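The three pruning layers can be illustrated with in-memory stand-ins. This is a sketch under stated assumptions, not the real query path: the `Split` class and its fields are hypothetical, the Bloom filter is modelled as a plain set that is allowed to contain false positives, and no S3 access takes place.

```python
from dataclasses import dataclass


@dataclass
class Split:
    split_id: str
    stream: str
    week: str
    bloom: set       # stand-in for a Bloom filter; may hold false positives
    postings: dict   # entity value -> exact list of doc_ids


def lookup(entity, entity_index, splits):
    """Return {split_id: doc_ids} for one entity using layered pruning."""
    # Layer 0: the cross-stream entity index names the (stream, week) segments.
    segments = entity_index.get(entity, set())
    candidates = [s for s in splits if (s.stream, s.week) in segments]
    # Layer 1: probabilistic elimination, one check per candidate split.
    maybe = [s for s in candidates if entity in s.bloom]
    # Layer 2: the posting list gives the exact answer and removes false positives.
    hits = {}
    for s in maybe:
        doc_ids = s.postings.get(entity, [])
        if doc_ids:
            hits[s.split_id] = doc_ids
    return hits


entity_index = {
    "1.2.3.4": {("firewall_logs", "2026-W06"), ("audit_logs", "2026-W06")},
}
splits = [
    Split("s1", "firewall_logs", "2026-W06", {"1.2.3.4"}, {"1.2.3.4": [10, 42]}),
    Split("s2", "audit_logs", "2026-W06", {"1.2.3.4"}, {}),          # Bloom false positive
    Split("s3", "app_logs", "2026-W06", {"9.9.9.9"}, {"9.9.9.9": [7]}),  # pruned at layer 0
    Split("s4", "firewall_logs", "2026-W06", set(), {}),             # pruned at layer 1
]

result = lookup("1.2.3.4", entity_index, splits)
print(result)
```

Each layer is cheaper than the next, so most splits are discarded before any posting list is read.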
Previously .ldes. Renamed to .elsa for consistency with the project name.
Each archive unit is a self-contained binary file stored on S3. The format is versioned from v1, with a version byte in the magic header enabling forward-compatible readers.
Split file (.elsa) — versioned binary format
┌───────────────────────────────────────┐
│ MAGIC (4B: "ELSA") + VERSION (1B)     │ ← version enables format evolution
│ HEADER: stream, time_range, schema    │
├───────────────────────────────────────┤
│ HOTCACHE SECTION (~50–200KB)          │
│  Bloom filters (versioned, own format)│ ← NOT Java serialization
│  Sparse index (every 256th record)    │
│  Column min/max statistics            │
├───────────────────────────────────────┤
│ COLUMNAR DATA SECTION (zstd)          │
├───────────────────────────────────────┤
│ INVERTED INDEX SECTION                │
│  entity_value → posting list          │
├───────────────────────────────────────┤
│ FOOTER (last 32B)                     │
│  Section offsets + CRC32              │
└───────────────────────────────────────┘
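Reading the magic and version prefix shown in the diagram might look like this. Only the 4-byte magic and 1-byte version come from the layout above; the rest of the header and the footer are not specified here, so the sketch stops at the first five bytes. The function names and the set of supported versions are assumptions.

```python
import struct

MAGIC = b"ELSA"
SUPPORTED_VERSIONS = {1}           # assumption: only v1 exists so far
_PREFIX = struct.Struct(">4sB")    # 4-byte magic followed by 1-byte version


def write_prefix(version: int = 1) -> bytes:
    """Serialize the magic and version bytes that open a split file."""
    return _PREFIX.pack(MAGIC, version)


def read_version(buf: bytes) -> int:
    """Validate the magic and return the format version, or raise ValueError."""
    if len(buf) < _PREFIX.size:
        raise ValueError("file too short for an ELSA header")
    magic, version = _PREFIX.unpack_from(buf)
    if magic != MAGIC:
        raise ValueError(f"bad magic: {magic!r}")
    if version not in SUPPORTED_VERSIONS:
        raise ValueError(f"unsupported format version: {version}")
    return version


prefix = write_prefix(1)
print(prefix, read_version(prefix + b"...rest of split..."))
```

Checking the version before parsing anything else lets a reader fail fast on a file written by a newer format revision.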
Bloom filter serialization: ELSA uses its own binary Bloom filter format (not Guava's Java serialization) to ensure library-version independence. Format is documented in docs/bloom-filter-format.md.
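To show what a library-independent Bloom filter encoding can look like, here is a minimal sketch. The byte layout (version, hash count, bit count, then the bit array) and the SHA-256 double-hashing scheme are invented for this example and are not the format in docs/bloom-filter-format.md. Python is used for brevity.

```python
import hashlib
import math
import struct

_HEADER = struct.Struct(">BBI")  # hypothetical layout: version, k, number of bits


class BloomFilter:
    def __init__(self, m_bits, k, bits=None):
        self.m, self.k = m_bits, k
        self.bits = bytearray(bits) if bits is not None else bytearray((m_bits + 7) // 8)

    @classmethod
    def for_capacity(cls, n, fpr=0.01):
        """Size the filter for n items at the target false-positive rate."""
        m = max(8, math.ceil(-n * math.log(fpr) / (math.log(2) ** 2)))
        k = max(1, round(m / n * math.log(2)))
        return cls(m, k)

    def _positions(self, item: bytes):
        # Double hashing: derive k bit positions from two 64-bit values.
        digest = hashlib.sha256(item).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item: bytes):
        for p in self._positions(item):
            self.bits[p >> 3] |= 1 << (p & 7)

    def might_contain(self, item: bytes) -> bool:
        return all(self.bits[p >> 3] & (1 << (p & 7)) for p in self._positions(item))

    def to_bytes(self) -> bytes:
        return _HEADER.pack(1, self.k, self.m) + bytes(self.bits)

    @classmethod
    def from_bytes(cls, buf: bytes):
        version, k, m = _HEADER.unpack_from(buf)
        if version != 1:
            raise ValueError(f"unsupported Bloom filter version: {version}")
        start = _HEADER.size
        return cls(m, k, buf[start:start + (m + 7) // 8])


members = [f"10.0.{i // 256}.{i % 256}".encode() for i in range(1000)]
bloom = BloomFilter.for_capacity(len(members), fpr=0.01)
for ip in members:
    bloom.add(ip)
restored = BloomFilter.from_bytes(bloom.to_bytes())
print(restored.k, restored.m, restored.might_contain(b"10.0.0.1"))
```

Because the bytes are defined by the format rather than by a library's object serialization, a reader in any language can decode the filter as long as it applies the same hash scheme.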
s3://logs-bucket/
├── _catalog/
│   ├── cluster.json              ← cluster-wide stream registry + cross-stream entity index roots
│   └── {stream}/
│       ├── current               ← pointer to active snapshot (updated with S3 conditional PUT)
│       ├── snap_{N}.json
│       └── manifests/
│           └── man_{YYYY-WNN}.json
├── splits/
│   └── {stream}/{YYYY}/{WNN}/
│       └── {split_id}.elsa       ← archive splits (WORM-locked)
├── indexes/
│   ├── {stream}/{YYYY-WNN}/
│   │   └── ip_index.idx          ← per-stream weekly index
│   └── _cross_stream/{YYYY-WNN}/
│       └── entity_index.idx      ← cross-stream entity index (NEW)
├── staging/
│   └── {stream}/{YYYY}/{MM}/{DD}/
│       └── {micro_split_id}.elsa
├── tombstones/
│   └── {stream}/{request_id}.json ← doc_ids stored as opaque hashes, not raw IDs
└── audit-trail/
    └── {stream}/{YYYY}/{MM}/
        └── audit_{DD}.jsonl
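Object keys for this layout could be derived from a date as sketched below. Two assumptions are made here that the layout does not state: `{WNN}` is taken to be the ISO 8601 week number written as `W` plus two digits, and `{YYYY}` in weekly paths is taken to be the ISO week-numbering year. The helper names are hypothetical.

```python
from datetime import date


def week_parts(d: date):
    """Return (ISO year, 'WNN') for a date, e.g. ('2026', 'W07')."""
    iso_year, iso_week, _ = d.isocalendar()
    return f"{iso_year:04d}", f"W{iso_week:02d}"


def split_key(stream: str, d: date, split_id: str) -> str:
    year, week = week_parts(d)
    return f"splits/{stream}/{year}/{week}/{split_id}.elsa"


def manifest_key(stream: str, d: date) -> str:
    year, week = week_parts(d)
    return f"_catalog/{stream}/manifests/man_{year}-{week}.json"


def staging_key(stream: str, d: date, micro_split_id: str) -> str:
    # Staging is partitioned by calendar day rather than by week.
    return f"staging/{stream}/{d:%Y}/{d:%m}/{d:%d}/{micro_split_id}.elsa"


day = date(2026, 2, 10)
print(split_key("firewall_logs", day, "abc123"))
print(manifest_key("firewall_logs", day))
print(staging_key("firewall_logs", day, "m-0001"))
```

Note that the ISO year can differ from the calendar year in the days around 1 January, so weekly and daily paths for the same date may carry different year components.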
| Epic | Title | Key changes vs original design |
|---|---|---|
| EPIC-01 | Core Architecture & Split Format | Format versioning, Bloom filter own binary format |
| EPIC-02 | Ingestor & Schema Normalization | Ingestor auth (mTLS/HMAC/SASL), schema evolution strategy |
| EPIC-03 | Nightly Compaction & Index Build | Job overlap guard, cross-stream index build, S3 conditional PUT |
| EPIC-04 | Redis Metastore | Cluster namespace, eviction policy for high-cardinality, DR runbook |
| EPIC-05 | Query Engine | Cross-stream fan-out, rate limiting, staging scan cap |
| EPIC-06 | Compliance Layer | Chain hash with repack anchors, BypassGovernanceRetention hardening, tombstone privacy, entity-scoped legal hold |
| EPIC-07 | logRotate Integration | Authenticated import, format unchanged |
| EPIC-08 | SOC API & Query Interface | Cross-stream timeline, rate limiting headers |
Estimated scope: ~380 story points (additional ~37 SP for risk mitigations).
MVP (v0.2.0): EPIC-01 through EPIC-04 plus EPIC-07.
| Component | Technology | Rationale |
|---|---|---|
| Runtime | Java 21 + Quarkus | Aligned with DES 2.0 |
| Object storage | S3-compatible (AWS, MinIO, Ceph/RGW) | Same as DES |
| Metadata cache | Redis 7.x | Atomic Lua scripts; sorted sets for time-range |
| Compression | zstd level 3 | Columnar data benefits significantly |
| Bloom filter | Custom binary format (inspired by Guava) | Library-version independence |
| Posting list codec | VByte delta encoding | 1–2 bytes per doc_id typical |
| Secrets management | OpenBao (Vault fork) | BypassGovernanceRetention token management |
| Container orchestration | Kubernetes | CronJob for nightly compaction |
| Observability | Prometheus + Grafana | Nightly job success/failure alerts included |
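The VByte delta encoding named in the table can be sketched as follows. This is an illustrative variant, assuming strictly increasing doc_ids, 7 payload bits per byte with the low-order group first, and the high bit marking "more bytes follow"; ELSA's actual byte convention is not specified here.

```python
def encode_postings(doc_ids):
    """Delta-encode sorted doc_ids, then write each gap as a variable-length int."""
    out = bytearray()
    last = None
    for doc_id in doc_ids:
        if doc_id < 0 or (last is not None and doc_id <= last):
            raise ValueError("doc_ids must be non-negative and strictly increasing")
        gap = doc_id if last is None else doc_id - last
        last = doc_id
        while gap >= 0x80:
            out.append((gap & 0x7F) | 0x80)  # high bit set: more bytes follow
            gap >>= 7
        out.append(gap)
    return bytes(out)


def decode_postings(buf):
    """Inverse of encode_postings: rebuild absolute doc_ids from the gaps."""
    doc_ids, current, gap, shift = [], 0, 0, 0
    for byte in buf:
        gap |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7
            continue
        current += gap
        doc_ids.append(current)
        gap, shift = 0, 0
    if shift:
        raise ValueError("truncated posting list")
    return doc_ids


postings = [3, 7, 150, 151, 20000]
encoded = encode_postings(postings)
print(len(encoded), decode_postings(encoded))
```

Small gaps between neighbouring doc_ids fit in a single byte, which is where the 1–2 bytes per doc_id figure comes from for dense posting lists.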
- S3 is the only source of truth. Redis can be wiped and rebuilt at any time.
- D+1 model eliminates race conditions. One writer per manifest per night.
- Multi-stream query is first class. SOC analysts do not work with one stream at a time.
- Ingestor authentication is mandatory. Unauthenticated writes are a data poisoning vector.
- Bloom filters use own binary format. Not Guava Java serialization — library-version independent.
- Chain hash survives repack via anchor entries. GDPR deletion does not break audit trail integrity.
- BypassGovernanceRetention is break-glass. MFA, short-lived token, CloudTrail alert.
- Tombstones store hashed doc_ids. No raw personal data identifiers in audit trail.
ELSA is a sibling project to DES, developed by the same team at Datavision.pl.
Apache License 2.0
Datavision.pl — data science consultancy and infrastructure tooling. Project status: design phase. Implementation begins Q2 2026.