Benchmarks and metrics collection for launch blog post #25

@tac0turtle

Summary

Once apex is functional end-to-end, collect concrete benchmarks and operational data for a single namespace to support a launch blog post. The post should tell the story of why apex exists, what problems it solves, and back it up with real numbers.

Data to collect

Storage

  • Total celestia-node disk usage for the same height range (full node)
  • Total apex disk usage for a single namespace over the same range
  • Storage reduction ratio (e.g., "400GB → 2GB = 200x reduction")
  • DB file size growth rate per day/week
  • Breakdown: headers vs blob data vs indexes
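A minimal sketch of how the storage comparison could be scripted, assuming both data directories are readable from one machine. The tool name `storagecmp` and its two path arguments are hypothetical. It sums apparent file sizes, so the numbers may differ from what `du` reports on filesystems with sparse files or compression.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// dirSize returns the total apparent size in bytes of all regular files under root.
func dirSize(root string) (int64, error) {
	var total int64
	err := filepath.WalkDir(root, func(_ string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.Type().IsRegular() {
			info, err := d.Info()
			if err != nil {
				return err
			}
			total += info.Size()
		}
		return nil
	})
	return total, err
}

// reductionRatio reports how many times smaller the apex footprint is.
// It returns 0 when apexBytes is not positive, to avoid dividing by zero.
func reductionRatio(nodeBytes, apexBytes int64) float64 {
	if apexBytes <= 0 {
		return 0
	}
	return float64(nodeBytes) / float64(apexBytes)
}

func main() {
	// Hypothetical usage: storagecmp <celestia-node-store-dir> <apex-data-dir>
	if len(os.Args) != 3 {
		fmt.Fprintln(os.Stderr, "usage: storagecmp <celestia-node-store-dir> <apex-data-dir>")
		return
	}
	nodeBytes, err := dirSize(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, "celestia-node:", err)
		return
	}
	apexBytes, err := dirSize(os.Args[2])
	if err != nil {
		fmt.Fprintln(os.Stderr, "apex:", err)
		return
	}
	fmt.Printf("celestia-node: %d bytes\napex: %d bytes\nreduction: %.1fx\n",
		nodeBytes, apexBytes, reductionRatio(nodeBytes, apexBytes))
}
```

Running it once per day against the apex directory would also give the growth-rate figure.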

Performance

  • Backfill throughput: heights/sec, blobs/sec during historical sync
  • blob.Get latency: apex (SQLite lookup) vs celestia-node (namespace scan)
  • blob.GetAll latency at various heights (sparse vs dense namespace)
  • blob.Subscribe end-to-end latency: new block on celestia → blob delivered to consumer
  • Time to full sync from a given start height
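For the latency comparisons, a small harness along these lines could time repeated calls and report percentiles rather than a single average. This is a sketch only: `measure` and `percentile` are hypothetical helpers, and the placeholder `op` would need to be replaced with a real blob.Get call against each endpoint.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// measure calls op n times and returns the per-call latencies sorted ascending.
func measure(n int, op func() error) ([]time.Duration, error) {
	samples := make([]time.Duration, 0, n)
	for i := 0; i < n; i++ {
		start := time.Now()
		if err := op(); err != nil {
			return nil, fmt.Errorf("call %d: %w", i, err)
		}
		samples = append(samples, time.Since(start))
	}
	sort.Slice(samples, func(i, j int) bool { return samples[i] < samples[j] })
	return samples, nil
}

// percentile returns the nearest-rank p-th percentile (1..100) of sorted samples.
// Integer arithmetic is used for the rank to avoid floating-point rounding.
func percentile(sorted []time.Duration, p int) time.Duration {
	n := len(sorted)
	if n == 0 {
		return 0
	}
	rank := (p*n + 99) / 100 // ceil(p*n/100)
	if rank < 1 {
		rank = 1
	}
	if rank > n {
		rank = n
	}
	return sorted[rank-1]
}

func main() {
	// Placeholder workload (assumption): swap in the call under test.
	op := func() error {
		time.Sleep(time.Millisecond)
		return nil
	}
	samples, err := measure(50, op)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("p50=%v p95=%v p99=%v\n",
		percentile(samples, 50), percentile(samples, 95), percentile(samples, 99))
}
```

Using the same harness for apex and celestia-node keeps the two sets of numbers directly comparable.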

Resource footprint

  • Memory usage: idle, during backfill, during streaming
  • CPU usage: idle, during backfill, during streaming
  • Peak memory during heaviest operation
  • Compare against celestia-node light/full node memory profile
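One way to capture current and peak memory for either process on Linux is to read `VmRSS` (resident set size) and `VmHWM` (peak resident set size) from `/proc/<pid>/status`. The sketch below assumes a Linux host; `parseStatusKB` is a hypothetical helper, and the PID is passed as an argument (defaulting to the tool's own process).

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// parseStatusKB extracts a kB-valued field such as "VmRSS" or "VmHWM"
// from the text of /proc/<pid>/status. The bool is false if the field
// is missing or not a number.
func parseStatusKB(status, field string) (int64, bool) {
	for _, line := range strings.Split(status, "\n") {
		name, rest, ok := strings.Cut(line, ":")
		if !ok || name != field {
			continue
		}
		parts := strings.Fields(rest)
		if len(parts) == 0 {
			return 0, false
		}
		kb, err := strconv.ParseInt(parts[0], 10, 64)
		if err != nil {
			return 0, false
		}
		return kb, true
	}
	return 0, false
}

func main() {
	pid := "self" // assumption: pass the apex or celestia-node PID as the first argument
	if len(os.Args) > 1 {
		pid = os.Args[1]
	}
	data, err := os.ReadFile("/proc/" + pid + "/status")
	if err != nil {
		fmt.Fprintln(os.Stderr, "read status:", err)
		return
	}
	for _, field := range []string{"VmRSS", "VmHWM"} {
		if kb, ok := parseStatusKB(string(data), field); ok {
			fmt.Printf("%s: %d kB\n", field, kb)
		}
	}
}
```

Sampling this at intervals during idle, backfill, and streaming would cover the three phases listed above; `VmHWM` gives the peak without needing a tight polling loop.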

Simplicity

  • Lines of code: apex total vs celestia-node
  • Dependencies: count of direct deps in go.mod (core module, excluding submit/)
  • Binary size
  • Config file: apex YAML vs celestia-node setup ceremony (init, keys, trusted hash, etc.)
  • Time from git clone to serving blobs (setup friction)
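The direct-dependency count could be taken straight from go.mod by counting `require` entries that are not marked `// indirect`. The sketch below is a simple line-based reading, not a full go.mod parser; `countDirectDeps` is a hypothetical helper, and the path argument is assumed to point at the core module's go.mod.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// countDirectDeps counts require entries in go.mod content that are not
// marked "// indirect". It handles both single-line requires and
// parenthesized require blocks, and ignores replace/exclude blocks.
func countDirectDeps(gomod string) int {
	count, inBlock := 0, false
	for _, raw := range strings.Split(gomod, "\n") {
		line := strings.TrimSpace(raw)
		switch {
		case strings.HasPrefix(line, "require ("):
			inBlock = true
		case inBlock && line == ")":
			inBlock = false
		case line == "" || strings.HasPrefix(line, "//"):
			// Blank line or comment: nothing to count.
		case inBlock || strings.HasPrefix(line, "require "):
			if !strings.HasSuffix(line, "// indirect") {
				count++
			}
		}
	}
	return count
}

func main() {
	path := "go.mod" // assumption: run from the core module directory
	if len(os.Args) > 1 {
		path = os.Args[1]
	}
	data, err := os.ReadFile(path)
	if err != nil {
		fmt.Fprintln(os.Stderr, "read go.mod:", err)
		return
	}
	fmt.Printf("direct dependencies: %d\n", countDirectDeps(string(data)))
}
```

Running the same tool against celestia-node's go.mod gives a like-for-like comparison.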

Reliability

  • Uptime over test period
  • Sync gap recovery time
  • Graceful restart time (shutdown → serving again)

Blog post outline

  1. The problem — celestia-node stores everything, while rollups need almost nothing: a full node holds hundreds of GB, but the few namespaces a rollup actually reads amount to roughly 10-20GB.
  2. What we tried / issues we hit — reference the celestia-node issues discovered during research:
    • 8-second blob retrieval times (celestia-node#4453)
    • BadgerDB corruption on ungraceful shutdown (celestia-node#3881)
    • Non-contiguous subscriptions (celestia-node#3578)
    • No namespace-scoped storage despite 3+ year roadmap item (celestia-node#2033)
    • Subscription buffer overflow with silent disconnect
    • <10% resource utilization during sync (celestia-node#4108)
  3. The approach — lightweight namespace indexer, SQLite, pluggable fetcher, drop-in API compatibility
  4. The numbers — storage, performance, memory, simplicity benchmarks from above
  5. What's next — Fiber support, gRPC, tx submission, multi-account

When

After Phase 2 is complete (API layer working, can serve ev-node). Benchmarks should be run against mainnet data with a real namespace.
