Metrics

We collect various metrics and serve them via a Prometheus-compatible HTTP endpoint at http://<http_address>:<metrics_port>/metrics (default: http://127.0.0.1:5054/metrics).
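
As a quick way to inspect the endpoint without a full Prometheus setup, the sketch below downloads the text exposition and parses it into a dictionary. It assumes the default address given above. The helper names `fetch_metrics` and `parse_samples` are hypothetical, and the parser is deliberately simplified: it ignores timestamps and assumes label values contain no spaces.

```python
import urllib.request

# Default address from this document; change it if the node is configured differently.
METRICS_URL = "http://127.0.0.1:5054/metrics"


def fetch_metrics(url: str = METRICS_URL, timeout: float = 5.0) -> str:
    """Download the raw Prometheus text exposition from the node."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def parse_samples(text: str) -> dict:
    """Map each sample line ('name{labels} value') to its float value.

    Simplified on purpose: no timestamp support, no spaces inside label values.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blank lines and HELP/TYPE comments
            continue
        series, value = line.rsplit(" ", 1)
        samples[series] = float(value)
    return samples


# Usage (requires a running node):
#   for series, value in sorted(parse_samples(fetch_metrics()).items()):
#       print(series, value)
```

For anything beyond ad-hoc inspection, a real Prometheus server or an established parser library is the safer choice.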

A ready-to-use Grafana + Prometheus monitoring stack with pre-configured leanMetrics dashboards is available in lean-quickstart.

The exposed metrics follow the leanMetrics specification, although some metrics are not yet implemented. The tables below list the metrics, with a checkbox indicating whether each one is currently supported.

Node Info Metrics

Name Type Usage Sample collection event Labels Supported
lean_node_info Gauge Node information (always 1) On node start name, version
lean_node_start_time_seconds Gauge Start timestamp On node start

PQ Signature Metrics

Name Type Usage Sample collection event Labels Buckets Supported
lean_pq_sig_attestation_signatures_total Counter Total number of individual attestation signatures On each attestation signing
lean_pq_sig_attestation_signatures_valid_total Counter Total number of valid individual attestation signatures On each attestation signature verification
lean_pq_sig_attestation_signatures_invalid_total Counter Total number of invalid individual attestation signatures On each attestation signature verification
lean_pq_sig_attestation_signing_time_seconds Histogram Time taken to sign an attestation On each attestation signing 0.005, 0.01, 0.025, 0.05, 0.1, 1
lean_pq_sig_attestation_verification_time_seconds Histogram Time taken to verify an attestation signature On each attestation signature verification 0.005, 0.01, 0.025, 0.05, 0.1, 1
lean_pq_sig_aggregated_signatures_total Counter Total number of aggregated signatures On aggregated signature production
lean_pq_sig_aggregated_signatures_valid_total Counter Total number of valid aggregated signatures On aggregated signature verification
lean_pq_sig_aggregated_signatures_invalid_total Counter Total number of invalid aggregated signatures On aggregated signature verification
lean_pq_sig_attestations_in_aggregated_signatures_total Counter Total number of attestations included into aggregated signatures On aggregated signature production
lean_pq_sig_aggregated_signatures_building_time_seconds Histogram Time taken to build an aggregated attestation signature On aggregated signature production 0.1, 0.25, 0.5, 0.75, 1, 1.25, 1.5, 2, 4
lean_pq_sig_aggregated_signatures_verification_time_seconds Histogram Time taken to verify an aggregated attestation signature On aggregated signature verification 0.1, 0.25, 0.5, 0.75, 1, 1.25, 1.5, 2, 4
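
The Buckets column lists the upper bounds of each histogram. Prometheus histograms are cumulative: each bucket counts every observation less than or equal to its bound, and an implicit `+Inf` bucket counts all observations. The sketch below illustrates this with the signing-time bounds from the table. The function name and the sample observations are made up for illustration.

```python
import math

# Upper bounds (seconds) of lean_pq_sig_attestation_signing_time_seconds, from the table above.
SIGNING_BUCKETS = [0.005, 0.01, 0.025, 0.05, 0.1, 1]


def cumulative_bucket_counts(observations, bounds):
    """Return {upper_bound: count of observations <= upper_bound}, including +Inf."""
    uppers = list(bounds) + [math.inf]
    return {le: sum(1 for x in observations if x <= le) for le in uppers}


# Hypothetical signing durations in seconds.
durations = [0.004, 0.02, 0.02, 0.3, 2.0]
counts = cumulative_bucket_counts(durations, SIGNING_BUCKETS)
```

Here the 2.0-second observation falls only into the `+Inf` bucket, which is why durations above the largest bound cannot be resolved any further.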

Block Production Metrics

Name Type Usage Sample collection event Labels Buckets Supported
lean_block_aggregated_payloads Histogram Number of aggregated_payloads in a block On block production 1, 2, 4, 8, 16, 32, 64, 128
lean_block_building_payload_aggregation_time_seconds Histogram Time taken to build aggregated_payloads during block building On block production 0.1, 0.25, 0.5, 0.75, 1, 2, 3, 4
lean_block_building_time_seconds Histogram Time taken to build a block On block production 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 0.75, 1
lean_block_building_success_total Counter Successful block builds On block production
lean_block_building_failures_total Counter Failed block builds (error building the block, signing the block root, or processing it locally) On block production failure

Fork-Choice Metrics

Name Type Usage Sample collection event Labels Buckets Supported
lean_head_slot Gauge Latest slot of the lean chain On get fork choice head
lean_current_slot Gauge Current slot of the lean chain On scrape ✅(*)
lean_safe_target_slot Gauge Safe target slot On safe target update
lean_fork_choice_block_processing_time_seconds Histogram Time taken to process block On fork choice process block 0.005, 0.01, 0.025, 0.05, 0.1, 1, 1.25, 1.5, 2, 4
lean_attestations_valid_total Counter Total number of valid attestations On validate attestation
lean_attestations_invalid_total Counter Total number of invalid attestations On validate attestation
lean_attestation_validation_time_seconds Histogram Time taken to validate attestation On validate attestation 0.005, 0.01, 0.025, 0.05, 0.1, 1
lean_fork_choice_reorgs_total Counter Total number of fork choice reorgs On fork choice reorg
lean_fork_choice_reorg_depth Histogram Depth of fork choice reorgs (in blocks) On fork choice reorg 1, 2, 3, 5, 7, 10, 20, 30, 50, 100
lean_tick_interval_duration_seconds Histogram Elapsed time between clock ticks in seconds At the start of each tick interval 0.4, 0.6, 0.75, 0.8, 0.805, 0.81, 0.815, 0.82, 0.825, 0.85, 0.9, 1.0, 1.2, 1.6
lean_gossip_signatures Gauge Number of gossip signatures in fork-choice store On gossip signatures update
lean_latest_new_aggregated_payloads Gauge Number of new aggregated payload items On latest_new_aggregated_payloads update
lean_latest_known_aggregated_payloads Gauge Number of known aggregated payload items On latest_known_aggregated_payloads update
lean_committee_signatures_aggregation_time_seconds Histogram Time taken to aggregate committee signatures On committee signatures aggregation 0.05, 0.1, 0.25, 0.5, 0.75, 1, 2, 3, 4
lean_node_sync_status Gauge Node sync status On node sync status change status=idle,syncing,synced

State Transition Metrics

Name Type Usage Sample collection event Labels Buckets Supported
lean_latest_justified_slot Gauge Latest justified slot On state transition
lean_latest_finalized_slot Gauge Latest finalized slot On state transition
lean_justified_slot Gauge Current justified slot On state transition
lean_finalized_slot Gauge Current finalized slot On state transition
lean_finalizations_total Counter Total number of finalization attempts On finalization attempt result=success,error
lean_state_transition_time_seconds Histogram Time to process state transition On state transition 0.25, 0.5, 0.75, 1, 1.25, 1.5, 2, 2.5, 3, 4
lean_state_transition_slots_processed_total Counter Total number of processed slots On state transition process slots
lean_state_transition_slots_processing_time_seconds Histogram Time taken to process slots On state transition process slots 0.005, 0.01, 0.025, 0.05, 0.1, 1
lean_state_transition_block_processing_time_seconds Histogram Time taken to process block On state transition process block 0.005, 0.01, 0.025, 0.05, 0.1, 1
lean_state_transition_attestations_processed_total Counter Total number of processed attestations On state transition process attestations
lean_state_transition_attestations_processing_time_seconds Histogram Time taken to process attestations On state transition process attestations 0.005, 0.01, 0.025, 0.05, 0.1, 1

Validator Metrics

Name Type Usage Sample collection event Labels Buckets Supported
lean_validators_count Gauge Number of validators managed by a node On scrape ✅(*)
lean_is_aggregator Gauge Validator's is_aggregator status. True=1, False=0 On node start
lean_attestations_production_time_seconds Histogram Time taken to produce attestation On attestation production 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 0.75, 1

Network Metrics

Name Type Usage Sample collection event Labels Supported
lean_attestation_committee_count Gauge Number of attestation committees On node start
lean_attestation_committee_subnet Gauge Node's attestation committee subnet On node start
lean_connected_peers Gauge Number of connected peers On scrape client=ethlambda,grandine,lantern,lighthouse,qlean,ream,zeam ✅(*)
lean_gossip_mesh_peers Gauge Number of peers in the gossipsub mesh On scrape client=<name>_<N>,unknown (ex. zeam_0) ✅(*)
lean_peer_connection_events_total Counter Total number of peer connection events On peer connection direction=inbound,outbound; result=success,timeout,error
lean_peer_disconnection_events_total Counter Total number of peer disconnection events On peer disconnection direction=inbound,outbound; reason=timeout,remote_close,local_close,error
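
Because `lean_connected_peers` is split by the `client` label, the overall peer count is the sum over all of its series. A minimal sketch of that aggregation over raw exposition text follows. The function name is hypothetical, and the regular expression assumes plain `name{labels} value` lines without timestamps.

```python
import re

# Matches 'lean_connected_peers 3' or 'lean_connected_peers{client="zeam"} 3'.
_PEERS_LINE = re.compile(r"^lean_connected_peers(?:\{[^}]*\})?\s+(\S+)")


def total_connected_peers(exposition: str) -> float:
    """Sum lean_connected_peers over every client label in the exposition text."""
    total = 0.0
    for line in exposition.splitlines():
        match = _PEERS_LINE.match(line)
        if match:
            total += float(match.group(1))
    return total
```

In a Prometheus or Grafana setup the same aggregation would normally be expressed in a query rather than in client code.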

Custom Metrics (non-leanMetrics)

The metrics below are not part of the leanMetrics specification. They provide ethlambda-specific observability into on-wire message sizes and post-quantum aggregated proof sizes.

PQ Signature Sizes

Name Type Usage Sample collection event Labels Buckets
lean_aggregated_proof_size_bytes Histogram Bytes size of an aggregated signature proof's proof_data field On aggregated signature production 1024, 4096, 16384, 65536, 131072, 262144, 524288, 1048576

Network Sizes

Name Type Usage Sample collection event Labels Buckets
lean_gossip_block_size_bytes Histogram Bytes size of a gossip block message (raw SSZ or snappy on-wire) On gossip block send/receive compression=raw,snappy 10000, 50000, 100000, 250000, 500000, 1000000, 2000000, 5000000
lean_gossip_attestation_size_bytes Histogram Bytes size of a gossip attestation message (raw SSZ or snappy on-wire) On gossip attestation send/receive compression=raw,snappy 512, 1024, 2048, 4096, 8192, 16384
lean_gossip_aggregation_size_bytes Histogram Bytes size of a gossip aggregated attestation message (raw SSZ or snappy on-wire) On gossip aggregation send/receive compression=raw,snappy 1024, 4096, 16384, 65536, 131072, 262144, 524288, 1048576
lean_reqresp_request_size_bytes Histogram Bytes size of a req/resp request (raw SSZ or snappy on-wire) On req/resp request send/receive protocol=status,blocks_by_root; compression=raw,snappy 64, 128, 256, 512, 1024, 4096, 16384, 65536
lean_reqresp_response_chunk_size_bytes Histogram Bytes size of a single req/resp response chunk (raw SSZ or snappy on-wire) On req/resp response chunk send/receive protocol=status,blocks_by_root; compression=raw,snappy 128, 1024, 10000, 100000, 500000, 1000000, 5000000, 10000000
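
Each size histogram also exposes the standard Prometheus `_sum` and `_count` series, so average message size and an approximate compression ratio can be derived per metric. The sketch below shows the arithmetic. It assumes that the `compression=raw` and `compression=snappy` series observe the same set of messages, which this document does not confirm, and both function names are hypothetical.

```python
from typing import Optional


def mean_size(bytes_sum: float, count: float) -> Optional[float]:
    """Average observed size in bytes, from a histogram's _sum and _count values."""
    return bytes_sum / count if count > 0 else None


def compression_ratio(raw_sum: float, snappy_sum: float) -> Optional[float]:
    """Compressed bytes per raw byte; a value below 1.0 means snappy saved space.

    Assumption: both sums cover the same messages.
    """
    return snappy_sum / raw_sum if raw_sum > 0 else None
```

Both helpers return `None` rather than dividing by zero when nothing has been observed yet.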

Storage

Name Type Usage Sample collection event Labels
lean_table_bytes Gauge Estimated byte size of a storage table (key + value bytes) After each processed block (one update per table); retains its previous value on empty slots table=<table_name>

Attestation Aggregate Coverage

Observability into how many validators/subnets are covered by the attestations the node has aggregated, broken down by pipeline section (the section label). The slot is the X-axis. These are sampled roughly once per slot, but emission is gated by the section's source data, so a gauge can retain its previous value:

  • timely, late, block, combined and the diff_validators directions are emitted on block import, and only when the canonical head block carries that round's votes (otherwise the round is skipped and prior values are kept).
  • agg_start_new is emitted at interval 2, right before fork-choice aggregation runs.
  • proposal_combined is emitted only when this node proposes a block.

Name Type Usage Sample collection event Labels
lean_attestation_aggregate_coverage_validators Gauge Validator coverage in attestation aggregate reports Per round, per section (see note above) section=timely,late,block,combined,agg_start_new,proposal_combined; subnet=combined (per-subnet breakdown reserved, not yet populated)
lean_attestation_aggregate_coverage_subnets Gauge Number of covered subnets in attestation aggregate reports Per round, per section (see note above) section=timely,late,block,combined,agg_start_new,proposal_combined
lean_attestation_aggregate_coverage_diff_validators Gauge Validators in the symmetric difference between block-included aggregates and locally-aggregated timely aggregates for the same slot On block import, when the head carries the round's votes (see note above) direction=block_only,timely_only
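
To make the two `direction` values concrete, the sketch below splits a symmetric difference into its two halves. It assumes that validators are identified by integer indices and that each gauge reports a count. Both are assumptions for illustration, and `coverage_diff` is a hypothetical name.

```python
def coverage_diff(block_validators: set, timely_validators: set) -> dict:
    """Count validators seen on only one side, keyed by the direction label."""
    return {
        # In block-included aggregates but not in local timely aggregates.
        "block_only": len(block_validators - timely_validators),
        # In local timely aggregates but not in block-included aggregates.
        "timely_only": len(timely_validators - block_validators),
    }


# Hypothetical validator indices for one slot.
diff = coverage_diff({1, 2, 3, 5}, {2, 3, 4})
```

The two counts added together give the size of the full symmetric difference.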

✅(*) Partial support: These metrics are implemented but not collected "on scrape" as the spec requires. They are updated on specific events (e.g., on tick, on block processing) rather than being computed fresh on each Prometheus scrape.
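
The practical difference between the two collection styles is that an event-driven gauge can lag behind the node's state until the next event fires. The sketch below models both styles in plain Python. The class names are hypothetical and do not reflect how the node is actually implemented.

```python
from typing import Callable


class EventDrivenGauge:
    """Stores the last value pushed by an event handler (for example, on tick)."""

    def __init__(self) -> None:
        self._value = 0.0

    def set(self, value: float) -> None:
        self._value = value

    def collect(self) -> float:
        return self._value


class OnScrapeGauge:
    """Computes its value at collection time by calling a function."""

    def __init__(self, compute: Callable[[], float]) -> None:
        self._compute = compute

    def collect(self) -> float:
        return self._compute()


peers = ["a", "b"]
event_gauge = EventDrivenGauge()
scrape_gauge = OnScrapeGauge(lambda: float(len(peers)))

event_gauge.set(float(len(peers)))  # an event updates the stored value
peers.append("c")                   # state changes, but no event has fired yet
```

After the last line, `event_gauge.collect()` still reports the old count while `scrape_gauge.collect()` reflects the new one.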

Troubleshooting

Docker Desktop on macOS

lean-quickstart runs its Docker containers in host network mode, which does not work out of the box on macOS. To work around this, enable the "Enable host networking" option in Docker Desktop settings under Resources > Network.