This document merges the current repository state with the idealized goals and becomes the single source of truth. Items are marked as:
- ✅ Done
- 🚧 Partial / scaffolding present
- ⭕ Not started
- ✅ A1. Compile-time graph semantics (`directed`, `weighted`) across storages
- ✅ A2. Self-loops & multi-edges covered by tests
- ⭕ A3. Attributes (node/edge KV with typed values)
- ✅ A4. Numeric node IDs (u64/usize) in use
- ⭕ A5. Storage-agnostic iterators (common trait iterators)
- ✅ B1. AdjacencyList: CRUD + neighbor ops + tests
- ✅ B2. AdjacencyMatrix: CRUD + present bitset + tests
- ✅ B3. IncidenceMatrix: rows, sign/weight semantics, edge_set, parallel builder, tests
- ⭕ B4. CSR / ReverseCSR indices (standalone, optional overlays)
- ⭕ B5. Attribute storage (typed, columnar, interning)
- ✅ C1. `GraphStorage(type, directed, weighted)` factory
- ✅ C2. Core ops: init/deinit/add/remove/has/getNeighbors
- ⭕ C3. Attribute API (node/edge getters/setters)
- ⭕ C4. Iteration policy & attribute filters
- ⭕ D1–D7. Strategies + API + tests (`Duplicate`, `Streamed`, `Chunked`, `CopyOnWrite`)
- 🚧 E1. `.zgraph` text spec documented; runtime reader/writer pending
- 🚧 E2. `.zgraphb` binary spec documented; runtime reader/writer pending
- 🚧 F1. Traversal (BFS/DFS/CC) — source files exist; unify over trait; add tests
- 🚧 F2. Shortest paths (Dijkstra/Bellman-Ford/Floyd-Warshall) — wire and test
- 🚧 F3. Connectivity (SCC) — add Tarjan/Kosaraju; tests
- 🚧 F4. Flow (Edmonds-Karp present; add Dinic); tests
- 🚧 F5. Centrality (PageRank present; add Betweenness); tests
- 🚧 F6. Spectral (Laplacian, eigen routines present); tests & trait integration
- ⭕ G1. CSR build/use
- ⭕ G2. Reverse CSR (directed)
- ⭕ G3. Degree table
- ⭕ G4. Index persistence in `.zgraphb` optional blocks
- ⭕ G5. Rebuild-on-demand hooks
- ⭕ H1. Node & Edge KV (`int|float|bool|string`)
- ⭕ H2. Storage-agnostic attribute map with typed accessors
- ⭕ H3. Text↔Binary column mapping (string table)
- ⭕ H4. Attribute filters in iterators
- ⭕ H5. Tests for mixed types, missing values, interning
- ⭕ I1. CLI subcommands: `convert`, `validate`, `build-index`, `stats`
- ⭕ I2. Streaming I/O (chunked) for both formats
- ⭕ I3. zstd dictionaries & auto-tuning
- ✅ J0. IncidenceMatrix parallel builder + mutex guarding
- ⭕ J1. Threading doc & invariants
- ⭕ J2. Parallel builders for CSR & conversions where safe
- ⭕ J3. Allocator strategy knobs (GPA/Arena/Page) + docs
- ⭕ J4. Zero-copy `.zgraphb` readers (mmap)
- ⭕ J5. Stress tests (OOM behavior)
- ✅ K1. Unit tests for storages
- ⭕ K2. Property tests (round-trips, cross-storage equivalence)
- ⭕ K3. Fuzzers for text/binary parsers
- ⭕ K4. Benchmarks (micro/macro) and tracked results
- ⭕ K5. CI (Win/macOS/Linux; Zig 0.16.x)
- ⭕ K6. `zig fmt` + static checks gates
- ✅ L1. README
- 🚧 L2. Detailed specs for `.zgraph` / `.zgraphb` (needs sync with runtime)
- ⭕ L3. Algorithm docs and complexity notes
- ⭕ L4. Conversion strategy doc (memory math & decision table)
- ⭕ L5. Perf tuning guide (allocators, cache, zstd params)
- ⭕ L6. Roadmap milestones mapping to this checklist
- Define common trait surface (`GraphLike`) with adapters over existing storages: `neighbors(u)`, `hasEdge(u,v)`, `weight(u,v)?`, `nodeCount()`, `edgeCount()`
- Iterator policy: stable order + attribute filter hooks; basic `NodeIter`, `EdgeFromIter`
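
As an illustration of the trait surface listed above, here is a minimal, language-neutral sketch in Python rather than the project's Zig. Only the five method names come from the checklist; the signatures, the `AdjacencyListAdapter` class, the dict-of-dicts layout, and the `out_degree` helper are assumptions made for the example, not the library's API.

```python
from typing import Dict, Iterator, Optional, Protocol


class GraphLike(Protocol):
    """Read-only surface named in the checklist; signatures are assumed."""

    def neighbors(self, u: int) -> Iterator[int]: ...
    def hasEdge(self, u: int, v: int) -> bool: ...
    def weight(self, u: int, v: int) -> Optional[float]: ...
    def nodeCount(self) -> int: ...
    def edgeCount(self) -> int: ...


class AdjacencyListAdapter:
    """Hypothetical adapter over a directed dict-of-dicts adjacency list.

    Assumes every node appears as a key, even when it has no out-edges.
    """

    def __init__(self, adj: Dict[int, Dict[int, float]]) -> None:
        self._adj = adj

    def neighbors(self, u: int) -> Iterator[int]:
        # Sorted so the iteration order is stable across runs.
        return iter(sorted(self._adj.get(u, {})))

    def hasEdge(self, u: int, v: int) -> bool:
        return v in self._adj.get(u, {})

    def weight(self, u: int, v: int) -> Optional[float]:
        # None models the optional result of `weight(u,v)?`.
        return self._adj.get(u, {}).get(v)

    def nodeCount(self) -> int:
        return len(self._adj)

    def edgeCount(self) -> int:
        return sum(len(targets) for targets in self._adj.values())


def out_degree(g: GraphLike, u: int) -> int:
    """Storage-agnostic helper: relies only on the GraphLike surface."""
    return sum(1 for _ in g.neighbors(u))
```

Code written against `GraphLike`, such as `out_degree`, would then run unchanged over any storage that ships an adapter, which is the point of the milestone.
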
- Typed attribute store (node & edge): `int|float|bool|string` + string interning
- Attribute API on the trait surface + tests
- Iterator filters using attributes
- `.zgraph` reader/writer (streaming CSV-ish with schemas; tolerant of unknown sections); round-trip tests
- `.zgraphb` reader/writer (chunked blocks + zstd + mmap-friendly); round-trip & parity tests
- Text↔Binary parity: `load(text) -> save(binary) -> load(binary) == load(text)`
- `convertStorage` API + Duplicate strategy
- Streamed strategy (destroy-as-you-go option; memory ceiling tests)
- Chunked strategy (node sharding; parallel)
- Copy-On-Write adapter (lazy migration) + background compaction
- Invariants & ceilings property tests
- CSR/ReverseCSR build & use as optional overlays for any storage
- Degree table and index persistence blocks in `.zgraphb`
- Rebuild-on-demand hooks
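
For orientation, a CSR overlay can be built from an edge list with a counting pass and a prefix sum. The sketch below is in Python rather than Zig, and the function names (`build_csr`, `csr_neighbors`, `reverse_csr`, `degree_table`) are hypothetical; it shows the general technique under the assumption of dense node IDs in `0..node_count`, not the layout this repository will adopt.

```python
from typing import List, Sequence, Tuple

Edge = Tuple[int, int]


def build_csr(node_count: int, edges: Sequence[Edge]) -> Tuple[List[int], List[int]]:
    """Return (offsets, targets) for a directed edge list."""
    offsets = [0] * (node_count + 1)
    for u, _ in edges:
        offsets[u + 1] += 1  # count out-degree of u
    for i in range(node_count):
        offsets[i + 1] += offsets[i]  # prefix sum turns counts into row starts

    targets = [0] * len(edges)
    cursor = offsets[:-1]  # slicing copies: next free slot per row
    for u, v in edges:
        targets[cursor[u]] = v
        cursor[u] += 1
    return offsets, targets


def csr_neighbors(offsets: List[int], targets: List[int], u: int) -> List[int]:
    """Out-neighbors of u are one contiguous slice of targets."""
    return targets[offsets[u]:offsets[u + 1]]


def reverse_csr(node_count: int, edges: Sequence[Edge]) -> Tuple[List[int], List[int]]:
    """Reverse CSR is the same build over flipped edges (in-neighbors)."""
    return build_csr(node_count, [(v, u) for u, v in edges])


def degree_table(offsets: List[int]) -> List[int]:
    """Degrees fall out of adjacent offset differences."""
    return [offsets[i + 1] - offsets[i] for i in range(len(offsets) - 1)]
```

Because the build reads only an edge stream, the same routine could serve as a rebuild-on-demand hook over any storage that can enumerate its edges.
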
- Traversal (BFS/DFS/CC) over trait + CSR; tests across storages
- Shortest paths (Dijkstra/BF/FW) with layout decision table; tests
- Connectivity/SCC (Tarjan/Kosaraju); tests
- Flow (finish Dinic); tests
- Centrality (PageRank + Betweenness); tests
- Spectral (laplacian, eigen); tests & perf
- CLI subcommands: `convert`, `validate`, `build-index`, `stats`
- Streaming I/O for both formats + zstd dictionaries
- Documentation: conversion strategies, perf tuning, thread model; keep specs synced
- Property tests, fuzzers, benches; CI on all platforms; formatting/static checks gates
- Formats: Round-trips preserve counts, attributes, weights; unknown sections/blocks ignored.
- Conversion: Peak memory measured below ceiling; equivalence of neighbor sets/weights post-convert.
- Indices: CSR neighbors == storage neighbors; persisted indices reload correctly.
- Algorithms: Identical results across storages (given same semantics); perf within expected bounds.
- CLI: Non-zero exit on invalid files; stats match library queries; build-index regenerates CSR.
- Docs: Examples compile; decision tables match implemented behavior.
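
The equivalence criteria above (same neighbor sets and weights after conversion, identical algorithm results across storages) can be phrased as a small check. The following Python sketch is illustrative only: the two toy storages, the helper names, and the choice of BFS as the compared algorithm are assumptions, and multi-edges are deliberately left out to keep it short.

```python
from collections import deque
from typing import Callable, Dict, List, Optional, Tuple

Edge = Tuple[int, int, float]
NeighborFn = Callable[[int], List[Tuple[int, float]]]


def as_list_storage(n: int, edges: List[Edge]) -> NeighborFn:
    """Toy adjacency list; neighbors come back sorted by node ID."""
    adj: Dict[int, Dict[int, float]] = {u: {} for u in range(n)}
    for u, v, w in edges:
        adj[u][v] = w
    return lambda u: sorted(adj[u].items())


def as_matrix_storage(n: int, edges: List[Edge]) -> NeighborFn:
    """Toy adjacency matrix; None marks an absent edge."""
    cells: List[List[Optional[float]]] = [[None] * n for _ in range(n)]
    for u, v, w in edges:
        cells[u][v] = w
    return lambda u: [(v, w) for v, w in enumerate(cells[u]) if w is not None]


def equivalent(n: int, a: NeighborFn, b: NeighborFn) -> bool:
    """Same neighbor sets and weights for every node."""
    return all(a(u) == b(u) for u in range(n))


def bfs_order(n: int, neighbors: NeighborFn, start: int) -> List[int]:
    """BFS visit order; deterministic because neighbor order is stable."""
    seen = [False] * n
    seen[start] = True
    order: List[int] = []
    queue = deque([start])
    while queue:
        u = queue.popleft()
        order.append(u)
        for v, _ in neighbors(u):
            if not seen[v]:
                seen[v] = True
                queue.append(v)
    return order
```

A property test would generate random edge lists and assert `equivalent(...)` plus equal `bfs_order(...)` for each pair of storages; the comparison is only meaningful when both storages expose the same stable iteration order.
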