PatrickKoss/database-internals-visualized

Database Concepts Visualizations

License: MIT · TypeScript · React · Vite · Tailwind CSS · PRs Welcome

37 interactive visualizations for teaching database internals, with step-through animations covering indexing, encoding, replication, partitioning, transactions, and consensus algorithms.

Getting Started · Topics · Contributing · License

[Image: Home Page]

Features

  • Frame-by-frame animations rendered on HTML5 Canvas with play/pause/step/speed controls
  • Configurable parameters to experiment with different scenarios and edge cases
  • Descriptive annotations explaining what's happening at each step of the algorithm
  • Keyboard shortcuts for fast navigation through animation frames
  • 6 major topics with 37 visualizations covering core database concepts
  • Pure logic engines separated from rendering, making algorithms easy to follow and test

Topics

Indexing

How databases organize data on disk for efficient reads and writes.

| Visualization | Description |
|---|---|
| B-Tree | Watch nodes split and keys get promoted as insertions fill leaf nodes |
| LSM Tree | Writes flow into a memtable, flush to sorted SSTables, and trigger compaction |
| Bitcask | Append-only log with an in-memory hash index for fast key lookups |
| Hash Map | Simple hash-based indexing with collision handling |
| Skip List | Probabilistic layered linked list for ordered key access |
| SSTable | Sorted String Tables with sparse index and block-based reads |
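
To make the Bitcask row concrete, here is a minimal sketch of the idea: an append-only log plus an in-memory hash index. It is an illustration only, not the repository's engine. The names BitcaskSketch and LogEntry are hypothetical, and an array stands in for the on-disk log file.

```typescript
// Hypothetical types for illustration; not taken from the repository.
interface LogEntry {
  key: string;
  value: string;
}

class BitcaskSketch {
  // Stands in for the append-only data file.
  private log: LogEntry[] = [];
  // In-memory hash index: key -> offset of the most recent record for that key.
  private index = new Map<string, number>();

  put(key: string, value: string): void {
    this.log.push({ key, value }); // always append, never overwrite in place
    this.index.set(key, this.log.length - 1); // repoint the index at the newest record
  }

  get(key: string): string | undefined {
    const offset = this.index.get(key);
    return offset === undefined ? undefined : this.log[offset]?.value;
  }

  logLength(): number {
    return this.log.length;
  }
}

const store = new BitcaskSketch();
store.put("a", "1");
store.put("b", "2");
store.put("a", "3"); // the old record for "a" stays in the log until compaction
console.log(store.get("a"), store.logLength()); // 3 3
```

Stale records accumulate in the log, which is why real Bitcask-style stores periodically compact it; the sketch leaves compaction out.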

[Images: B-Tree Index, LSM Tree]

Encoding

How data is serialized into bytes across different formats.

| Visualization | Description |
|---|---|
| JSON | Human-readable format with field names repeated per record |
| Protobuf | Compact binary encoding with field tags and varint compression |
| Avro | Schema-based encoding with no field tags in the data |
| MessagePack | Binary JSON-like format with less overhead than JSON |
| Thrift | Binary protocol originally developed at Facebook, using numeric field IDs |
| XML | Verbose markup-based serialization |
| Comparison | Side-by-side overhead vs. data breakdown across all formats |
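
The varint compression mentioned in the Protobuf row can be sketched as follows. This is a standalone illustration of base-128 varints for non-negative integers, not code from the repository. The function names encodeVarint and decodeVarint are hypothetical.

```typescript
// Encode a non-negative integer as a base-128 varint:
// 7 payload bits per byte, least significant group first,
// with the high bit set on every byte except the last.
function encodeVarint(value: number): number[] {
  if (!Number.isSafeInteger(value) || value < 0) {
    throw new RangeError("expected a non-negative safe integer");
  }
  const bytes: number[] = [];
  let rest = value;
  while (rest >= 0x80) {
    bytes.push((rest % 0x80) | 0x80); // low 7 bits plus continuation bit
    rest = Math.floor(rest / 0x80);
  }
  bytes.push(rest); // final byte has the continuation bit clear
  return bytes;
}

// Decode a single varint from the start of a byte array.
function decodeVarint(bytes: number[]): number {
  let result = 0;
  let factor = 1;
  for (const byte of bytes) {
    result += (byte & 0x7f) * factor;
    if ((byte & 0x80) === 0) return result; // last byte reached
    factor *= 0x80;
  }
  throw new Error("truncated varint");
}

console.log(encodeVarint(300)); // [172, 2], i.e. 0xAC 0x02
console.log(decodeVarint([0xac, 0x02])); // 300
```

Small values fit in one byte, which is where much of the size advantage over fixed-width integers comes from.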

[Image: Encoding Comparison]

Replication

How data is copied across multiple nodes for fault tolerance and scalability.

| Visualization | Description |
|---|---|
| Single Leader | All writes go through one leader, replicated to followers |
| Multi Leader | Multiple nodes accept writes with conflict resolution |
| Leaderless | Quorum-based reads and writes across all replicas |
| Chain Replication | Writes propagate head-to-tail for strong consistency |
| CRDTs | Conflict-free replicated data types that merge automatically |
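
As one example of a data type that "merges automatically", here is a sketch of a grow-only counter (G-Counter), one of the simplest CRDTs. It is an illustration under assumptions and does not mirror the repository's CRDT engine. The names GCounter, incrementCounter, mergeCounters, and counterValue are hypothetical.

```typescript
// Each replica tracks its own increment count; the counter's value is the sum.
type GCounter = Record<string, number>;

function incrementCounter(counter: GCounter, nodeId: string, by = 1): GCounter {
  return { ...counter, [nodeId]: (counter[nodeId] ?? 0) + by };
}

// Merge takes the per-node maximum. It is commutative, associative, and
// idempotent, so replicas converge regardless of merge order or repetition.
function mergeCounters(a: GCounter, b: GCounter): GCounter {
  const merged: GCounter = { ...a };
  for (const [node, count] of Object.entries(b)) {
    merged[node] = Math.max(merged[node] ?? 0, count);
  }
  return merged;
}

function counterValue(counter: GCounter): number {
  return Object.values(counter).reduce((sum, n) => sum + n, 0);
}

// Two replicas accept increments independently, then exchange state.
const replicaA = incrementCounter(incrementCounter({}, "a"), "a");
const replicaB = incrementCounter({}, "b");
console.log(counterValue(mergeCounters(replicaA, replicaB))); // 3
```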

[Image: Single-Leader Replication]

Partitioning

How data is distributed across multiple nodes for horizontal scaling.

| Visualization | Description |
|---|---|
| Hash-Based | Keys distributed by hash value across fixed partitions |
| Key Range | Contiguous key ranges assigned to different nodes |
| Consistent Hashing | Ring-based distribution with minimal redistribution on changes |
| Hot Partitions | Visualize skewed access patterns and their effects |
| Rebalancing | Watch partitions move between nodes during cluster changes |
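
The "minimal redistribution" property of consistent hashing can be sketched with a small hash ring. This is an illustration, not the repository's implementation. The HashRing class, the fnv1a helper (an FNV-1a-style hash over UTF-16 code units), and the default of 16 virtual nodes per node are all assumptions made for the example.

```typescript
// FNV-1a-style 32-bit hash over UTF-16 code units; chosen only for brevity.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash >>> 0;
}

interface RingEntry {
  position: number;
  node: string;
}

class HashRing {
  private ring: RingEntry[] = [];
  private vnodes: number;

  constructor(vnodes = 16) {
    this.vnodes = vnodes; // virtual nodes per physical node, to smooth the distribution
  }

  addNode(node: string): void {
    for (let i = 0; i < this.vnodes; i++) {
      this.ring.push({ position: fnv1a(`${node}#${i}`), node });
    }
    this.ring.sort((a, b) => a.position - b.position);
  }

  removeNode(node: string): void {
    this.ring = this.ring.filter((entry) => entry.node !== node);
  }

  // A key belongs to the first ring entry at or after its hash, wrapping around.
  lookup(key: string): string | undefined {
    if (this.ring.length === 0) return undefined;
    const h = fnv1a(key);
    const owner = this.ring.find((entry) => entry.position >= h) ?? this.ring[0];
    return owner?.node;
  }
}

const exampleRing = new HashRing();
exampleRing.addNode("node-a");
exampleRing.addNode("node-b");
console.log(exampleRing.lookup("user:42"));
```

When a node joins, the only keys that change owner are the ones the new node takes over; every other key stays where it was. A production ring would use binary search rather than a linear scan for lookups.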

[Image: Consistent Hashing]

Transactions

Isolation levels and anomalies in concurrent database operations.

| Visualization | Description |
|---|---|
| MVCC | Multi-version concurrency control with snapshot reads |
| Dirty Read | Reading uncommitted data from another transaction |
| Non-Repeatable Read | Same query returns different results within a transaction |
| Phantom Read | New rows appear between repeated range queries |
| Write Skew | Concurrent transactions invalidate each other's assumptions |
| Lost Update | Two transactions overwrite each other's changes |
| Deadlock Detection | Cycle detection in wait-for graphs |
| Isolation Levels | Compare Read Uncommitted through Serializable side-by-side |
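
Cycle detection in a wait-for graph, as named in the Deadlock Detection row, can be sketched with a depth-first search. This is a generic illustration rather than the repository's engine. The WaitForGraph type and findDeadlock function are hypothetical names; an edge T1 -> T2 means transaction T1 is waiting for a lock held by T2.

```typescript
type WaitForGraph = Map<string, string[]>;

// Returns the transactions forming a cycle, or null if the graph is acyclic.
function findDeadlock(graph: WaitForGraph): string[] | null {
  const state = new Map<string, "visiting" | "done">();
  const path: string[] = []; // current DFS path, used to report the cycle

  const visit = (txn: string): string[] | null => {
    state.set(txn, "visiting");
    path.push(txn);
    for (const next of graph.get(txn) ?? []) {
      if (state.get(next) === "visiting") {
        // Back edge: next is already on the current path, so we found a cycle.
        return path.slice(path.indexOf(next));
      }
      if (!state.has(next)) {
        const cycle = visit(next);
        if (cycle) return cycle;
      }
    }
    path.pop();
    state.set(txn, "done");
    return null;
  };

  for (const txn of Array.from(graph.keys())) {
    if (!state.has(txn)) {
      const cycle = visit(txn);
      if (cycle) return cycle;
    }
  }
  return null;
}

const waits: WaitForGraph = new Map<string, string[]>([
  ["T1", ["T2"]],
  ["T2", ["T1"]],
  ["T3", ["T1"]],
]);
console.log(findDeadlock(waits)); // ["T1", "T2"]
```

Once a cycle is found, a database typically aborts one transaction in it as the victim; victim selection is outside this sketch.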

[Image: MVCC]

Consensus

How distributed nodes agree on values, ordering, and consistency.

| Visualization | Description |
|---|---|
| Raft | Leader election, log replication, and term-based consensus |
| Paxos | Prepare/accept phases for single-value agreement |
| Total Order Broadcast | Reliable delivery with global message ordering |
| Linearizability | Visualize real-time ordering constraints on operations |
| 2PC / 3PC | Two-phase and three-phase commit protocols |
| Logical Clocks | Lamport and vector clocks for causal ordering |
| PBFT | Byzantine fault tolerance with pre-prepare/prepare/commit phases |
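
To illustrate the Logical Clocks row, here is a sketch of vector clocks and how they distinguish causally ordered events from concurrent ones. It covers vector clocks only, not Lamport clocks, and is not taken from the repository. The names VectorClock, tickClock, receiveClock, and compareClocks are hypothetical.

```typescript
type VectorClock = Record<string, number>;
type ClockOrder = "before" | "after" | "equal" | "concurrent";

// A local event increments the node's own entry.
function tickClock(clock: VectorClock, node: string): VectorClock {
  return { ...clock, [node]: (clock[node] ?? 0) + 1 };
}

// On receiving a message: take the element-wise maximum, then tick locally.
function receiveClock(local: VectorClock, remote: VectorClock, node: string): VectorClock {
  const merged: VectorClock = { ...local };
  for (const [n, count] of Object.entries(remote)) {
    merged[n] = Math.max(merged[n] ?? 0, count);
  }
  return tickClock(merged, node);
}

// a is "before" b if every entry of a is <= b's and at least one is smaller.
function compareClocks(a: VectorClock, b: VectorClock): ClockOrder {
  let aBehind = false;
  let bBehind = false;
  const nodes = Array.from(new Set([...Object.keys(a), ...Object.keys(b)]));
  for (const node of nodes) {
    const av = a[node] ?? 0;
    const bv = b[node] ?? 0;
    if (av < bv) aBehind = true;
    if (bv < av) bBehind = true;
  }
  if (aBehind && bBehind) return "concurrent";
  if (aBehind) return "before";
  if (bBehind) return "after";
  return "equal";
}

const eventA = tickClock({}, "A"); // { A: 1 }
const eventB = tickClock({}, "B"); // { B: 1 }
const eventC = receiveClock(eventB, eventA, "B"); // { A: 1, B: 2 }
console.log(compareClocks(eventA, eventB)); // concurrent
console.log(compareClocks(eventA, eventC)); // before
```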

[Image: Raft Consensus]

Getting Started

```shell
git clone https://github.com/PatrickKoss/lecture-database-concepts-visualizations.git
cd lecture-database-concepts-visualizations
make install   # Install dependencies
make dev       # Start development server at http://localhost:5173
```

Available Commands

| Command | Description |
|---|---|
| make dev | Start Vite dev server with HMR |
| make build | TypeScript check + production build to dist/ |
| make lint | Run ESLint |
| make test | Run Vitest test suite (39 test files) |
| make fmt | Format code with Prettier |
| make install | Install npm dependencies |
| make clean | Remove dist, node_modules, coverage |

Project Structure

src/
  visualizations/        # All visualization engines and renderers
    indexing/            # B-Tree, Bitcask, HashMap, LSM Tree, Skip List, SSTable
    encoding/            # JSON, Protobuf, XML, Avro, MessagePack, Thrift
    replication/         # Single-leader, Multi-leader, Leaderless, Chain, CRDTs
    partitioning/        # Hash-based, Key-range, Consistent Hashing, Rebalancing
    transactions/        # Isolation anomalies, MVCC, Deadlock Detection
    consensus/           # Raft, Paxos, Total Order Broadcast, 2PC/3PC, PBFT
  components/            # Shared UI (sidebar, animation controls, config panel)
  hooks/                 # useAnimationController, useCanvas, useKeyboardShortcuts
  lib/                   # Canvas drawing utilities and color palette
  routes/                # TanStack Router file-based routes

Each visualization follows the Engine + Renderer pattern:

  • engine.ts contains pure logic and exports generateFrames(config) producing frame snapshots
  • <Name>Visualization.tsx wires the engine to React with Canvas rendering and playback controls
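
A minimal sketch of what an engine in this pattern might look like is shown below. Only the generateFrames(config) name comes from this README. The SortedInsertConfig and Frame shapes, and the toy "sorted insert" algorithm, are assumptions made for illustration; the repository's actual types may differ. In a real engine.ts the function would be exported.

```typescript
// Hypothetical config and frame shapes for a toy "sorted insert" visualization.
interface SortedInsertConfig {
  keys: number[];
}

interface Frame {
  step: number;
  keys: number[]; // snapshot of the state at this step
  highlight: number | null; // index the renderer should emphasize, if any
  description: string; // annotation shown alongside the frame
}

// Pure function: no DOM, no Canvas, no React. It only turns a config into a
// list of immutable snapshots that a renderer can play back frame by frame.
function generateFrames(config: SortedInsertConfig): Frame[] {
  const sorted: number[] = [];
  const frames: Frame[] = [
    { step: 0, keys: [], highlight: null, description: "Start with an empty list" },
  ];
  for (const key of config.keys) {
    let pos = sorted.findIndex((existing) => existing > key);
    if (pos === -1) pos = sorted.length; // key is the largest so far: append
    sorted.splice(pos, 0, key);
    frames.push({
      step: frames.length,
      keys: [...sorted], // copy, so later inserts cannot change earlier frames
      highlight: pos,
      description: `Insert ${key} at position ${pos}`,
    });
  }
  return frames;
}

const demoFrames = generateFrames({ keys: [3, 1, 2] });
console.log(demoFrames.map((frame) => frame.description));
```

Because the engine is a pure function of its config, it can be unit tested without a browser, and the playback controls only need to move an index through the returned array.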

Contributing

Contributions are welcome. To add a new visualization:

  1. Create src/visualizations/<topic>/<name>/engine.ts with types and generateFrames() function
  2. Create src/visualizations/<topic>/<name>/<Name>Visualization.tsx following the Engine + Renderer pattern
  3. Add a tab entry in the corresponding src/routes/<topic>.tsx route file
  4. Run make lint && make test to verify everything passes

See the Project Structure section above for how visualizations are organized.

Acknowledgments

Inspired by Designing Data-Intensive Applications by Martin Kleppmann. The visualizations in this project aim to bring the concepts from that book to life through interactive, step-through animations.

License

Distributed under the MIT License. See LICENSE for details.
