@markoceri
Collaborator

Summary

This pull request upgrades the application's data management layer with a robust migration system. It replaces manual SQLite schema changes with SQLAlchemy ORM and Alembic migrations, so the database schema can evolve safely without losing existing data.

Key Drivers

Previously, changing any entity property required manual SQL intervention, which often led to data being wiped if the schema became incompatible. To support the long-term growth of the project and protect existing data, I have implemented an automated migration workflow that:

  • Preserves data integrity during schema changes
  • Provides version control for database structure
  • Enables safe rollback to previous states
  • Maintains clean separation between domain logic and persistence (DDD/Hexagonal Architecture)

Key Improvements

1. Data Preservation Through Alembic Migrations

I integrated Alembic to handle schema evolutions automatically. The system now:

  • Tracks all database schema changes through versioned migration files
  • Applies migrations incrementally, transforming the database to the new version while keeping all existing records intact
  • Detects pending migrations and applies only necessary changes
  • Enforces a migration-first approach: all schema changes must go through Alembic (no manual SQL)

Reference: See docs/ALEMBIC_MIGRATIONS.md for complete migration system documentation.
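
A minimal sketch of the "detect pending migrations" step, using only the standard library. It relies on one Alembic fact: the applied revision is stored in a table named `alembic_version`, in the column `version_num`. The helper names, the revision identifiers, and the assumption of a single linear history are illustrative; the PR itself presumably goes through Alembic's own API rather than raw queries.

```python
import sqlite3
from typing import List, Optional, Sequence


def current_revision(conn: sqlite3.Connection) -> Optional[str]:
    """Read the revision Alembic recorded, or None for an unversioned database."""
    table = conn.execute(
        "SELECT 1 FROM sqlite_master "
        "WHERE type = 'table' AND name = 'alembic_version'"
    ).fetchone()
    if table is None:
        return None
    row = conn.execute("SELECT version_num FROM alembic_version").fetchone()
    return row[0] if row else None


def pending_revisions(current: Optional[str], ordered: Sequence[str]) -> List[str]:
    """Return the revisions after `current` in a linear, oldest-first history.

    `ordered` is a hypothetical list of revision ids; a real history can branch.
    """
    if current is None:
        return list(ordered)
    return list(ordered[ordered.index(current) + 1:])
```

An empty result means the database is up to date, which is the case in which the PR skips both the backup and the upgrade.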

2. Automated Backups for Safety

As an extra layer of safety, I introduced an automatic backup routine that:

  • Creates a timestamped database backup before applying any migration
  • Uses format: dbname_backup_YYYYMMDD_HHMMSS.db
  • Only triggers when there are pending migrations (skips backup if database is up-to-date)
  • Can be enabled/disabled via BACKUP_BEFORE_MIGRATION environment variable
  • Currently supports SQLite databases
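
The naming scheme above can be sketched as a plain file copy. This is an illustration under stated assumptions, not the code in the PR: `backup_database` is a hypothetical helper, and the optional `now` parameter exists only to make the timestamp predictable.

```python
import shutil
from datetime import datetime
from pathlib import Path
from typing import Optional


def backup_database(db_file: Path, now: Optional[datetime] = None) -> Path:
    """Copy db_file next to itself as <stem>_backup_YYYYMMDD_HHMMSS<suffix>."""
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    target = db_file.with_name(f"{db_file.stem}_backup_{stamp}{db_file.suffix}")
    shutil.copy2(db_file, target)  # copies the contents and the file metadata
    return target
```

A plain copy is only safe while nothing is writing to the database. If the application may hold an open connection, `sqlite3.Connection.backup` from the standard library is a possible alternative.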

Configuration:

# .env file
BACKUP_BEFORE_MIGRATION=true  # Enable automatic backups (default: true)
RUN_MIGRATIONS_ON_STARTUP=true  # Run migrations on app startup (default: true)

Reference: See Startup Workflow section in the migration guide.
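
For illustration, this is one way boolean flags like these can be read with a default of true. The project loads them through its settings class, so `env_flag` and the set of accepted spellings below are assumptions, not the actual parsing rules.

```python
import os
from typing import Mapping

# Accepted spellings for "enabled"; this set is an assumption.
TRUE_VALUES = {"1", "true", "yes", "on"}


def env_flag(name: str, default: bool = True,
             env: Mapping[str, str] = os.environ) -> bool:
    """Return the boolean value of an environment flag, or default if unset."""
    raw = env.get(name)
    if raw is None or raw.strip() == "":
        return default
    return raw.strip().lower() in TRUE_VALUES
```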

3. Standardized Access with SQLAlchemy ORM

By using SQLAlchemy with imperative mapping, I have:

  • Moved to a robust Object-Relational Mapping (ORM) approach
  • Kept domain entities completely clean and independent of persistence details
  • Implemented a 4-phase conversion system for complex value objects:
    • Load phase: Convert database primitives to domain types (EntityId, enums, value objects)
    • Before insert/update: Flatten value objects to primitives or serialize to JSON
    • After insert/update: Restore domain types after persistence
  • Maintained strict adherence to DDD and Hexagonal Architecture principles
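
The conversion phases listed above can be sketched without SQLAlchemy as a pair of pure functions. In the PR the same work is done by event listeners; here `Miner`, `MinerStatus`, `HashRate`, `to_row`, and `from_row` are hypothetical names chosen for the example, and only `EntityId` is a name taken from the description.

```python
import json
from dataclasses import dataclass
from enum import Enum
from typing import Any, Dict


class MinerStatus(Enum):  # hypothetical enum
    ON = "on"
    OFF = "off"


@dataclass(frozen=True)
class EntityId:  # thin wrapper around the primary key (fields are assumed)
    value: str


@dataclass(frozen=True)
class HashRate:  # hypothetical value object
    value: float
    unit: str


@dataclass
class Miner:  # domain entity with no persistence imports
    id: EntityId
    status: MinerStatus
    hash_rate: HashRate


def to_row(miner: Miner) -> Dict[str, Any]:
    """Before insert/update: flatten domain types to primitives and JSON."""
    return {
        "id": miner.id.value,
        "status": miner.status.value,
        "hash_rate": json.dumps(
            {"value": miner.hash_rate.value, "unit": miner.hash_rate.unit}
        ),
    }


def from_row(row: Dict[str, Any]) -> Miner:
    """Load, and after insert/update: rebuild domain types from primitives."""
    return Miner(
        id=EntityId(row["id"]),
        status=MinerStatus(row["status"]),
        hash_rate=HashRate(**json.loads(row["hash_rate"])),
    )
```

Keeping both directions in the adapter layer is what lets the entity stay a plain dataclass: a round trip through `to_row` and `from_row` returns an equal `Miner`.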

Technical Details:

  • Domain entities remain pure @dataclass objects
  • SQLAlchemy mapping happens separately in adapter layer
  • Event listeners handle all type conversions transparently
  • Repositories work directly with domain entities

Reference: See DDD Principles section and Migration Example.

4. Future-Proofing and Developer Experience

This setup removes the "fear of updating" and enables:

  • Rapid feature development: Add or modify entity properties without data loss concerns
  • Safe experimentation: Rollback capability if migrations encounter issues
  • Team collaboration: Version-controlled migrations prevent conflicts
  • Multi-database support: Easy transition to PostgreSQL, MySQL, or other databases
  • CI/CD integration: Automated migration checks and testing

Developer Workflow:

# 1. Modify domain entity and table definition
# 2. Generate migration automatically
python scripts/migrate.py create "Add new field"

# 3. Review generated migration
cat alembic/versions/xxx_add_new_field.py

# 4. Apply migration (automatic on startup or manual)
python -m edge_mining  # Automatic
python scripts/migrate.py upgrade  # Manual

Reference: See Practical Example for complete step-by-step workflow.
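
Why an additive migration keeps existing records can be seen with the standard library alone. The sketch below assumes a `miners` table and a `temperature` column of type `REAL`, loosely following the example named in docs/MIGRATION_EXAMPLE.md; in an Alembic migration the same change would typically be written with `op.add_column` rather than raw SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE miners (id TEXT PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO miners VALUES ('m1', 'Miner 1')")

# Additive change: existing rows are kept, and the new column is NULL for them.
conn.execute("ALTER TABLE miners ADD COLUMN temperature REAL")

rows = conn.execute("SELECT id, name, temperature FROM miners").fetchall()
```

Non-additive changes (renames, NOT NULL constraints) need more care, which is why step 3 of the workflow asks for a review of the generated file.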

Key Updates

1. SQLAlchemy Implementation

Replaced raw SQLite queries with SQLAlchemy models:

  • Implemented imperative mapping for all domain entities
  • Created type-safe table definitions in adapter layer
  • Added custom SQLAlchemy types for enums (e.g., MinerStatusType, EnergySourceType)
  • Implemented event listeners for value object conversions

Files Changed:

  • edge_mining/adapters/domain/*/tables.py - Table definitions and mappings
  • edge_mining/adapters/domain/*/repositories.py - SQLAlchemy repositories
  • edge_mining/adapters/infrastructure/persistence/sqlalchemy/ - Infrastructure layer

2. Alembic Initialization

Initialized Alembic for version control of database schema:

  • Configured alembic.ini with proper paths and settings
  • Created alembic/env.py that integrates with SQLAlchemy metadata
  • Configured migration template (script.py.mako)
  • Generated initial migration with all existing tables

3. Pre-Migration Backup Hook

Added automatic database backup before migrations:

  • Implemented in BaseSQLAlchemyRepository.initialize_database()
  • Creates timestamped backups only when migrations are pending
  • Integrated with startup workflow for zero-configuration safety
  • Configurable via environment variables
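
The decision logic described for the backup hook can be summarized as a small pure function. This is a sketch, not the body of `initialize_database()`: `plan_startup` and its parameters are hypothetical, and it assumes that no backup is taken when startup migrations are disabled.

```python
from typing import List


def plan_startup(run_migrations: bool, backup_enabled: bool,
                 has_pending: bool, is_sqlite: bool) -> List[str]:
    """Return the ordered actions for one application start."""
    if not run_migrations or not has_pending:
        return []  # disabled, or database already up to date: nothing to do
    steps: List[str] = []
    if backup_enabled and is_sqlite:  # backups currently cover SQLite only
        steps.append("backup")
    steps.append("upgrade")
    return steps
```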

4. Automatic Migration on Startup

Updated deployment and startup scripts:

  • Modified bootstrap.py to use initialize_database() method
  • Added configuration options for migration behavior
  • Implemented smart migration detection (skips if up-to-date)
  • Created CLI tool scripts/migrate.py for manual migration management

CLI Commands:

# Check migration status
python scripts/migrate.py status

# View migration history
python scripts/migrate.py history

# Apply migrations manually
python scripts/migrate.py upgrade

# Roll back migrations
python scripts/migrate.py downgrade [n]

# Create new migration
python scripts/migrate.py create "Description"
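
A command-line surface like the one above could be declared with `argparse`. This is a hypothetical outline, not the contents of scripts/migrate.py; in particular, defaulting `n` to 1 for `downgrade` is an assumption.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the five subcommands shown above; handlers are left out."""
    parser = argparse.ArgumentParser(prog="migrate.py")
    sub = parser.add_subparsers(dest="command", required=True)
    sub.add_parser("status", help="show the current revision and pending migrations")
    sub.add_parser("history", help="list known migrations")
    sub.add_parser("upgrade", help="apply pending migrations")
    down = sub.add_parser("downgrade", help="roll back n migrations")
    down.add_argument("n", nargs="?", type=int, default=1)  # default is assumed
    create = sub.add_parser("create", help="generate a new migration")
    create.add_argument("message", help="short description of the change")
    return parser
```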

Testing

Comprehensive test coverage added:

Unit Tests (42 tests)

  • tests/unit/adapters/domain/energy/test_tables_event_listeners.py
    • Event listener behavior for all conversion phases
    • Value object serialization/deserialization
    • EntityId and enum conversions

Integration Tests (38 tests)

  • tests/integration/adapters/persistence/test_sqlalchemy_energy_repositories.py (21 tests)
    • Full CRUD operations with real database
    • Complex queries and relationships
  • tests/integration/adapters/persistence/test_alembic_migrations.py (9 tests)
    • Migration system validation
    • Upgrade/downgrade workflows
    • Backup creation
  • tests/integration/adapters/persistence/test_e2e_persistence.py (8 tests)
    • End-to-end persistence scenarios
    • Multi-entity workflows

Run tests:

# All tests
pytest

# Only migration tests
pytest tests/integration/adapters/persistence/test_alembic_migrations.py

# With coverage
pytest --cov=edge_mining --cov-report=html

Documentation

Complete documentation has been added:

  1. docs/ALEMBIC_MIGRATIONS.md
    • Complete guide to the migration system
    • Configuration options and environment variables
    • Startup workflow and automatic migration process
    • CLI commands and manual migration management
    • Best practices for development and production
    • Troubleshooting common issues
    • CI/CD integration examples
  2. docs/MIGRATION_EXAMPLE.md
    • Practical step-by-step example: adding a "temperature" field to miners
    • Domain entity modification
    • Table definition updates
    • Migration generation and verification
    • Testing the changes
    • Rollback procedures
    • Complex migration scenarios (rename, NOT NULL, indexes)

Configuration

Environment Variables

Add to your .env file:

# Database configuration
DB_PATH=sqlite:///edgemining.db
# For PostgreSQL: DB_PATH=postgresql://user:pass@localhost/dbname
# For MySQL: DB_PATH=mysql+pymysql://user:pass@localhost/dbname

# Migration settings
RUN_MIGRATIONS_ON_STARTUP=true  # Automatic migrations on startup
BACKUP_BEFORE_MIGRATION=true    # Create backup before migrations (SQLite only)

# Persistence adapter
PERSISTENCE_ADAPTER=sqlalchemy
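
Because the backup option applies to SQLite only, code that honors it has to tell a file-based SQLite URL apart from the PostgreSQL and MySQL forms shown above. A minimal sketch, assuming URLs use the plain `sqlite:///` prefix (driver-qualified forms such as `sqlite+pysqlite:///` are not handled); `sqlite_file` is a hypothetical helper.

```python
from pathlib import Path
from typing import Optional

SQLITE_PREFIX = "sqlite:///"


def sqlite_file(db_url: str) -> Optional[Path]:
    """Return the database file for a file-based SQLite URL, else None."""
    if not db_url.startswith(SQLITE_PREFIX):
        return None  # another backend, or the in-memory form "sqlite://"
    path = db_url[len(SQLITE_PREFIX):]
    if path in ("", ":memory:"):
        return None  # in-memory database: there is no file to back up
    return Path(path)
```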

Settings

Configuration in edge_mining/shared/settings/settings.py:

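# Import shown for completeness; assumes pydantic-settings (Pydantic v2).
# With Pydantic v1 the equivalent is `from pydantic import BaseSettings`.
from pydantic_settings import BaseSettings

class AppSettings(BaseSettings):
    persistence_adapter: str = "sqlalchemy"
    db_path: str = "sqlite:///edgemining.db"
    run_migrations_on_startup: bool = True
    backup_before_migration: bool = True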

Migration from Previous Version

For existing installations:

Automatic Migration (Recommended):

  1. Back up your current database manually (just in case)
  2. Pull the latest code
  3. Start the application:
    python -m edge_mining

The system will automatically:

  • ✅ Detect your existing database structure
  • ✅ Generate initial migration if not present
  • ✅ Create automatic backup before any changes
  • ✅ Apply only necessary migrations
  • ✅ Preserve all existing data

Note: If your database is already up-to-date, no migrations will be applied and no backup will be created.

Manual Migration (Development Only):

If you need to manage migrations manually for development purposes:

  1. Generate initial migration (if not present):
    alembic revision --autogenerate -m "Initial schema with all tables"
  2. Apply migrations manually:
    python scripts/migrate.py upgrade

For production use, always rely on automatic migrations that run on application startup.

Rollback Plan

If issues arise after deployment:

# Check current migration status
python scripts/migrate.py status

# Roll back to the previous version
python scripts/migrate.py downgrade 1

# Or roll back to a specific revision
alembic downgrade <revision_id>

# Restore from backup (if needed)
cp edgemining_backup_YYYYMMDD_HHMMSS.db edgemining.db
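
When several backups exist, the timestamp format makes the newest one easy to find, because `YYYYMMDD_HHMMSS` sorts chronologically when sorted as text. A minimal sketch; `latest_backup` is a hypothetical helper and the `edgemining` stem is taken from the default database name above.

```python
from pathlib import Path
from typing import Optional


def latest_backup(directory: Path, db_stem: str = "edgemining") -> Optional[Path]:
    """Return the newest <db_stem>_backup_YYYYMMDD_HHMMSS.db in directory."""
    candidates = sorted(directory.glob(f"{db_stem}_backup_*.db"))
    return candidates[-1] if candidates else None
```

Stop the application before copying the chosen file over the live database, so that no open connection writes to it during the restore.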

Benefits Summary

| Aspect | Before | After |
|---|---|---|
| Schema Changes | Manual SQL, risk of data loss | Automated migrations, data preserved |
| Safety | No backup system | Automatic pre-migration backups |
| Rollback | Manual restore | One-command rollback |
| Team Workflow | Ad-hoc changes | Version-controlled migrations |
| Domain Purity | Mixed concerns | Clean separation (DDD) |
| Code Quality | Raw SQL queries | Type-safe ORM |
| Future Changes | High risk, slow | Low risk, fast |
| Database Support | SQLite only | SQLite, PostgreSQL, MySQL, etc. |

Next Steps

After merging:

  1. Monitor first production deployment with automatic migrations
  2. Verify backup creation works as expected
  3. Train team on migration workflow (see docs/MIGRATION_EXAMPLE.md)
  4. Consider setting up migration testing in CI/CD pipeline
  5. Plan migration to PostgreSQL (if desired) using existing infrastructure

This PR represents a major infrastructure improvement that will enable rapid, safe evolution of the data model while maintaining architectural integrity and data safety.

…ase initialization instructions for SQLAlchemy
…gration tests for SQLAlchemy event listeners and migration functionality
@markoceri markoceri added the enhancement New feature or request label Jan 23, 2026