This guide explains the different database backends supported by Sync2NAS and how to configure them.
Sync2NAS uses a database factory pattern to support multiple database backends. This allows you to choose the database that best fits your needs and scale.
SQLite is the default and recommended database for most users. It's lightweight, requires no server setup, and stores all data in a single file.
- Simple Setup: No server installation required
- Portable: Single file contains all data
- Reliable: ACID compliant with excellent data integrity
- Fast: Excellent performance for typical workloads
- Zero Configuration: Works out of the box
```ini
[Database]
type = sqlite

[SQLite]
db_file = ./database/sync2nas.db
```

All data lives in a single file containing the following tables:

```
database/
└── sync2nas.db              # SQLite database file
    ├── tv_shows             # Show information
    ├── episodes             # Episode information
    ├── downloaded_files     # Downloaded file tracking
    ├── sftp_temp_files      # SFTP file listings
    └── anime_tv_inventory   # Local file inventory
```
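Because the backend is a regular SQLite file, you can inspect it with Python's built-in `sqlite3` module. A quick sketch that lists the tables shown above:

```python
import sqlite3

# Open the database at the path configured under [SQLite] db_file
conn = sqlite3.connect("./database/sync2nas.db")
tables = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
).fetchall()
print([name for (name,) in tables])  # expect tv_shows, episodes, ...
conn.close()
```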
The CLI layer is organized around a few patterns:

- Command Pattern: Each command is a separate module
- Dependency Injection: Services injected via context object
- Factory Pattern: Dynamic command discovery
```python
import click

@click.command("add-show")
@click.argument("show_name", required=False)
@click.option("--tmdb-id", type=int, help="TMDB ID")
@click.pass_context
def add_show(ctx, show_name, tmdb_id):
    """Add a show to the database."""
    db = ctx.obj["db"]
    tmdb = ctx.obj["tmdb"]
    # Command implementation
```

The API layer provides REST endpoints for programmatic access.
```
api/
├── main.py              # FastAPI application
├── dependencies.py      # Dependency injection
├── models/              # Request/response models
│   ├── requests.py      # Request schemas
│   └── responses.py     # Response schemas
├── routes/              # API endpoints
│   ├── shows.py         # Show management
│   ├── files.py         # File operations
│   ├── remote.py        # SFTP operations
│   └── admin.py         # Admin operations
└── services/            # API-specific services
    ├── show_service.py  # Show business logic
    ├── file_service.py  # File business logic
    └── admin_service.py # Admin operations
```
- REST API: Standard REST endpoints
- Dependency Injection: Services injected via FastAPI dependencies
- Pydantic Models: Request/response validation
- Service Layer: Business logic separation
```python
from fastapi import APIRouter, Depends

# AddShowRequest/AddShowResponse come from api/models/,
# get_show_service from api/dependencies.py.
router = APIRouter()

@router.post("/shows/", response_model=AddShowResponse)
async def add_show(
    request: AddShowRequest,
    show_service: ShowService = Depends(get_show_service),
):
    """Add a new show."""
    return await show_service.add_show(
        show_name=request.show_name,
        tmdb_id=request.tmdb_id,
    )
```
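The request and response models referenced above live under `api/models/`. A minimal Pydantic sketch of what they might look like; the exact fields are assumptions inferred from the route, not the project's actual schemas:

```python
from typing import Optional
from pydantic import BaseModel

class AddShowRequest(BaseModel):
    # Hypothetical fields, inferred from the route above
    show_name: Optional[str] = None
    tmdb_id: Optional[int] = None

class AddShowResponse(BaseModel):
    # Hypothetical response shape, mirroring the Show model
    tmdb_id: int
    tmdb_name: str
    sys_path: str
```

The service layer contains core business logic and external integrations.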
```
services/
├── db_factory.py        # Database factory
├── tmdb_service.py      # TMDB API integration
├── sftp_service.py      # SFTP operations
├── llm_service.py       # OpenAI integration
└── db_implementations/  # Database backends
    ├── db_interface.py  # Abstract interface
    ├── sqlite_implementation.py
    ├── postgres_implementation.py
    └── milvus_implementation.py
```
- Factory Pattern: Database backend selection
- Strategy Pattern: Different database implementations
- Adapter Pattern: External API integrations
- Service Pattern: Business logic encapsulation
```python
from typing import Any, Dict

class TMDBService:
    def __init__(self, api_key: str):
        self.api_key = api_key

    def search_show(self, name: str) -> Dict[str, Any]:
        """Search for shows on TMDB."""
        # Implementation

    def get_show_details(self, tmdb_id: int) -> Dict[str, Any]:
        """Get detailed show information."""
        # Implementation
```
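As a rough illustration, `search_show` could wrap TMDB's `/search/tv` endpoint with `requests`. This is a sketch under that assumption, not the project's actual implementation:

```python
import requests

def search_show(api_key: str, name: str) -> dict:
    """Query TMDB's /search/tv endpoint (illustrative sketch)."""
    resp = requests.get(
        "https://api.themoviedb.org/3/search/tv",
        params={"api_key": api_key, "query": name},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()  # {"page": ..., "results": [...], ...}
```

The model layer defines data structures and validation.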
```
models/
├── __init__.py
├── show.py     # Show data model
└── episode.py  # Episode data model
```
- Data Classes: Clean data representation
- Validation: Input/output validation
- Serialization: Database/API serialization
```python
from dataclasses import dataclass
from typing import Any, Dict

@dataclass
class Show:
    tmdb_id: int
    tmdb_name: str
    sys_name: str
    sys_path: str

    @classmethod
    def from_tmdb(cls, details: Dict[str, Any], **kwargs) -> "Show":
        """Create Show from TMDB data."""
        # Implementation

    @classmethod
    def from_db_record(cls, record: Dict[str, Any]) -> "Show":
        """Create Show from database record."""
        # Implementation
```
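A plausible sketch of `from_db_record` simply maps row columns to dataclass fields; the real model may carry more fields and validation:

```python
from dataclasses import dataclass
from typing import Any, Dict

@dataclass
class Show:
    tmdb_id: int
    tmdb_name: str
    sys_name: str
    sys_path: str

    @classmethod
    def from_db_record(cls, record: Dict[str, Any]) -> "Show":
        # Hypothetical mapping: one dataclass field per database column
        return cls(
            tmdb_id=record["tmdb_id"],
            tmdb_name=record["tmdb_name"],
            sys_name=record["sys_name"],
            sys_path=record["sys_path"],
        )

show = Show.from_db_record({
    "tmdb_id": 1396,
    "tmdb_name": "Breaking Bad",
    "sys_name": "Breaking Bad",
    "sys_path": "/mnt/nas/tv/Breaking Bad",
})
```

The utility layer provides helper functions and configuration management.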
```
utils/
├── sync2nas_config.py   # Configuration management
├── logging_config.py    # Logging setup
├── file_routing.py      # File routing logic
├── episode_updater.py   # Episode update logic
├── show_adder.py        # Show addition logic
├── sftp_orchestrator.py # SFTP orchestration
├── file_filters.py      # File filtering
└── cli_helpers.py       # CLI utilities
```
- Utility Functions: Pure functions for specific tasks
- Configuration Pattern: Centralized configuration management
- Orchestration Pattern: Complex operation coordination
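As one example of the configuration pattern, a centralized loader might wrap Python's `configparser`. A minimal sketch; the function name and default path are assumptions, not the actual `utils/sync2nas_config.py` API:

```python
import configparser
from typing import Any, Dict

def load_configuration(path: str = "./config/sync2nas_config.ini") -> Dict[str, Any]:
    # Hypothetical loader: read the INI file into a nested dict
    parser = configparser.ConfigParser()
    parser.read(path)
    return {section: dict(parser[section]) for section in parser.sections()}
```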
The factory pattern is used for database backend selection and service creation.
```python
from typing import Any, Dict

def create_db_service(config: Dict[str, Any]) -> DatabaseInterface:
    db_type = config["Database"]["type"]
    if db_type == "sqlite":
        return SQLiteDBService(config["SQLite"]["db_file"])
    elif db_type == "postgres":
        return PostgresDBService(config["PostgreSQL"])
    elif db_type == "milvus":
        return MilvusDBService(config["Milvus"])
    else:
        raise ValueError(f"Unsupported database type: {db_type}")
```
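For example, with the SQLite settings shown at the top of this guide, the factory resolves to the SQLite implementation. A usage sketch (in the real application the config dict comes from the INI loader):

```python
config = {
    "Database": {"type": "sqlite"},
    "SQLite": {"db_file": "./database/sync2nas.db"},
}
db = create_db_service(config)  # returns a SQLiteDBService
```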
The strategy pattern is used for different filename parsing strategies (regex vs. LLM).

```python
from typing import Optional

def parse_filename(filename: str, llm_service: Optional[LLMInterface] = None) -> dict:
    if llm_service:
        result = llm_service.parse_filename(filename)
        # Only trust the LLM result when its confidence is high enough
        if result.get("confidence", 0.0) >= 0.7:
            return result
    return _regex_parse_filename(filename)
```
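The regex fallback might look like the following. This is a hypothetical sketch; the project's actual `_regex_parse_filename` will differ:

```python
import re

def _regex_parse_filename(filename: str) -> dict:
    # Matches patterns like "Show.Name.S01E02.mkv" (illustrative only)
    match = re.search(
        r"(?P<show>.+?)[. _-]+S(?P<season>\d{1,2})E(?P<episode>\d{1,3})",
        filename,
        re.IGNORECASE,
    )
    if not match:
        return {"show_name": None, "confidence": 0.0}
    return {
        "show_name": match.group("show").replace(".", " ").strip(),
        "season": int(match.group("season")),
        "episode": int(match.group("episode")),
        "confidence": 0.8,
    }
```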
Dependency injection is used throughout the application to provide services.

```python
@click.pass_context
def add_show(ctx, show_name, tmdb_id):
    db: DatabaseInterface = ctx.obj["db"]
    tmdb: TMDBService = ctx.obj["tmdb"]
    # Use injected services
```
The command pattern is used for CLI command organization.

```python
@click.command("add-show")
def add_show():
    """Add a show to the database."""
    # Command implementation

# Dynamic registration
sync2nas_cli.add_command(add_show)
```
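Dynamic command discovery could be implemented by scanning a package for `click.Command` objects. A hedged sketch; the `cli` package name and one-module-per-command layout are assumptions based on the snippets above:

```python
import importlib
import pkgutil

import click

@click.group()
def sync2nas_cli():
    """Sync2NAS command-line interface."""

import cli  # assumed package containing one module per command

# Import every module in the cli package and register any
# click.Command it defines on the top-level group.
for _, module_name, _ in pkgutil.iter_modules(cli.__path__):
    module = importlib.import_module(f"cli.{module_name}")
    for attr in vars(module).values():
        if isinstance(attr, click.Command):
            sync2nas_cli.add_command(attr)
```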
File download flow:

```
User Command → CLI → SFTP Service → Database → File System
      ↓         ↓          ↓            ↓
download-from-remote → list_remote_files → store_metadata → save_files
```

File routing flow:

```
User Command → CLI → File Routing → Database → File System
      ↓         ↓          ↓            ↓
route-files → list_remote_files → store_metadata → save_files
```
To migrate from SQLite to PostgreSQL:

1. Export Data

   ```bash
   # Backup SQLite database
   python sync2nas.py backup-db
   ```

2. Update Configuration

   ```ini
   [Database]
   type = postgres

   [PostgreSQL]
   host = localhost
   port = 5432
   database = sync2nas
   user = sync2nas_user
   password = your_password
   ```

3. Initialize New Database

   ```bash
   python sync2nas.py init-db
   ```

4. Import Data (manual process required)
   - Export data from SQLite
   - Import to PostgreSQL
   - Verify data integrity
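The import step is manual; one possible approach is to dump each table from SQLite to CSV and load it with psql's `\copy`. A sketch, assuming matching schemas on both sides:

```python
import csv
import sqlite3

# Export one table from SQLite to CSV (repeat per table).
conn = sqlite3.connect("./database/sync2nas.db")
cursor = conn.execute("SELECT * FROM tv_shows")
with open("tv_shows.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow([col[0] for col in cursor.description])
    writer.writerows(cursor)
conn.close()

# Then, on the PostgreSQL side:
#   psql sync2nas -c "\copy tv_shows FROM 'tv_shows.csv' CSV HEADER"
```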
To migrate from PostgreSQL back to SQLite:

1. Export Data

   ```bash
   # Use PostgreSQL tools to export
   pg_dump sync2nas > sync2nas_backup.sql
   ```

2. Update Configuration

   ```ini
   [Database]
   type = sqlite

   [SQLite]
   db_file = ./database/sync2nas.db
   ```

3. Initialize New Database

   ```bash
   python sync2nas.py init-db
   ```

4. Import Data (manual process required)
SQLite backups:

```bash
# Automatic backup
python sync2nas.py backup-db

# Manual backup
cp ./database/sync2nas.db ./database/sync2nas_backup_$(date +%Y%m%d).db
```

PostgreSQL backups:

```bash
# Full backup
pg_dump sync2nas > sync2nas_backup_$(date +%Y%m%d).sql

# Compressed backup
pg_dump sync2nas | gzip > sync2nas_backup_$(date +%Y%m%d).sql.gz
```

Milvus backups:

```bash
# Milvus backup (requires Milvus tools)
milvus backup --collection tv_shows --backup_path ./backup/
```

For SQLite, the following PRAGMAs improve performance:

```sql
-- Enable WAL mode for better concurrency
PRAGMA journal_mode=WAL;
PRAGMA synchronous=NORMAL;
PRAGMA cache_size=10000;
PRAGMA temp_store=MEMORY;
```

```sql
-- Create indexes for better performance
CREATE INDEX idx_tv_shows_tmdb_id ON tv_shows(tmdb_id);
CREATE INDEX idx_episodes_tmdb_id ON episodes(tmdb_id);
CREATE INDEX idx_episodes_absolute ON episodes(absolute_episode);

-- Analyze tables
ANALYZE tv_shows;
ANALYZE episodes;
```
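These PRAGMAs are mostly per-connection settings (`journal_mode=WAL` persists in the database file), so they need to be applied when the connection is opened. A minimal sketch, assuming the default database path:

```python
import sqlite3

conn = sqlite3.connect("./database/sync2nas.db")
# journal_mode=WAL persists in the file; the rest are per-connection
conn.execute("PRAGMA journal_mode=WAL;")
conn.execute("PRAGMA synchronous=NORMAL;")
conn.execute("PRAGMA cache_size=10000;")
conn.execute("PRAGMA temp_store=MEMORY;")
```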
For Milvus, collection parameters can be tuned:

```python
# Configure collection parameters
collection_params = {
    "dimension": 384,
    "index_file_size": 1024,
    "metric_type": "L2",
}
```

Connection issues:

- SQLite: Check file permissions and disk space
- PostgreSQL: Verify server is running and credentials are correct
- Milvus: Check if Milvus server is accessible
Performance issues:

- SQLite: Consider migrating to PostgreSQL for large datasets
- PostgreSQL: Add indexes and optimize queries
- Milvus: Adjust collection parameters and hardware resources
Data migration issues:

- SQLite: Use backup and restore
- PostgreSQL: Use pg_dump/pg_restore
- Milvus: Use Milvus backup tools
To monitor SQLite:

```bash
# Check database size
ls -lh ./database/sync2nas.db

# Check integrity
sqlite3 ./database/sync2nas.db "PRAGMA integrity_check;"
```

To monitor PostgreSQL:

```sql
-- Check table sizes
SELECT schemaname, tablename,
       pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) AS size
FROM pg_tables
WHERE schemaname = 'public';

-- Check connection count
SELECT count(*) FROM pg_stat_activity;
```

To monitor Milvus:

```python
# Check collection statistics
from pymilvus import Collection

collection = Collection("tv_shows")
print(f"Entity count: {collection.num_entities}")
print(f"Index status: {collection.has_index()}")
```

Choose SQLite if:

- You have a small to medium media library (< 10,000 episodes)
- You want simple setup and maintenance
- You don't need concurrent access
- You're just getting started
Choose PostgreSQL if:

- You have a large media library (> 10,000 episodes)
- You need concurrent access from multiple processes
- You want enterprise-grade reliability
- You plan to scale significantly
Choose Milvus if:

- You want similarity search and recommendations
- You're experimenting with advanced features
- You have the resources to manage a vector database
- You're building a recommendation system
- Both database and LLM backends use a factory and interface/implementation pattern.
- The backend is selected via config, and the factory instantiates the correct implementation.
- This pattern makes it easy to add new database or LLM backends in the future.
- To add a new backend, implement the interface, add your class, and update the factory.
- The LLM system now supports both Ollama and OpenAI using this pattern.
- See `services/llm_factory.py` and `services/llm_implementations/` for details. A sketch of adding a new database backend follows.
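For instance, adding a hypothetical MariaDB backend would mean implementing the database interface and registering it in the factory. A sketch; the interface stand-in below is illustrative, and the real one lives in `services/db_implementations/db_interface.py`:

```python
from abc import ABC, abstractmethod
from typing import Any, Dict

# Stand-in for the real abstract interface (which defines more methods).
class DatabaseInterface(ABC):
    @abstractmethod
    def add_show(self, show: Dict[str, Any]) -> None: ...

class MariaDBDBService(DatabaseInterface):
    """Hypothetical new backend."""
    def __init__(self, settings: Dict[str, Any]):
        self.settings = settings

    def add_show(self, show: Dict[str, Any]) -> None:
        # Connect to MariaDB and insert the show (not implemented here)
        raise NotImplementedError

# Finally, add a branch to create_db_service:
#     elif db_type == "mariadb":
#         return MariaDBDBService(config["MariaDB"])
```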