An extremely fast Python linter for Apache Airflow DAG files, written in Python.
DagRuff is a linter designed to catch common errors and enforce best practices in Apache Airflow DAG files. It enforces 31+ rules covering DAG structure, best practices, and Airflow-specific patterns.
- Fast: Built with performance in mind, using AST parsing for static analysis
- Caching: Results are cached based on file hash for improved performance
- Comprehensive: 31+ lint rules covering DAG structure, best practices, and Airflow patterns
- Auto-fix: Automatically fix many common issues with --fix
- Configurable: Configure rules via pyproject.toml or .dagruff.toml with validation
- Plugin Support: Extend functionality with custom rule plugins via entry points
- No Airflow Required: Works without Airflow for AST-based checks (optional DagBag validation requires Airflow)
# Basic installation (no Airflow, AST checks only)
pip install dagruff
# With Airflow support (recommended for full DagBag validation)
pip install dagruff[airflow]
Or install from source:
git clone https://github.com/dkfancska/dagruff.git
cd dagruff
pip install -e ".[airflow]"
Note: Basic installation works without Airflow and performs all static checks via AST. For DagBag validation (import checking and code execution), install with the airflow extra.
After installation, use the dagruff command:
# Check a single file
dagruff examples/example_dag_good.py
# Check a directory
dagruff examples/
# Filter by severity
dagruff examples/ --severity warning
# JSON output
dagruff examples/ --format json
# Use configuration file
dagruff --config .dagruff.toml
# Without path - uses paths from config
dagruff
# Auto-fix all fixable issues
dagruff examples/ --fix
# Auto-fix specific rules
dagruff examples/ --fix DAG001 DAG009 AIR003
# Ignore specific rules
dagruff examples/ --ignore DAG006 DAG007
# Disable caching (useful for CI/CD)
dagruff examples/ --no-cache
# Verbose logging
dagruff examples/ --log-level debug
DagRuff implements 31 lint rules from various sources:
- DAG import and definition checks
- dag_id validation and uniqueness
- Required DAG parameters (dag_id, start_date)
- Recommended parameters (dag_md)
- Special checks for KubernetesPodOperator (requires container_resources and executor_resources)
- AIR002: Check for start_date presence
- AIR003: Check catchup parameter
- AIR013: Recommend max_active_runs
- AIR014: Recommend max_active_tasks for Airflow 2+ (warn about deprecated concurrency)
- AF001: Forbid SubDagOperator usage
- AF002: Security warnings for BashOperator
- AF003: Check task_id uniqueness
- AF004: Detect deprecated operators
- AIRFLINT001: Check task dependencies
- AIRFLINT002: Check XCom usage
- AIRFLINT003: Check Variables usage
- AIRFLINT004: Check required operator parameters
- BP001: Check for top-level code avoidance
- BP002: Check datetime function usage
- BP003: Recommend execution_timeout for tasks
- BP004: Check dependency method consistency
- BP005: Recommend docstrings for tasks
- BP006: Recommend dagrun_timeout for DAGs
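To illustrate what an AST-based rule of this kind involves, here is a minimal sketch of a catchup check in the spirit of AIR003. It uses only the standard-library ast module. The function name find_dags_missing_catchup and the message text are hypothetical and are not taken from DagRuff's implementation.

```python
import ast
from typing import List, Tuple


def find_dags_missing_catchup(source: str) -> List[Tuple[int, str]]:
    """Return (line, message) pairs for DAG(...) calls without a catchup keyword."""
    issues = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        # Match both `DAG(...)` and `models.DAG(...)` style calls.
        func = node.func
        name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
        if name != "DAG":
            continue
        keywords = {kw.arg for kw in node.keywords}
        if "catchup" not in keywords:
            issues.append((node.lineno, "DAG is missing an explicit catchup parameter"))
    return issues


# The DAG file is only parsed, never imported, so Airflow is not needed here.
example = '''
from airflow import DAG

with DAG(dag_id="demo", start_date=None) as dag:
    pass
'''
print(find_dags_missing_catchup(example))
```

Because the source is parsed rather than executed, a check like this can run on machines where Airflow is not installed.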
Full documentation:
- RULES.md - Complete rule descriptions with examples, quick reference, and grouping
- PLUGINS.md - Plugin system documentation
- CONTRIBUTING.md - Contribution guidelines
- PRE_COMMIT.md - Pre-commit hooks setup
DagRuff supports automatic fixing of many issues via the --fix flag:
- DAG001 - Adds from airflow import DAG import
- DAG005 - Removes extra spaces in dag_id
- DAG009 - Adds "owner": "airflow" to default_args
- DAG010 - Adds "retries": 1 to default_args
- AIR003 - Adds catchup=False to DAG
- AIR013 - Adds max_active_runs=1 to DAG
- AIR014 - Replaces concurrency with max_active_tasks or adds max_active_tasks=1
# Fix all fixable issues
dagruff examples/ --fix
# Fix only specific rules
dagruff examples/ --fix DAG001 DAG009
# Combine with other options
dagruff examples/ --fix DAG001 --severity warning
Note: Auto-fix preserves code formatting and checks for duplicates before adding parameters. It uses an AST-based approach for more reliable fixes, falling back to regex when needed.
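The idea of a formatting-preserving, duplicate-aware fix can be sketched as follows: the AST is used only to locate the closing parenthesis of a DAG(...) call, and the original text is then edited in place. This is an illustrative sketch under assumptions, not DagRuff's code. The helper names are hypothetical, only bare DAG(...) calls are matched, and calls whose argument list ends in a comment are left unchanged.

```python
import ast
from typing import Optional


def _first_dag_call(source: str) -> Optional[ast.Call]:
    """Return the first `DAG(...)` call found while walking the tree, if any."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call) and getattr(node.func, "id", None) == "DAG":
            return node
    return None


def _has_catchup(call: ast.Call) -> bool:
    return any(kw.arg == "catchup" for kw in call.keywords)


def add_catchup_false(source: str) -> str:
    """Insert catchup=False into a DAG(...) call that lacks it."""
    call = _first_dag_call(source)
    if call is None or _has_catchup(call):
        return source  # nothing to fix, or the parameter is already set
    # AST column offsets count UTF-8 bytes, so edit the encoded text.
    lines = source.encode("utf-8").splitlines(keepends=True)
    close = sum(len(line) for line in lines[: call.end_lineno - 1]) + call.end_col_offset - 1
    data = b"".join(lines)
    head, tail = data[:close], data[close:]  # tail starts at the closing ")"
    body = head.rstrip()
    if not (call.args or call.keywords):
        addition = b"catchup=False"
    elif body.endswith(b","):
        addition = b" catchup=False"  # reuse the existing trailing comma
    else:
        addition = b", catchup=False"
    fixed = (body + addition + head[len(body):] + tail).decode("utf-8")
    # Keep the edit only if the result still parses and now sets catchup.
    try:
        patched = _first_dag_call(fixed)
    except SyntaxError:
        return source
    return fixed if patched is not None and _has_catchup(patched) else source


print(add_catchup_false('dag = DAG(dag_id="demo")\n'))
```

Editing the text rather than regenerating it from the tree is what keeps comments, quoting style, and line breaks untouched; running the function a second time returns its input unchanged.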
DagRuff can be configured via pyproject.toml or .dagruff.toml:
[tool.dagruff]
# Enable/disable specific rules
select = ["DAG001", "DAG002", "AIR003"]
ignore = ["DAG006", "BP005"]
# Set minimum severity level
severity = "error" # or "warning", "info"
# Paths to check (automatically validated)
paths = ["dags/", "custom_dags/"]
# Per-file ignores
[tool.dagruff.per-file-ignores]
"legacy_dags/*.py" = ["DAG006", "DAG007"]
Configuration Validation: DagRuff validates configuration values:
- Ensures paths and ignore are lists of strings
- Validates rule ID format (e.g., DAG001, AIR002)
- Normalizes whitespace and filters empty values
- Gracefully handles invalid values with warnings
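The validation steps listed above could look roughly like the sketch below. It is an assumption-laden illustration, not DagRuff's validation module: the function names are hypothetical, and the rule ID pattern (uppercase letters followed by three digits) is inferred from IDs such as DAG001 and AIRFLINT001.

```python
import re
import warnings
from typing import Any, List

# Assumed rule ID shape: uppercase letters followed by three digits (e.g. DAG001).
RULE_ID = re.compile(r"[A-Z]+\d{3}")


def clean_string_list(value: Any, key: str) -> List[str]:
    """Coerce a config value into a list of non-empty, stripped strings."""
    if not isinstance(value, list):
        warnings.warn(f"{key!r} must be a list of strings; ignoring {value!r}")
        return []
    cleaned = []
    for item in value:
        if not isinstance(item, str):
            warnings.warn(f"{key!r} contains a non-string entry: {item!r}")
            continue
        item = item.strip()  # normalize surrounding whitespace
        if item:  # drop empty values
            cleaned.append(item)
    return cleaned


def clean_rule_ids(value: Any, key: str) -> List[str]:
    """Like clean_string_list, but also drop entries that are not rule IDs."""
    valid = []
    for rule in clean_string_list(value, key):
        if RULE_ID.fullmatch(rule):
            valid.append(rule)
        else:
            warnings.warn(f"{key!r} contains an invalid rule ID: {rule!r}")
    return valid
```

Invalid entries produce a warning and are skipped instead of aborting the run, which matches the "gracefully handles invalid values" behaviour described above.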
Caching: Results are cached by default, keyed on the file hash; pass --no-cache to disable caching. The cache provides:
- Automatic cache invalidation on file changes
- Memory-efficient singleton cache
- Deep copy returns for safety
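A hash-validated cache with the three properties above can be sketched in a few lines. The class name ResultCache, the choice of SHA-256, and the module-level CACHE instance are assumptions made for illustration; the actual cache.py may differ.

```python
import copy
import hashlib
from pathlib import Path
from typing import Any, Dict, Optional, Tuple


class ResultCache:
    """In-memory cache keyed by file path and validated by a content hash."""

    def __init__(self) -> None:
        self._entries: Dict[str, Tuple[str, Any]] = {}

    @staticmethod
    def _digest(path: Path) -> str:
        return hashlib.sha256(path.read_bytes()).hexdigest()

    def get(self, path: Path) -> Optional[Any]:
        entry = self._entries.get(str(path))
        if entry is None or entry[0] != self._digest(path):
            return None  # missing, or the file changed since it was cached
        return copy.deepcopy(entry[1])  # callers cannot mutate cached data

    def put(self, path: Path, issues: Any) -> None:
        self._entries[str(path)] = (self._digest(path), copy.deepcopy(issues))


# A single module-level instance plays the role of the singleton.
CACHE = ResultCache()
```

Comparing digests on every lookup means an edited file never returns stale results, at the cost of re-reading the file, which is still far cheaper than re-linting it.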
The examples/ directory contains:
- example_dag_good.py - Example of a correct DAG
- example_dag_bad.py - Example DAG with errors to demonstrate the linter
DagRuff supports custom rule plugins via Python entry points. See PLUGINS.md for detailed documentation.
Quick Example:
# my_plugin/__init__.py
from typing import List
from dagruff.rules.ast_collector import ASTCollector
from dagruff.models import LintIssue, Severity
def check_all_custom_rules(collector: ASTCollector, file_path: str) -> List[LintIssue]:
    """Custom rule checker following RuleChecker protocol."""
    issues = []
    # Your custom logic here
    return issues

# pyproject.toml
[project.entry-points."dagruff.rules"]
my_custom_rule = "my_plugin:check_all_custom_rules"

Contributions are welcome and highly appreciated! To get started:
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Make your changes
- Ensure tests pass (pytest tests/) - 296+ tests with 77% code coverage
- Ensure code is formatted (ruff format) and linted (ruff check)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
See CONTRIBUTING.md for detailed guidelines.
Pre-commit Hooks: Tests run automatically before each commit. See PRE_COMMIT.md for setup.
# Clone the repository
git clone https://github.com/dkfancska/dagruff.git
cd dagruff
# Create virtual environment (recommended)
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install in development mode (with Airflow for full functionality)
pip install -e ".[airflow,dev]"
# or using uv
uv pip install -e ".[airflow,dev]"
# Run tests (296+ tests)
pytest tests/
# Run tests with coverage (current coverage: 77%)
pytest --cov=dagruff tests/
# Format code
ruff format dagruff tests/
# Lint code
ruff check dagruff tests/
# Run specific test file
pytest tests/test_linter.py -v

dagruff/
├── dagruff/                    # Main package
│   ├── __init__.py
│   ├── cli/                    # CLI package (refactored)
│   │   ├── __init__.py         # Main entry point
│   │   ├── runner.py           # CLI orchestrator
│   │   ├── linter.py           # Linting functions
│   │   ├── commands/           # Command pattern
│   │   │   ├── base.py         # BaseCommand
│   │   │   ├── check.py        # CheckCommand
│   │   │   └── fix.py          # FixCommand
│   │   ├── formatters/         # Output formatters
│   │   │   ├── human.py        # Human-readable format
│   │   │   └── json.py         # JSON format
│   │   └── utils/              # CLI utilities
│   │       ├── args.py         # Argument parsing
│   │       ├── files.py        # File utilities
│   │       ├── config_handler.py
│   │       └── autofix_handler.py
│   ├── config.py               # Configuration handling with validation
│   ├── linter.py               # Main linter with caching
│   ├── cache.py                # Caching implementation
│   ├── models.py               # Data models
│   ├── autofix.py              # Auto-fix implementation
│   ├── plugins.py              # Plugin system
│   ├── validation.py           # Input validation
│   ├── logger.py               # Logging setup
│   └── rules/                  # Lint rules
│       ├── base.py             # Protocols (RuleChecker, Linter, Autofixer)
│       ├── ast_collector.py    # AST data collector
│       ├── dag_rules.py        # DAG-specific rules
│       ├── ruff_air_rules.py
│       ├── best_practices_rules.py
│       ├── airflint_rules.py
│       └── utils.py            # Rule utilities
├── tests/                      # Tests (296+ tests)
├── examples/                   # Example DAG files
├── pyproject.toml              # Project configuration
├── README.md                   # This file
└── RULES.md                    # Rule descriptions
This project is licensed under the MIT License - see the LICENSE file for details.
DagRuff draws inspiration from:
- Ruff - For project structure and design philosophy
- flake8-airflow - For Airflow-specific rules
- airflint - For AST-based linting approaches
- Astronomer Guides - For best practices
Special thanks to the Apache Airflow community for their excellent documentation and tooling.
Having trouble? Check out the existing Issues or feel free to open a new one.