A fully refactored, modernized, and maintainable version of the SciCumulus Scientific Workflow Management System
This repository contains a complete architectural refactoring of the original SciCumulus project developed at UFF.
The refactoring preserves the conceptual goals and execution semantics of SciCumulus while radically improving:
- Maintainability
- Testability
- Modularity
- Extensibility
- Cloud and database portability
The result is a clean-architecture implementation suitable for research, teaching, and production environments.
The refactor was driven by the following objectives:
- Eliminate tight coupling between workflow logic, database, and infrastructure
- Replace legacy and script-style code with modern Java architecture
- Enable unit and integration testing
- Support cloud and local execution transparently
- Provide clear documentation and architectural artifacts
- Prepare the codebase for long-term evolution
The project follows Clean Architecture principles:
┌─────────────────────┐
│         API         │  CLI, user input
└─────────▲───────────┘
          │
┌─────────┴───────────┐
│        Core         │  Workflow, execution, scheduling
└─────────▲───────────┘
          │
┌─────────┴───────────┐
│   Infrastructure    │  Database, storage, filesystem
└─────────────────────┘
- Core logic has no dependency on database, cloud, or filesystem
- Infrastructure components are replaceable
- All dependencies point inward
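The dependency rule above can be illustrated with a minimal sketch. The names `ActivityExecutor`, `WorkflowEngine`, and `StubExecutor` are hypothetical and chosen only for illustration; they are not taken from the repository. The core layer owns an interface, the infrastructure layer implements it, and the API layer wires the two together.

```java
import java.util.ArrayList;
import java.util.List;

public class LayeringSketch {

    // Core layer: a port owned by the core (hypothetical name).
    interface ActivityExecutor {
        int run(String command); // returns an exit code, 0 meaning success
    }

    // Core layer: pure logic that knows only the port, never a concrete executor.
    static final class WorkflowEngine {
        private final ActivityExecutor executor;

        WorkflowEngine(ActivityExecutor executor) {
            this.executor = executor;
        }

        // Runs commands in order and stops at the first non-zero exit code.
        List<String> execute(List<String> commands) {
            List<String> completed = new ArrayList<>();
            for (String command : commands) {
                if (executor.run(command) != 0) {
                    break;
                }
                completed.add(command);
            }
            return completed;
        }
    }

    // Infrastructure layer: an adapter implementing the core's port.
    // This stub stands in for a real local or cloud executor.
    static final class StubExecutor implements ActivityExecutor {
        @Override
        public int run(String command) {
            return command.startsWith("fail") ? 1 : 0;
        }
    }

    // API layer: the composition root injects infrastructure into the core.
    public static void main(String[] args) {
        WorkflowEngine engine = new WorkflowEngine(new StubExecutor());
        // prints [align]
        System.out.println(engine.execute(List.of("align", "fail-step", "report")));
    }
}
```

Because `WorkflowEngine` receives its executor through the constructor, swapping the stub for another implementation requires no change to core code.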
src/
├── main/java/br/uff/scicumulus
│   ├── api/cli            # CLI entry point
│   ├── core               # Workflow engine (pure logic)
│   │   ├── workflow
│   │   ├── execution
│   │   ├── scheduling
│   │   └── provenance
│   └── infrastructure
│       ├── database
│       ├── storage
│       └── config
└── test/java              # Unit and integration tests
docs/
├── architecture.puml      # UML architecture diagram
└── refactoring-report.md  # Formal refactoring report
The project includes:
- JUnit 5 unit tests
- Testable core modules with no infrastructure dependencies
- In-memory provenance repository for safe integration testing
Run tests with:
mvn test
- Provenance handling is abstracted via a repository interface
- Current implementation: InMemoryProvenanceRepository
- Easily extendable to PostgreSQL, MySQL, SQLite, or cloud databases
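As a rough sketch of what a repository abstraction of this kind can look like: only the name `InMemoryProvenanceRepository` comes from this README. The `ProvenanceRepository` interface name, its methods, and the `ProvenanceRecord` fields below are assumptions made for illustration, not the project's actual API. The sketch targets Java 11, so it uses a plain final class rather than a record.

```java
import java.time.Instant;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

public class ProvenanceSketch {

    // Hypothetical provenance entry; the real schema is not shown in this README.
    static final class ProvenanceRecord {
        final String workflowId;
        final String activity;
        final Instant finishedAt;

        ProvenanceRecord(String workflowId, String activity, Instant finishedAt) {
            this.workflowId = workflowId;
            this.activity = activity;
            this.finishedAt = finishedAt;
        }
    }

    // Assumed shape of the repository interface the core would depend on.
    interface ProvenanceRepository {
        void save(ProvenanceRecord record);
        List<ProvenanceRecord> findByWorkflow(String workflowId);
    }

    // In-memory implementation: no database needed, so tests stay fast and isolated.
    static final class InMemoryProvenanceRepository implements ProvenanceRepository {
        private final Map<String, List<ProvenanceRecord>> byWorkflow = new ConcurrentHashMap<>();

        @Override
        public void save(ProvenanceRecord record) {
            byWorkflow
                .computeIfAbsent(record.workflowId, id -> new CopyOnWriteArrayList<>())
                .add(record);
        }

        @Override
        public List<ProvenanceRecord> findByWorkflow(String workflowId) {
            // Return an immutable snapshot so callers cannot mutate stored state.
            return List.copyOf(byWorkflow.getOrDefault(workflowId, List.of()));
        }
    }

    public static void main(String[] args) {
        ProvenanceRepository repo = new InMemoryProvenanceRepository();
        repo.save(new ProvenanceRecord("wf-1", "align", Instant.now()));
        repo.save(new ProvenanceRecord("wf-1", "report", Instant.now()));
        repo.save(new ProvenanceRecord("wf-2", "align", Instant.now()));
        System.out.println("wf-1 records: " + repo.findByWorkflow("wf-1").size());
    }
}
```

A PostgreSQL or SQLite backend would implement the same interface, leaving callers in the core untouched.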
Storage and execution are abstracted via interfaces:
- StorageService
- LocalStorageService (default)
This enables transparent integration with cloud providers and distributed filesystems.
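A possible shape for this storage abstraction is sketched below. The type names `StorageService` and `LocalStorageService` appear in this README, but the method signatures (`write`, `read`, `exists`) and the key-to-path mapping are assumptions for illustration only. The local implementation relies on standard `java.nio.file` calls.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class StorageSketch {

    // Assumed interface: callers address data by key, not by filesystem path.
    interface StorageService {
        void write(String key, byte[] data) throws IOException;
        byte[] read(String key) throws IOException;
        boolean exists(String key);
    }

    // Local filesystem implementation rooted at a single directory.
    static final class LocalStorageService implements StorageService {
        private final Path root;

        LocalStorageService(Path root) {
            this.root = root.toAbsolutePath().normalize();
        }

        // Maps a key to a path under the root and rejects keys that escape it.
        private Path resolve(String key) {
            Path target = root.resolve(key).normalize();
            if (!target.startsWith(root)) {
                throw new IllegalArgumentException("Key escapes storage root: " + key);
            }
            return target;
        }

        @Override
        public void write(String key, byte[] data) throws IOException {
            Path target = resolve(key);
            Files.createDirectories(target.getParent());
            Files.write(target, data);
        }

        @Override
        public byte[] read(String key) throws IOException {
            return Files.readAllBytes(resolve(key));
        }

        @Override
        public boolean exists(String key) {
            return Files.exists(resolve(key));
        }
    }

    public static void main(String[] args) throws IOException {
        StorageService storage = new LocalStorageService(Files.createTempDirectory("scicumulus-demo"));
        storage.write("runs/wf-1/output.txt", "done".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(storage.read("runs/wf-1/output.txt"), StandardCharsets.UTF_8));
    }
}
```

A cloud-backed implementation could map the same keys to object-store paths, which is why the interface here avoids exposing `Path` to callers.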
Included in /docs:
- PlantUML architecture diagram
- Formal refactoring report suitable for academic publication
- Java 11+
- Maven 3.8+
mvn clean package
java -cp target/scicumulus-2.0.0.jar br.uff.scicumulus.api.cli.Main
This repository is not a line-by-line rewrite.
It is a structural refactoring that preserves domain concepts while modernizing architecture.
This project follows the same license terms as the original SciCumulus project unless otherwise specified.
Contributions are welcome for:
- Database backends
- Cloud execution strategies
- Workflow parsers
- Performance evaluation
For questions or collaboration, please open an issue.
- Status: Refactoring complete
- Testable: Yes
- Cloud-ready: Yes
- Publication-ready: Yes