A declarative workflow descriptor that separates *what* to deploy from *where* to deploy it. See our project page for details.
Describe your workflow once in a portable YAML file -- tasks, dependencies, resources, and launch methods -- and sflow executes the DAG through swappable backends, leveraging each platform's native ecosystem. Write one sflow.yaml and run it across environments with minimal changes.
The current focus is Slurm, which lacks a built-in workflow orchestration layer. Docker and Kubernetes backends are planned.
| Feature | Description |
|---|---|
| Modular Composition | Split workflows into reusable YAML fragments, merge at runtime with sflow compose or multi-file sflow run -f |
| Topology-aware GPU Allocation | Automatic node/GPU placement with CUDA_VISIBLE_DEVICES slicing across tasks and replicas |
| Probes | Readiness and failure gates -- TCP port, HTTP, log watch with pattern matching |
| Replicas & Sweeps | Parallel/sequential replicas with Cartesian product variable sweeps |
| Batch Mode | Generate sbatch scripts, CSV-driven bulk sweeps, parallel preflight validation |
| Expressions | Jinja2 ${{ }} syntax for variables, backend info, and task metadata |
| Artifacts | Named URIs (fs://, file://, http://) with inline content generation |
| Live TUI | Rich terminal interface with task status, log tailing, and allocation maps |
| AI Agent Skills | Built-in skills that teach coding assistants (Cursor, Copilot) to write and debug sflow YAML |
| Preflight Validation | Container image checks, GPU oversubscription detection, dependency cycle analysis |
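To make the table concrete, here is a hedged sketch of how several of these features might combine in one workflow file. All field names below (`replicas`, `probes`, `depends_on`) and the commands (`serve_model`, `run_bench`) are illustrative assumptions, not the verified schema -- see the Configuration documentation for the actual fields.

```yaml
# Illustrative sketch only -- field names are assumptions, not the verified schema.
version: "0.1"
variables:
  MODEL:
    description: "model to serve"
    value: llama
workflow:
  name: feature_sketch
  tasks:
    - name: server
      replicas: 2                      # parallel replicas (assumed field)
      script:
        - serve_model --model ${MODEL}   # placeholder command
      probes:
        readiness:
          type: tcp                    # readiness gate on a TCP port (assumed field)
          port: 8000
    - name: benchmark
      depends_on: [server]             # DAG edge (assumed field)
      script:
        - run_bench --port 8000 --expr "${{ backend.head_node }}"   # Jinja2 expression (assumed)
```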
Modular workflow samples for LLM inference serving with NVIDIA Dynamo:
| Framework | Aggregated | Disaggregated (P/D) | Multi-Node |
|---|---|---|---|
| SGLang | Yes | Yes | Yes |
| vLLM | Yes | Yes | Yes |
| TRT-LLM | Yes | Yes | Yes |
All frameworks share a common infrastructure layer (etcd, NATS, frontend, nginx) -- only the server task files differ.
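As a hedged illustration of that split, a framework-specific fragment might contain only its server task, while the infrastructure tasks live in shared files that every framework merges in. The file name, task name, and launch command below are assumptions for illustration only:

```yaml
# e.g. a framework fragment merged with the shared infrastructure files at run time
workflow:
  tasks:
    - name: prefill_server
      script:
        - launch_prefill_server    # framework-specific launch command (placeholder)
```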
| Command | Purpose | Key Flags |
|---|---|---|
| sflow run | Execute a workflow | --dry-run --tui --set -f (multi-file) |
| sflow batch | Generate sbatch scripts | --submit --bulk-input --row |
| sflow compose | Merge multiple YAMLs | --resolve --missable-tasks -o |
| sflow visualize | Render DAG graph | --format png/svg/mermaid |
| sflow sample | List / copy examples | --list -o |
| sflow skill | Export AI agent skills | --list -o |
Full user documentation: https://nvidia.github.io/nv-sflow/
- Introduction -- concepts and architecture
- Quickstart -- local and Slurm setup
- Configuration -- full YAML schema
- Modular Workflows -- multi-file composition
- Quick Reference -- all fields at a glance
- CLI Reference -- commands and flags
- Sample Workflows -- production examples
Validate the workflow engine locally (no Slurm required):

```bash
uv venv
source .venv/bin/activate
uv pip install "sflow @ git+https://github.com/NVIDIA/nv-sflow.git@main"
sflow run --file examples/local_hello_world.yaml --tui
```

Minimal workflow:
```yaml
version: "0.1"
variables:
  WHO:
    description: "who to greet"
    value: Nvidia
workflow:
  name: hello_local
  tasks:
    - name: hello
      script:
        - echo "Hello ${WHO}"
```

Run a modular multi-file workflow on Slurm:
```bash
sflow run \
  -f slurm_config.yaml -f common_workflow.yaml \
  -f sglang/prefill.yaml -f sglang/decode.yaml -f benchmark_aiperf.yaml \
  --missable-tasks agg_server --tui
```

Export AI agent skills for your IDE:
```bash
sflow skill -o .cursor/skills/
```

Prerequisites for development:

- Python 3.10 or higher
- uv (Python package installer and resolver)
```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
git clone https://github.com/NVIDIA/nv-sflow.git
cd nv-sflow
uv venv
source .venv/bin/activate
uv pip install -e ".[dev]"
pytest
```

Please see CONTRIBUTING.md for details on how to contribute to this project.
This project is licensed under the Apache License 2.0.

