@oleksandr-korin oleksandr-korin commented Jan 16, 2026

Note

Introduces example workflows and foundational docs for the process-driven platform.

  • Adds templates: content-review, customer-support, data-analysis with definition.yaml and template.yaml under config/process-templates/
  • Documents the platform: backlogs (MVP/Core/Advanced), index, development process, and architecture iterations (IT1-IT4) in docs/PROCESS_DRIVEN_PLATFORM/
  • Updates docker-compose.yml to set backend TRINITY_DB_PATH=/data/trinity.db
  • Adjusts .gitignore to include src/backend/services/process_engine/repositories/

Written by Cursor Bugbot for commit aa3d56e.

- Move PROCESS_DRIVEN_AGENTS.md to PROCESS_DRIVEN_PLATFORM/ folder
- Add IT1-IT4 thinking documents (analysis, architecture, DDD, UI/UX)
- Add phase-based backlog files (MVP, Core, Advanced)
- Add BACKLOG_INDEX.md with conventions and traceability
- Add DEVELOPMENT_PROCESS.md with workflow and testing strategy

This establishes the foundation for implementing the Process Engine feature.
- ProcessId: UUID wrapper with validation and generation
- ExecutionId: UUID wrapper for execution instances
- StepId: Validated string identifier (alphanumeric, hyphens, underscores)
- Version: Major.minor versioning with parsing and comparison
- Duration: Time duration with parsing ("30s", "5m", "2h", "1d")
- Money: Decimal-based currency representation

All value objects are frozen dataclasses for immutability.
Includes 67 unit tests covering all validation and operations.

Refs: IT3 Section 4.3 (Value Objects)
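
The Duration value object described above can be illustrated with a minimal sketch. It assumes a single integer amount followed by one unit suffix; the field name `seconds` and the `parse` classmethod are hypothetical and may not match the actual implementation.

```python
import re
from dataclasses import dataclass

# Seconds per supported unit suffix ("30s", "5m", "2h", "1d").
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}
_PATTERN = re.compile(r"^(\d+)([smhd])$")


@dataclass(frozen=True)
class Duration:
    """Immutable time duration (hypothetical shape, for illustration only)."""

    seconds: int

    @classmethod
    def parse(cls, text: str) -> "Duration":
        match = _PATTERN.match(text.strip())
        if match is None:
            raise ValueError(f"Invalid duration: {text!r}")
        amount, unit = match.groups()
        return cls(int(amount) * _UNIT_SECONDS[unit])
```

Because the dataclass is frozen, two durations with the same number of seconds compare equal and instances cannot be mutated after creation.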
Domain Model:
- ProcessDefinition: Aggregate root with validation, publish/archive lifecycle
- StepDefinition: Entity for individual steps (agent_task, human_approval, gateway)
- StepExecution: Runtime state tracking for step instances
- OutputConfig: Process output configuration

Step Configurations:
- AgentTaskConfig: agent, message, timeout, model, temperature
- HumanApprovalConfig: title, description, assignees, timeout (stub)
- GatewayConfig: gateway_type, routes, default_route (stub)
- TimerConfig, NotificationConfig: stubs for Advanced phase

Validation:
- Duplicate step ID detection
- Invalid dependency reference checking
- Circular dependency detection via DFS
- Empty name/steps validation
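
A rough sketch of DFS-based circular dependency detection follows. It assumes the step graph is available as a plain mapping from step ID to its `depends_on` list; the function name `find_cycle` is hypothetical, and unknown references are ignored here because they are reported by the separate reference check.

```python
from __future__ import annotations

WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, fully explored


def find_cycle(dependencies: dict[str, list[str]]) -> list[str] | None:
    """Return one dependency cycle as a list of step IDs, or None if acyclic."""
    color = {step: WHITE for step in dependencies}
    path: list[str] = []

    def visit(step: str) -> list[str] | None:
        color[step] = GRAY
        path.append(step)
        for dep in dependencies.get(step, []):
            state = color.get(dep, BLACK)  # unknown IDs are skipped here
            if state == GRAY:
                # dep is already on the current path, so the path loops back to it.
                return path[path.index(dep):] + [dep]
            if state == WHITE:
                cycle = visit(dep)
                if cycle:
                    return cycle
        path.pop()
        color[step] = BLACK
        return None

    for step in dependencies:
        if color[step] == WHITE:
            cycle = visit(step)
            if cycle:
                return cycle
    return None
```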

Schema & Documentation:
- JSON Schema for editor validation (process-definition.schema.json)
- Example YAML files (content-pipeline.yaml, approval-workflow.yaml)

Includes 27 new tests (94 total process engine tests passing)

Refs: IT3 Section 4 (Aggregates)
Repository Interfaces:
- ProcessDefinitionRepository: Full CRUD + versioning + filtering
- ProcessExecutionRepository: Interface for execution state (impl later)

SQLite Implementation:
- SqliteProcessDefinitionRepository with JSON storage
- Indexed columns for name, status, and name+version uniqueness
- In-memory support for testing

CRUD Operations:
- save: Create or update (upsert by ID)
- get_by_id, get_by_name, get_latest_version
- list_all, list_by_name with filtering and pagination
- delete, exists, count

Version Tracking:
- Multiple versions per process name
- Version major.minor stored separately for efficient queries
- get_latest_version returns latest published

Also: Updated .gitignore to allow process engine repositories folder

Includes 22 new tests (116 total process engine tests passing)

Refs: IT3 Section 7 (Repositories)
ProcessValidator Service:
- validate_yaml(): Full validation pipeline from raw YAML
- validate_definition(): Validate existing ProcessDefinition

Validation Levels:
1. YAML syntax validation with line numbers
2. Schema validation (required fields, types)
3. Semantic validation (from domain layer):
   - No duplicate step IDs
   - All depends_on references exist
   - No circular dependencies
4. Agent existence checking (warnings only)

ValidationResult:
- Separates errors (blocking) from warnings (advisory)
- is_valid: True if no errors
- to_dict(): API-friendly response format

Error/Warning Details:
- message: Human-readable description
- level: error | warning
- path: JSON path (e.g., "steps[0].agent")
- line: YAML line number (when available)
- suggestion: Fix suggestion
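
As an illustration of the error/warning split, here is a minimal sketch of what such a result object could look like. The class name `ValidationIssue` and the `add` method are assumptions; only `is_valid`, `to_dict()`, and the listed detail fields come from the description above.

```python
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass(frozen=True)
class ValidationIssue:
    """One error or warning (hypothetical name)."""

    message: str
    level: str  # "error" or "warning"
    path: Optional[str] = None
    line: Optional[int] = None
    suggestion: Optional[str] = None


@dataclass
class ValidationResult:
    errors: list = field(default_factory=list)
    warnings: list = field(default_factory=list)

    @property
    def is_valid(self) -> bool:
        # Warnings are advisory; only errors block.
        return not self.errors

    def add(self, issue: ValidationIssue) -> None:
        target = self.errors if issue.level == "error" else self.warnings
        target.append(issue)

    def to_dict(self) -> dict:
        return {
            "is_valid": self.is_valid,
            "errors": [asdict(issue) for issue in self.errors],
            "warnings": [asdict(issue) for issue in self.warnings],
        }
```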

Includes 26 new tests (142 total process engine tests passing)

Refs: IT3 Section 6 (Domain Services)
Domain Events:
- DomainEvent: Base class with kw_only timestamp for inheritance
- Process events: ProcessStarted, ProcessCompleted, ProcessFailed, ProcessCancelled
- Step events: StepStarted, StepCompleted, StepFailed, StepSkipped
- Approval events: ApprovalRequested, ApprovalDecided
- All events immutable (frozen dataclasses) with to_dict() serialization

Event Bus Infrastructure:
- EventBus: Abstract interface for publish/subscribe
- InMemoryEventBus: MVP implementation with async dispatch
- subscribe(): Handler for specific event types
- subscribe_all(): Global handler for all events
- unsubscribe(), clear(): Handler management
- Async non-blocking dispatch via asyncio.create_task()
- Error isolation: handler errors logged but don't stop other handlers
- wait_for_pending(): For testing and graceful shutdown
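
The dispatch behaviour described above (non-blocking tasks, error isolation, waiting for pending handlers) could look roughly like the following sketch. Internal attribute names and the `_run` helper are assumptions, and `unsubscribe()`/`clear()` are omitted for brevity.

```python
import asyncio
import logging
from collections import defaultdict

logger = logging.getLogger(__name__)


class InMemoryEventBus:
    """Sketch of an in-memory bus; not the actual implementation."""

    def __init__(self):
        self._handlers = defaultdict(list)  # event type -> async handlers
        self._global_handlers = []
        self._pending = set()  # tasks still running

    def subscribe(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def subscribe_all(self, handler):
        self._global_handlers.append(handler)

    async def publish(self, event):
        handlers = self._handlers.get(type(event), []) + self._global_handlers
        for handler in handlers:
            # Schedule each handler as its own task so publish() does not block.
            task = asyncio.create_task(self._run(handler, event))
            self._pending.add(task)
            task.add_done_callback(self._pending.discard)

    async def _run(self, handler, event):
        try:
            await handler(event)
        except Exception:
            # Error isolation: log and continue; other handlers are unaffected.
            logger.exception("Event handler failed for %s", type(event).__name__)

    async def wait_for_pending(self):
        while self._pending:
            await asyncio.gather(*list(self._pending))
```

Keeping a reference to each task in `_pending` also prevents tasks from being garbage-collected before they finish.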

Includes 23 new tests (165 total process engine tests passing)

Refs: IT3 Section 5 (Domain Events)
Added comprehensive tests for execution-side domain model:
- ProcessExecution creation and initialization
- State transitions (start, complete, fail, cancel, pause, resume)
- Step operations (start_step, complete_step, fail_step)
- Query methods (get_completed_step_ids, all_steps_completed)
- Serialization roundtrip (to_dict, from_dict)
- StepExecution entity tests

Fixes gap identified during Sprint 1 review.
31 new tests (196 total process engine tests passing)
Implements REST API for process definitions:
- POST /api/processes - Create new process definition
- GET /api/processes - List all (with filters, pagination)
- GET /api/processes/{id} - Get single definition
- PUT /api/processes/{id} - Update draft definition
- DELETE /api/processes/{id} - Delete draft/archived
- POST /api/processes/{id}/validate - Validate existing
- POST /api/processes/validate - Validate YAML
- POST /api/processes/{id}/publish - Publish draft
- POST /api/processes/{id}/archive - Archive process
- POST /api/processes/{id}/new-version - Create new version

Also marks E2-01 (Execution State Model) as done; it was already
implemented in Sprint 1 with the ProcessExecution aggregate.

21 new API tests (217 total process engine tests passing)
Implements SQLite repository for process executions:
- SqliteProcessExecutionRepository with save/get/delete/list operations
- Tables: process_executions, step_executions
- Proper serialization of Money (as cents), timestamps, JSON data
- Query methods: list_by_process, list_active, list_all with filters
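
Storing Money as integer cents avoids floating-point drift in SQLite. A minimal sketch of the conversion, assuming two decimal places and half-up rounding (both assumptions, as are the function names):

```python
from decimal import ROUND_HALF_UP, Decimal


def to_cents(amount: Decimal) -> int:
    """Convert a decimal amount to integer cents for storage."""
    return int((amount * 100).quantize(Decimal("1"), rounding=ROUND_HALF_UP))


def from_cents(cents: int) -> Decimal:
    """Restore the decimal amount from stored cents."""
    return Decimal(cents) / Decimal(100)
```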

Also adds input/cost fields to StepExecution entity for tracking
step-level inputs and costs (prepares for E2-06 output storage).

21 new repository tests (238 total process engine tests passing)
Implements centralized output storage management:
- OutputStorage service for store/retrieve/delete operations
- OutputPath value object with path pattern /executions/{id}/steps/{step}/output
- Unified API for accessing step outputs regardless of backend
- get_all_outputs() for bulk retrieval
- clear_execution_outputs() for cleanup
- Handles empty dict vs None distinction properly

23 new output storage tests (261 total process engine tests passing)

Sprint 2 complete:
- E1-04: Process Definition API ✓
- E2-01: Execution State Model ✓
- E2-02: Execution Repository ✓
- E2-06: Step Output Storage ✓
Implements the core execution engine for process orchestration:

ExecutionEngine:
- start() - Start new execution from definition
- resume() - Resume paused/partial execution
- cancel() - Cancel running execution
- Timeout handling per step with asyncio.wait_for
- Domain event emission (ProcessStarted, StepCompleted, etc.)
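
Per-step timeout handling with `asyncio.wait_for` can be sketched as below. The helper name and the shape of the failure result (including the `TIMEOUT` code) are hypothetical.

```python
import asyncio


async def run_step_with_timeout(run_step, timeout_seconds: float) -> dict:
    """Await run_step(), turning a timeout into a failed-step result."""
    try:
        return await asyncio.wait_for(run_step(), timeout=timeout_seconds)
    except asyncio.TimeoutError:
        # wait_for cancels the inner coroutine before raising.
        return {"status": "failed", "error_code": "TIMEOUT"}
```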

DependencyResolver:
- get_ready_steps() - Find steps with satisfied dependencies
- get_next_step() - Get next step for sequential execution
- get_execution_order() - Topological sort of all steps
- is_complete() / has_failed_steps() - Status checks
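
The two central resolver queries can be approximated with the standard library. This sketch assumes dependencies are given as a mapping from step ID to its `depends_on` list and uses `graphlib.TopologicalSorter` for ordering; the actual resolver may implement its own sort, and the function signatures here are assumptions.

```python
from graphlib import TopologicalSorter


def get_ready_steps(dependencies, completed, started):
    """Steps not yet started whose dependencies have all completed."""
    return [
        step
        for step, deps in dependencies.items()
        if step not in started
        and step not in completed
        and all(dep in completed for dep in deps)
    ]


def get_execution_order(dependencies):
    """Topological order: every step appears after all of its dependencies."""
    return list(TopologicalSorter(dependencies).static_order())
```

`TopologicalSorter` raises `graphlib.CycleError` on cyclic input, which is one reason the definition is validated for cycles before execution.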

StepHandler Interface:
- Abstract base for step type handlers (AgentTask, etc.)
- StepHandlerRegistry for handler lookup
- StepContext for passing execution state
- StepResult for success/failure responses

16 new tests (277 total process engine tests passing)
Implements handler for agent_task step type:

AgentTaskHandler:
- Executes agent_task steps by sending messages to Trinity agents
- Variable substitution for {{input.X}} and {{steps.X.output}}
- Timeout handling from step config
- Cost extraction from agent response

AgentGateway (Anti-Corruption Layer):
- Wraps Trinity's AgentClient for clean process engine integration
- Agent availability checking via Docker container status
- Message sending with context metadata
- Error handling and wrapping

10 new tests (287 total process engine tests passing)
Implements template expression evaluation for process messages:

ExpressionEvaluator:
- Evaluates {{expression}} placeholders in strings
- Supports input.X, input.X.Y for nested input data
- Supports steps.X.output, steps.X.output.Y for step outputs
- Supports execution.id, process.name context
- Strict mode raises ExpressionError for undefined expressions
- Expression extraction and validation utilities

EvaluationContext:
- Path-based access to input_data, step_outputs, metadata
- Handles nested dict navigation
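
The placeholder substitution described above can be sketched with a regular expression and a nested-dict lookup. The function names and the flat `context` dict are assumptions; only the `{{...}}` syntax, the dotted paths, `ExpressionError`, and strict mode come from the description.

```python
import re

# Matches {{ dotted.path }} with optional whitespace; hyphens allowed in step IDs.
_PLACEHOLDER = re.compile(r"\{\{\s*([\w.\-]+)\s*\}\}")


class ExpressionError(Exception):
    """Raised in strict mode when an expression cannot be resolved."""


def resolve_path(context: dict, path: str):
    current = context
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            raise ExpressionError(f"Undefined expression: {path}")
        current = current[part]
    return current


def evaluate(template: str, context: dict, strict: bool = True) -> str:
    def replace(match):
        try:
            return str(resolve_path(context, match.group(1)))
        except ExpressionError:
            if strict:
                raise
            return match.group(0)  # leave the placeholder untouched

    return _PLACEHOLDER.sub(replace, template)
```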

AgentTaskHandler Integration:
- Now uses ExpressionEvaluator for variable substitution
- Supports all expression types in message and agent name

30 new tests (317 total process engine tests passing)
Implements REST API for managing process executions:

Endpoints:
- POST /api/processes/{id}/execute - Start new execution
- GET /api/executions - List executions with filters
- GET /api/executions/{id} - Get execution detail
- POST /api/executions/{id}/cancel - Cancel running execution
- POST /api/executions/{id}/retry - Retry failed execution
- GET /api/executions/{id}/steps/{step_id}/output - Get step output

Features:
- Background task execution via BackgroundTasks
- Status filtering (pending, running, completed, failed)
- Process ID filtering
- Pagination support (limit/offset)
- Full step execution details in response
- Authentication required for all endpoints
- Auto-generated OpenAPI docs

13 new tests (330 total process engine tests passing)
Sprint 4: Definition UI (Frontend)

E3-01: Process List View
- Card grid showing all processes with status badges
- Sorting: newest, oldest, name, status
- Filtering by status (draft/published/archived)
- Actions: Execute, Edit, Delete
- Empty state with Create CTA

E3-02: YAML Editor Component
- Monaco editor integration with dynamic import
- YAML syntax highlighting
- Line numbers and word wrap
- Dark mode theme support
- Inline validation error markers
- Copy to clipboard, Cmd+S to save

E3-04: Process Editor Page
- Full editor with validation panel
- Save/Publish/Execute buttons
- Unsaved changes warning on navigation
- Default YAML template for new processes
- Keyboard shortcut (Cmd+S)

Also:
- Added processes Pinia store
- Added routes for /processes, /processes/new, /processes/:id
- Added Processes link to NavBar
- Added monaco-editor dependency
Sprint 5: Monitoring UI (Frontend)

E4-01: Execution List View
- Table with status icons (✅ ❌ 🔄 ⏳)
- Columns: Process Name, Status, Started, Duration, Cost
- Filters by status and process
- Auto-refresh (30s polling) with pause/resume
- Pagination support
- Cancel/Retry actions

E4-02: Execution Timeline View
- Step-by-step progress display
- Progress bar with completion percentage
- Duration bars relative to longest step
- Status indicators (running animation)
- Click to expand step details

E4-03: Step Detail Panel
- Expandable within timeline
- Timing info: started, completed, duration, retries
- Error display with copy button
- Output loading on demand
- Copy output to clipboard

E4-05: Process Dashboard
- Overall stats: processes, executions, success rate, cost
- Recent executions list
- Published processes with quick execute
- Quick action buttons

Also:
- Added executions Pinia store
- Added routes for /executions, /executions/:id, /process-dashboard
- Added Executions link to NavBar
Fixes:
- Updated deploy scripts to use `docker compose` (V2) instead of legacy `docker-compose`
- Fixed frontend build error by adding `js-yaml` dependency
- Fixed backend crash in `executions.py` due to double `Depends` injection
- Fixed `sqlite3.OperationalError` in `processes.py` by ensuring DB directory exists
- Fixed 404 execution error by aligning DB paths between processes and executions routers
- Fixed `AttributeError` in `processes.py` by using `.major` for version logging
- Fixed UI "Delete" and "Unsaved Changes" modals by passing `:visible="true"` to ConfirmDialog
- Added `TRINITY_DB_PATH` env var to backend service for correct persistence

Features:
- Implemented `EventLogger` service (E15-04) to persist domain events
- Added `SqliteEventRepository` implementation
- Added `Event History` tab to Execution Detail UI
- Completed Process Dashboard implementation (E4-05)

Refs: BACKLOG_MVP.md
…or debugging UI

Sprint 7 implements comprehensive error handling for process execution:

- Add RetryPolicy value object with configurable max_attempts, initial_delay, backoff_multiplier
- Add ErrorPolicy value object with on_error actions (fail_process, skip_step)
- Implement retry logic in ExecutionEngine with exponential backoff
- Fix error code propagation through fail_step chain
- Update API schema to return full error objects (code, message, retry_count)
- Enhance ExecutionTimeline UI to display error details, retry attempts
- Fix WebSocket handler to properly construct error objects from events
- Update dependency resolver to handle skipped steps correctly
- Add comprehensive unit tests for error handling scenarios
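
A minimal sketch of the retry logic with exponential backoff, using the three RetryPolicy fields named above. The default values, the `delay_for` method, and the `run_with_retry` helper are assumptions for illustration.

```python
import asyncio
from dataclasses import dataclass


@dataclass(frozen=True)
class RetryPolicy:
    max_attempts: int = 3
    initial_delay: float = 1.0
    backoff_multiplier: float = 2.0

    def delay_for(self, attempt: int) -> float:
        """Delay after the given failed attempt (1-based)."""
        return self.initial_delay * self.backoff_multiplier ** (attempt - 1)


async def run_with_retry(operation, policy: RetryPolicy, sleep=asyncio.sleep):
    """Call operation() until it succeeds or max_attempts is reached."""
    for attempt in range(1, policy.max_attempts + 1):
        try:
            return await operation()
        except Exception:
            if attempt == policy.max_attempts:
                raise  # retries exhausted; the caller applies the ErrorPolicy
            await sleep(policy.delay_for(attempt))
```

The injectable `sleep` parameter is a testing convenience so that backoff delays can be asserted without waiting.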

Stories completed: E13-01, E13-02, E13-04
Sprint 8 implements parallel step execution for faster process completion:

- Add ParallelGroup and ParallelStructure classes for parallel analysis
- DependencyResolver.get_parallel_structure() identifies parallelizable steps
- ExecutionEngine runs independent steps concurrently with asyncio.gather()
- Add ExecutionConfig.parallel_execution and max_concurrent_steps options
- API returns parallel_level for each step and has_parallel_steps flag
- UI shows parallel indicator (⫘) for steps at same execution level
- Add comprehensive unit tests for parallel detection and execution
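
One way to picture the parallel analysis: each step gets a level equal to one more than the deepest of its dependencies, and steps sharing a level can run together. The sketch below assumes an acyclic dependency mapping and uses a semaphore to honour `max_concurrent_steps`; the function names are hypothetical.

```python
import asyncio


def parallel_levels(dependencies: dict) -> dict:
    """Map each step to its execution level (0 = no dependencies)."""
    levels = {}

    def level_of(step):
        if step not in levels:
            deps = dependencies.get(step, [])
            levels[step] = 1 + max((level_of(dep) for dep in deps), default=-1)
        return levels[step]

    for step in dependencies:
        level_of(step)
    return levels


async def run_level(steps, run_step, max_concurrent_steps: int):
    """Run independent steps concurrently, capped by a semaphore."""
    semaphore = asyncio.Semaphore(max_concurrent_steps)

    async def guarded(step):
        async with semaphore:
            return await run_step(step)

    # gather() returns results in the same order as the input steps.
    return await asyncio.gather(*(guarded(step) for step in steps))
```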

Stories completed: E5-01, E5-02, E5-03
- ProcessFlowPreview: Vertical orientation by default with swimlane layout
- ProcessFlowPreview: Parallel steps grouped horizontally with dashed border
- ProcessFlowPreview: Compact design to fit without scrolling
- ExecutionTimeline: Parallel indicator (⫘) and level-based sorting
…ox UI

Sprint 9 - Human Approval:
- Add human_approval step type that pauses execution for review
- Add ApprovalRequest entity and ApprovalStatus enum
- Add approval API endpoints (list, approve, reject)
- Add Approval Inbox UI page with filtering and stats
- Add inline approve/reject buttons in execution timeline
- Fix WebSocket event serialization for StepId objects
- Add non-retryable error handling for APPROVAL_REJECTED
- Add execution resume after approval decision
Sprint 10: Gateways & Triggers implementation

Gateway Step (E7-01, E7-02, E7-03):
- Add gateway step type with exclusive/parallel/inclusive modes
- Implement ConditionEvaluator for boolean expressions (==, !=, >, <, and, or)
- GatewayHandler evaluates conditions and selects routes
- Gateway UI shows route taken and evaluated conditions
- Fix _build_step_outputs for proper condition evaluation
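
For illustration, a restricted boolean evaluator supporting only the listed operators can be built on Python's `ast` module. This sketch assumes conditions use Python-like syntax over plain variable names, which may differ from the syntax ConditionEvaluator actually accepts; `evaluate_condition` is a hypothetical name.

```python
import ast
import operator

# Only the operators listed above are allowed.
_COMPARATORS = {
    ast.Eq: operator.eq,
    ast.NotEq: operator.ne,
    ast.Gt: operator.gt,
    ast.Lt: operator.lt,
}


def evaluate_condition(expression: str, variables: dict) -> bool:
    """Evaluate a boolean expression without using eval()."""

    def visit(node):
        if isinstance(node, ast.Expression):
            return visit(node.body)
        if isinstance(node, ast.BoolOp):
            values = [visit(value) for value in node.values]
            return all(values) if isinstance(node.op, ast.And) else any(values)
        if isinstance(node, ast.Compare) and len(node.ops) == 1:
            compare = _COMPARATORS.get(type(node.ops[0]))
            if compare is None:
                raise ValueError("Unsupported comparison operator")
            return compare(visit(node.left), visit(node.comparators[0]))
        if isinstance(node, ast.Constant):
            return node.value
        if isinstance(node, ast.Name):
            if node.id not in variables:
                raise ValueError(f"Unknown variable: {node.id}")
            return variables[node.id]
        # Calls, attribute access, arithmetic, etc. are rejected.
        raise ValueError(f"Unsupported expression element: {type(node).__name__}")

    return bool(visit(ast.parse(expression, mode="eval")))
```

Walking the syntax tree with an allow-list means anything outside the supported grammar is rejected rather than executed.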

Webhook Triggers (E8-01, E8-02):
- Add TriggerConfig schema (WebhookTriggerConfig, ScheduleTriggerConfig stub)
- Implement /api/triggers endpoints (list, invoke, info)
- Triggers stored in ProcessDefinition and serialized correctly
- Optional secret authentication via X-Webhook-Secret header
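
The optional secret check could be sketched as follows, assuming the trigger stores a plain shared secret and the caller passes the `X-Webhook-Secret` header value. The function name is hypothetical; `hmac.compare_digest` is used so the comparison runs in constant time.

```python
import hmac
from typing import Optional


def is_webhook_authorized(
    configured_secret: Optional[str], provided_secret: Optional[str]
) -> bool:
    """Check the X-Webhook-Secret header value against the trigger's secret."""
    if configured_secret is None:
        return True  # the secret is optional; no secret means no check
    if provided_secret is None:
        return False
    return hmac.compare_digest(
        configured_secret.encode("utf-8"), provided_secret.encode("utf-8")
    )
```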

UI Improvements:
- Execute input dialog for providing JSON input data
- ProcessList shows Eye icon for published (view) vs Pencil for draft (edit)
- ProcessEditor guards against editing published processes
- "New Version" button for creating drafts from published processes
- Add notification step type (slack, email stub, generic webhook)
- Add webhook event publisher for process_completed/failed/approval_requested
- Add compensation handlers for step rollback on process failure
- Add CompensationConfig value object for type-safe compensation
- Add compensation domain events (Started, Completed, Failed)
- Add trigger management UI with webhook URL copy functionality
- Add execute dialog for JSON input when starting processes

Reference: BACKLOG_MVP.md - E14-01, E14-02, E15-03, E13-03, E8-03
Bugs discovered during manual testing:

1. Notification YAML parsing: Added missing NOTIFICATION branch in
   StepDefinition.from_dict() - inline fields (channel, message, url)
   were not being extracted, causing all notifications to default to
   slack channel.

2. Compensation handler registry: Fixed get_handler() → get() method
   call on StepHandlerRegistry.

3. Compensation webhook URL: Added 'url' field mapping for generic
   webhook channel (was only setting webhook_url for Slack).

Test: compensation-test.yaml triggers intentional failure to verify
compensation handlers execute correctly.
UI/UX improvements for Sprint 11:

1. Retry Execution Tracking:
   - Add retry_of field to ProcessExecution aggregate
   - Link retry executions to original failed execution
   - Display amber "Retry of: <id>" badge with link to original

2. Real-time Compensation Events via WebSocket:
   - Add CompensationStarted/Completed/Failed to WebSocket publisher
   - Add compensation event handlers in useProcessWebSocket composable
   - Display compensation events in Event History without refresh

3. Event Display Improvements:
   - Format both snake_case and PascalCase event types
   - Clear events when navigating to different execution
   - Add 500ms delay before reload to ensure events are persisted
   - Style compensation events with appropriate colors
Sprint 12 Implementation (E9 - Timer & Scheduling):
- Add cron presets (hourly, daily, weekly, monthly, weekdays)
- Add cron expression validation to ProcessValidator
- Extend scheduler service with process schedule support
- Hook schedule registration into publish/archive lifecycle
- Add schedule trigger list/info API endpoints
- Create TimerHandler for timer step type
- Add schedule trigger UI in ProcessEditor
- Display next run time in ProcessList

Test Debt Fix (Sprints 9-11):
- Add test_approval_handler.py (24 tests) - S9
- Add test_gateway_handler.py (18 tests) - S10
- Add test_webhook_triggers.py (14 tests) - S10
- Add test_notification_handler.py (20 tests) - S11
- Add test_compensation.py (22 tests) - S11
- Add test_schedule_triggers.py (23 tests) - S12
- Add test_timer_handler.py (11 tests) - S12

Bug fixes:
- Fix timer.py: context.step -> context.step_definition
- Remove unused import in timer.py

Total: 466 tests passing (was 374)
Sprint 14 Implementation:
- E11-02: Process Analytics Dashboard with metrics, trends, step performance
- E11-03: Cost Alerts system with thresholds (per-execution, daily, weekly)
- E12-01: Process Template Library with bundled templates
- E12-02: Template creation from published processes

UI Improvements:
- Add Process sub-navigation (Processes, Dashboard, Executions, Approvals)
- Add Agent sub-navigation (Agents, Files, Templates)
- Declutter main nav from 8 to 6 items
- Template selector in process creation flow

Backend:
- Analytics service with ProcessMetrics, TrendData, StepPerformance
- CostAlertService with threshold management and alert generation
- ProcessTemplateService for bundled and user templates
- New API endpoints for analytics, alerts, and templates

Frontend:
- TrendChart component for execution/cost visualization
- TemplateSelector component for process templates
- ProcessSubNav and AgentSubNav components
- Alerts page and NavBar badge integration
- Enhanced ProcessDashboard with analytics

Tests:
- 59 new unit tests for analytics, alerts, and templates
Sprint 15 - Agent Roles (EMI Pattern):
- Add AgentRole enum (Executor/Monitor/Informed)
- Add StepRoles entity with validation
- Create InformedAgentNotifier service for async notifications
- Add RoleMatrix.vue component for interactive role management
- Integrate roles tab in ProcessEditor with UI-YAML sync
- Extend ProcessValidator for role validation across all step types

Sprint 12-14 Completion:
- E9-01: Cron validation already implemented (verified)
- E10-02: Add breadcrumb navigation for nested sub-process executions
- E11-01: Cost tracking already implemented (verified)
- E11-03: Integrate CostAlertService with ExecutionEngine
- E12-01: Add customer-support bundled process template

Tests:
- Add test_roles.py for EMI pattern validation
- Add test_informed_notifier.py for notification service
- Add ExecutionEngine cost alert integration tests

Process engine adds ~27,400 LOC (~41% growth to Trinity codebase)
- Move PROCESS_DRIVEN_PLATFORM from docs/drafts/ to docs/
- Add IT5 thinking iteration (scale, reliability, enterprise)
- Add Process Engine roadmap with 14 test processes across 5 phases
- Create feature flow documentation for Process Engine:
  - README overview, process-definition, process-execution
  - process-monitoring, human-approval, process-scheduling
  - process-analytics, sub-processes, agent-roles-emi, process-templates
- Update requirements.md: Add Section 18 (Process Engine) with all features
- Update roadmap.md: Mark Phase 14 complete, add decision log entry
- Update architecture.md: Add Process Engine section, API endpoints, DB schema
- Update feature-flows.md index with Process Engine section
@cursor cursor bot left a comment
Cursor Bugbot has reviewed your changes and found 2 potential issues.
…dencies

Fix workflow convergence issue where customers requiring escalation would
never receive a response. The send-response step now depends on both
check-escalation and escalation-approval, ensuring it runs after approval
completes on the escalation path while still working on the non-escalation
path where escalation-approval is skipped.
The sla-timer step was disconnected from the workflow - it had no
depends_on and no downstream steps depended on it, making the 30-minute
timer produce no observable effect.

Added sla-warning notification step that depends on sla-timer, so when
the timer expires, a Slack warning is sent alerting the team about the
approaching SLA breach with relevant ticket details.