Skip to content

Conversation

@keshxvdayal
Copy link

@keshxvdayal keshxvdayal commented Jan 24, 2026

Detailed Implementation Plan

  1. Create OpenAPI Specification Document

    • Create data/specs/middleware_management.yaml
    • Define base paths and versioning (/ga4gh/tes/v1/middlewares)
    • Document all endpoints:
      • GET /middlewares - List all configured middlewares
      • POST /middlewares - Add new middleware (local or from GitHub)
      • GET /middlewares/{middleware_id} - Get middleware details
      • PUT /middlewares/{middleware_id} - Update middleware configuration
      • DELETE /middlewares/{middleware_id} - Remove middleware
      • PUT /middlewares/reorder - Reorder middleware stack
      • POST /middlewares/validate - Validate middleware code without adding
    • Define comprehensive schemas:
      • MiddlewareConfig - Full middleware configuration object
      • MiddlewareCreate - Request body for creating middleware
      • MiddlewareUpdate - Request body for updating middleware
      • MiddlewareList - Array response with pagination support
      • MiddlewareOrder - Order configuration for reordering
      • ValidationRequest - Code validation request
      • ErrorResponse - Standard error format
    • Document query parameters (pagination, filtering, sorting)
    • Define all possible HTTP status codes and error responses
  2. Integrate with FOCA Configuration

    • Update pro_tes/config.yaml to reference the new spec
    • Add middleware API paths to FOCA routing configuration
    • Configure CORS and authentication settings (if applicable)
    • Set up request/response validation middleware via FOCA
  3. Write API Documentation

    • Create docs/api/middleware_management.md
    • Provide usage examples with curl commands
    • Document authentication requirements
    • Explain order/priority system for middleware execution
    • Document fallback group behavior
    • Add sequence diagrams for complex operations (reordering, fallback)
  4. Validation and Review

    • Validate OpenAPI spec with openapi-generator validate
    • Generate API documentation preview with Redoc/Swagger UI
    • Create examples for each endpoint operation
    • Peer review with team focusing on:
      • API design best practices
      • Consistency with existing proTES APIs
      • GA4GH TES specification alignment

Summary by Sourcery

Add an OpenAPI specification and configuration wiring for a new runtime middleware management API in proTES.

New Features:

  • Introduce a Middleware Management REST API specification with endpoints for listing, creating, updating, deleting, reordering, and validating middlewares.

Enhancements:

  • Extend FOCA configuration to load the new middleware management OpenAPI spec with strict request/response validation and routing to middleware controllers.
  • Add a MongoDB index on middleware name to support efficient lookup and management operations.

Documentation:

  • Add middleware management architecture and API design documentation describing endpoints, data models, design decisions, and future work.

Copilot AI review requested due to automatic review settings January 24, 2026 17:38
@sourcery-ai
Copy link

sourcery-ai bot commented Jan 24, 2026

Reviewer's Guide

Adds a new OpenAPI 3.0 specification for runtime middleware management and wires it into FOCA/Connexion configuration and project docs, without yet implementing controller logic.

Sequence diagram for adding a new middleware via the OpenAPI spec

sequenceDiagram
  actor Admin
  participant Client as API_client
  participant Connexion as FOCA_Connexion
  participant Controller as MiddlewaresController
  participant DB as MongoDB_middlewares

  Admin->>Client: Prepare MiddlewareCreate_payload
  Client->>Connexion: POST /ga4gh/tes/v1/middlewares
  Connexion->>Connexion: Validate_request_against_OpenAPI
  Connexion->>Controller: addMiddleware(MiddlewareCreate)

  Controller->>DB: Query existing_middlewares_for_order
  DB-->>Controller: Existing_middlewares_with_orders
  Controller->>Controller: Compute_final_order_and_shift_stack
  Controller->>DB: Insert new_middleware_document
  DB-->>Controller: Insert_result_with_id

  Controller-->>Connexion: 201 MiddlewareCreateResponse
  Connexion-->>Client: HTTP_201_with_JSON_body
  Client-->>Admin: Show_created_middleware_id_and_order
Loading

Class diagram for middleware management OpenAPI schemas

classDiagram
  class MiddlewareConfig {
    string _id
    string name
    string class_path_string
    string[] class_path_group
    int order
    string source
    string github_url
    object config
    boolean enabled
    string created_at
    string updated_at
  }

  class MiddlewareCreate {
    string name
    string class_path_string
    string[] class_path_group
    int order
    string github_url
    object config
    boolean enabled
  }

  class MiddlewareUpdate {
    string name
    int order
    object config
    boolean enabled
  }

  class MiddlewareList {
    MiddlewareConfig[] middlewares
    int total
    int limit
    int offset
  }

  class MiddlewareCreateResponse {
    string _id
    int order
    string message
  }

  class MiddlewareOrder {
    string[] ordered_ids
  }

  class ValidationRequest {
    string class_path
    string github_url
    string code
  }

  class ValidationErrorItem {
    int line
    int column
    string message
    string severity
  }

  class ValidationWarningItem {
    int line
    string message
    string severity
  }

  class ValidationResponse {
    boolean valid
    string message
    ValidationErrorItem[] errors
    ValidationWarningItem[] warnings
  }

  class ErrorResponse {
    string error
    string message
    object details
    string timestamp
  }

  MiddlewareList "1" --> "*" MiddlewareConfig : contains
  MiddlewareCreateResponse --> MiddlewareConfig : _id_and_order_align_with
  MiddlewareOrder "1" --> "*" MiddlewareConfig : orders
  ValidationResponse "1" --> "*" ValidationErrorItem : has
  ValidationResponse "1" --> "*" ValidationWarningItem : has
Loading

File-Level Changes

Change Details Files
Integrate middleware management OpenAPI spec into FOCA API configuration and database indexes.
  • Add MongoDB index configuration for middlewares collection keyed by name.
  • Register new middleware_management OpenAPI spec in FOCA specs list with strict request/response validation enabled.
  • Configure routing to pro_tes.api.middlewares.controllers and temporarily disable authentication for these endpoints.
  • Tidy custom middleware list config (no behavioral change).
pro_tes/config.yaml
Introduce Middleware Management OpenAPI 3.0 specification describing middleware lifecycle operations and data models.
  • Define seven middleware management endpoints for listing, creating, retrieving, updating, deleting, reordering, and validating middlewares under /ga4gh/tes/v1/middlewares.
  • Add query parameters for pagination, sorting, and filtering on list endpoint and force flag on delete.
  • Specify request/response schemas for middleware CRUD, ordering, validation, and error handling, including MongoDB ObjectId patterns and GitHub URL constraints.
  • Model middleware configuration with support for single class paths and fallback groups, tracking source (local/github), timestamps, and soft-delete semantics.
pro_tes/api/middleware_management.yaml
Add high-level documentation for the Middleware Management API design and its role in the project.
  • Document the purpose and background of the middleware management feature and its design-first approach.
  • Summarize the defined endpoints, schemas, and key design decisions (ordering, soft delete, immutable class paths, GitHub integration).
  • Describe integration with FOCA/Connexion, file layout, validation strategy, and planned future subtasks for implementation and security.
  • Clarify that this PR is non-breaking and currently limited to specification and documentation work.
docs/middleware.md

Possibly linked issues

  • #N/A: PR creates the full middleware_management OpenAPI spec, schemas, endpoints, and docs exactly as the issue requests.

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

Copy link

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 2 issues, and left some high level feedback:

  • The MiddlewareCreateResponse schema only exposes _id, order, and message, but the docs in docs/middleware.md state that it returns the full middleware object; consider either expanding the response schema (e.g., embed MiddlewareConfig) or updating the docs to reflect the actual payload to avoid confusion for API consumers.
  • In ValidationRequest, the class_path field is described as "Class path or code to validate" while a separate code field is also defined; it would be clearer to tighten the descriptions and possibly add explicit rules (e.g., mutual exclusivity or precedence) so clients know exactly how these fields should be used together.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The `MiddlewareCreateResponse` schema only exposes `_id`, `order`, and `message`, but the docs in `docs/middleware.md` state that it returns the full middleware object; consider either expanding the response schema (e.g., embed `MiddlewareConfig`) or updating the docs to reflect the actual payload to avoid confusion for API consumers.
- In `ValidationRequest`, the `class_path` field is described as "Class path or code to validate" while a separate `code` field is also defined; it would be clearer to tighten the descriptions and possibly add explicit rules (e.g., mutual exclusivity or precedence) so clients know exactly how these fields should be used together.

## Individual Comments

### Comment 1
<location> `pro_tes/config.yaml:82` </location>
<code_context>
+      - api/middleware_management.yaml
+    add_operation_fields:
+      x-openapi-router-controller: pro_tes.api.middlewares.controllers
+    disable_auth: True
+    connexion:
+      strict_validation: True
</code_context>

<issue_to_address>
**🚨 issue (security):** Exposing middleware management without auth looks risky for production use.

Since this endpoint can change the middleware stack at runtime, keeping `disable_auth: True` lets anyone with network access reorder/enable/disable/add middlewares. Unless this is strictly limited to a trusted/debug environment, this should require authz (or be configurable per deployment) to avoid a straightforward privilege escalation path.
</issue_to_address>

### Comment 2
<location> `pro_tes/api/middleware_management.yaml:572-358` </location>
<code_context>
+            - "507f1f77bcf86cd799439012"
+            - "507f1f77bcf86cd799439013"
+
+    ValidationRequest:
+      type: object
+      description: Request body for validating middleware code
+      required:
+        - class_path
+      properties:
</code_context>

<issue_to_address>
**issue (bug_risk):** ValidationRequest requires class_path even when validating raw code, which conflicts with the description.

The schema contradicts the docs: `code` is documented as an alternative to `class_path`, but `class_path` is required. This blocks use cases where clients submit only raw code or only a `github_url`. Please adjust the schema (e.g., with `oneOf`/`anyOf` to require at least one of `class_path`, `github_url`, or `code`, or by making `class_path` optional and clarifying the validation rules) so the spec matches the intended behavior.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds a comprehensive OpenAPI 3.0 specification for middleware management functionality in proTES, enabling dynamic management of middleware components at runtime. This is the first subtask in a multi-part implementation following a design-first approach, focusing solely on the API specification and documentation without implementing the actual controller logic.

Changes:

  • Added complete OpenAPI 3.0 specification defining 7 REST endpoints for middleware CRUD operations, reordering, and validation
  • Configured FOCA integration with new middleware management API routes and MongoDB collection
  • Added comprehensive documentation describing the API design, endpoints, data models, and implementation plan

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 17 comments.

File Description
pro_tes/api/middleware_management.yaml New OpenAPI 3.0 specification defining middleware management API with 7 endpoints and 9 schemas
pro_tes/config.yaml Added FOCA integration for middleware API, database collection configuration, and routing setup
docs/middleware.md Comprehensive documentation covering API design, implementation details, security considerations, and future work
Comments suppressed due to low confidence (1)

docs/middleware.md:210

  • The changelog date "2026-01-24" is in the future. This should be corrected to use the actual date when this work was completed.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

description: |
Retrieve all configured middlewares with their order, metadata, and status.
Results are sorted by execution order (ascending) by default.
operationId: listMiddlewares
Copy link

Copilot AI Jan 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The operationId naming convention is inconsistent with the existing TES API. The TES API uses PascalCase (GetServiceInfo, CreateTask, GetTask, CancelTask, ListTasks) while the middleware API uses camelCase (listMiddlewares, addMiddleware, getMiddleware, updateMiddleware, deleteMiddleware, reorderMiddlewares, validateMiddleware). For consistency with the existing GA4GH TES API in this codebase, consider using PascalCase for all operationIds in the middleware API.

Copilot uses AI. Check for mistakes.
local class paths or fetched from GitHub repositories. If order is not specified,
the middleware is appended to the end of the stack. If order is specified,
existing middlewares at that position or higher are shifted up by one.
operationId: addMiddleware
Copy link

Copilot AI Jan 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The operationId naming convention is inconsistent with the existing TES API. Consider using PascalCase (AddMiddleware instead of addMiddleware) to match the pattern used in the existing API (e.g., CreateTask, GetTask, etc. in pro_tes/api/9e9c5aa.task_execution_service.openapi.yaml).

Copilot uses AI. Check for mistakes.
get:
summary: Get middleware details
description: Retrieve detailed information about a specific middleware by ID
operationId: getMiddleware
Copy link

Copilot AI Jan 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The operationId naming convention is inconsistent with the existing TES API. Consider using PascalCase (GetMiddleware instead of getMiddleware) to match the pattern used in the existing API.

Copilot uses AI. Check for mistakes.
description: |
Remove a middleware from the execution stack. By default performs soft delete
(sets enabled=false). Use force=true query parameter for hard deletion.
operationId: deleteMiddleware
Copy link

Copilot AI Jan 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The operationId naming convention is inconsistent with the existing TES API. Consider using PascalCase (DeleteMiddleware instead of deleteMiddleware) to match the pattern used in the existing API.

Copilot uses AI. Check for mistakes.
Comment on lines 572 to 589
ValidationRequest:
type: object
description: Request body for validating middleware code
required:
- class_path
properties:
class_path:
type: string
description: Class path or code to validate
example: "pro_tes.plugins.middlewares.task_distribution.distance.TaskDistributionDistance"
github_url:
type: string
description: GitHub URL for fetching middleware code
nullable: true
pattern: '^https://raw\.githubusercontent\.com/.+\.py$'
example: "https://raw.githubusercontent.com/user/repo/main/middleware.py"
code:
type: string
description: Raw Python code to validate (alternative to class_path)
nullable: true
Copy link

Copilot AI Jan 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ValidationRequest schema marks 'class_path' as required but also provides 'code' as an "alternative to class_path" (line 590). This is contradictory - if 'code' is truly an alternative, then 'class_path' should not be required. Consider using oneOf to express that either 'class_path' or 'code' is required, but not both, or clarify the validation logic in the description.

Copilot uses AI. Check for mistakes.
middlewares:
indexes:
- keys:
name: 1
Copy link

Copilot AI Jan 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The database index is created on the 'name' field, but the OpenAPI spec indicates that both 'name' and 'class_path' should be unique (based on the 400 error description mentioning 'duplicate name/class_path'). Consider adding indexes on both 'name' and 'class_path' fields with unique constraints to enforce this at the database level, or clarify whether only 'name' needs to be unique.

Suggested change
name: 1
name: 1
options:
"unique": True
- keys:
class_path: 1
options:
"unique": True

Copilot uses AI. Check for mistakes.
Comment on lines 104 to 122
## File Structure
```
pro_tes/
├── api/
│ └── middleware_management.yaml (OpenAPI specification)
└── config.yaml (FOCA integration)

docs/
└── middleware.md (This file)
```

## Documentation Deliverables

**API Documentation**: Comprehensive guide with request/response examples for each endpoint. Includes curl commands, common use cases, and troubleshooting tips.

**Architecture Decision Record**: Documents twelve major design decisions with rationale, alternatives considered, and consequences. Serves as reference for future development.

**Postman Collection**: Ready-to-use collection with fourteen pre-configured requests. Includes environment variables, test scripts, and example data for all scenarios.

**Quick Reference**: Single-page reference with essential endpoints, parameters, and response codes. Designed for daily development use.

**Validation Script**: Bash script that validates OpenAPI syntax using multiple tools. Checks for common errors like undefined schema references and invalid endpoint definitions.

## Testing Approach

This subtask focuses on specification validation rather than runtime testing since no executable code is implemented yet. Validation performed:

**YAML Syntax**: Verified file parses correctly as valid YAML without syntax errors.

**OpenAPI Compliance**: Confirmed specification follows OpenAPI 3.0 standards including required fields, valid schema definitions, and proper reference resolution.

Copy link

Copilot AI Jan 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documentation references several files that are not included in this PR: 'docs/api/middleware_management.md', 'middleware_management.postman_collection.json', 'QUICK_REFERENCE.md', 'docs/architecture/middleware_api_design.md', and 'scripts/validate_openapi.sh'. Either these files should be included in this PR, or the documentation should clarify that these are planned deliverables for future PRs. The "Documentation Deliverables" section also describes these files in detail, which creates confusion about what is actually delivered in this subtask.

Copilot uses AI. Check for mistakes.
Comment on lines 647 to 671
ErrorResponse:
type: object
description: Standard error response
required:
- error
- message
properties:
error:
type: string
description: Error type/code
example: "MiddlewareNotFound"
message:
type: string
description: Human-readable error message
example: "Middleware with ID '507f1f77bcf86cd799439011' not found"
details:
type: object
description: Additional error details
nullable: true
additionalProperties: true
timestamp:
type: string
format: date-time
description: Error timestamp
example: "2026-01-24T10:30:00Z"
Copy link

Copilot AI Jan 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ErrorResponse schema is inconsistent with the existing error handling convention. The FOCA exception configuration (pro_tes/config.yaml:120) requires error responses to have 'message' and 'code' fields, but the ErrorResponse schema defines 'error', 'message', 'details', and 'timestamp' fields without a 'code' field. For consistency with the existing error handling (see pro_tes/exceptions.py:52-121), the ErrorResponse schema should include a 'code' field containing the HTTP status code, and consider using 'code' instead of or in addition to 'error' for the error type.

Copilot uses AI. Check for mistakes.
Comment on lines 156 to 172
- "pro_tes.plugins.middlewares.task_distribution.random.TaskDistributionRandom"

Copy link

Copilot AI Jan 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trailing whitespace on empty line. Remove the trailing spaces for consistency with code style.

Suggested change
- "pro_tes.plugins.middlewares.task_distribution.random.TaskDistributionRandom"
- "pro_tes.plugins.middlewares.task_distribution.random.TaskDistributionRandom"

Copilot uses AI. Check for mistakes.
Comment on lines 84 to 90
```yaml
specs:
- path:
- api/middleware_management.yaml
add_operation_fields:
x-openapi-router-controller: pro_tes.api.middlewares.controllers
disable_auth: True
Copy link

Copilot AI Jan 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The FOCA configuration for the middleware management API is setting disable_auth: True, which exposes all middleware management endpoints without any authentication. In any non-isolated environment this allows an unauthenticated attacker to list, create, modify, or delete middlewares, potentially leading to arbitrary code execution or unauthorized changes to workflow behavior. Require authentication for these endpoints (e.g., by wiring them into your existing security scheme instead of using disable_auth) and, if needed, restrict unauthenticated access to development-only configs that are not deployed to production.

Copilot uses AI. Check for mistakes.
@keshxvdayal keshxvdayal force-pushed the feature/middleware-api-spec branch from b3118e8 to 5025c12 Compare January 24, 2026 18:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants