Conversation
|
Claude finished @amdove's task —— View job Review Complete
I've submitted a review with feedback on the session PDB implementation. Key suggestions:
|
There was a problem hiding this comment.
Code Review: Session PDBs for Workbench
Summary
This PR refactors CreateOrUpdateDisruptionBudget to accept a pre-built PDB spec and adds a new session PDB for workbench pods. The change improves separation of concerns and adds protection for user sessions during node drains.
Positive Aspects
- The refactored
CreateOrUpdateDisruptionBudgetsignature is cleaner - separating PDB construction from persistence improves testability - The new
DefineSessionDisruptionBudgetfunction is well-documented and follows existing patterns - Consistent updates across all three product controllers (Connect, PackageManager, Workbench)
Suggestions
api/product/disruption_budget.go:61-63
Consider extracting the launcher-instance-id label key as a constant since it represents an external contract with the launcher:
const LauncherInstanceIDLabel = "launcher-instance-id"api/product/disruption_budget.go:66-68
maxUnavailable=0 will block node drains indefinitely until sessions complete. This is likely intentional for session persistence, but worth documenting in operator docs that cluster admins need to ensure sessions have reasonable timeouts or plan for manual intervention during maintenance windows.
internal/controller/core/workbench.go:1006-1010
Per the controller review checklist: tests should exist for this change. Consider adding a test case to workbench_test.go to verify the session PDB is created with:
- The expected name pattern (
{component}-sessions) - The correct selector (
launcher-instance-id: {component}) maxUnavailable=0
Checklist Review
- Reconciliation appears idempotent (uses
CreateOrUpdateResource) - Config flows correctly from Workbench
- Tests for the new session PDB (suggested above)
Adds Pod Disruption Budgets (PDBs) for Workbench session pods to prevent user sessions from being evicted during node drains or cluster maintenance.
Problem
Previously, only Workbench server pods had PDB protection. Session pods (the user's R/Python sessions created by Launcher) had no eviction protection. During node drains or cluster scaling, Kubernetes could terminate active user sessions without warning.
Solution
Creates a second PDB for each Workbench spec that targets session pods:
This ensures user sessions remain persistent and uninterruptible during cluster operations.