Skip to content

feat(backups): implement Job backup strategy#1721

Draft
lllamnyp wants to merge 1 commit into
mainfrom
feat/jobdriver-implementation
Draft

feat(backups): implement Job backup strategy#1721
lllamnyp wants to merge 1 commit into
mainfrom
feat/jobdriver-implementation

Conversation

@lllamnyp
Copy link
Copy Markdown
Member

@lllamnyp lllamnyp commented Dec 12, 2025

What

Implements the strategy.backups.cozystack.io/Job backup strategy — a generic, application-agnostic driver that runs a user-supplied Kubernetes Job to create (and restore) a backup.

Background

This PR began as a draft written against an earlier shape of the backups API. The API has since moved to a BackupClass-based dispatch model, which broke the original draft on three counts:

  • BackupJob.Spec.StrategyRef and BackupJob.Spec.StorageRef were removed; the strategy and its parameters are now resolved from a BackupClass into a ResolvedBackupConfig.
  • Strategy handling moved from standalone per-strategy reconcilers to a dispatch model: BackupJobReconciler.Reconcile / RestoreJobReconciler.Reconcile resolve the strategy and switch into per-strategy methods (reconcileVelero, reconcileCNPG, reconcileAltinity, …).
  • The draft's template context (.Values, .Storage) diverged from the convention the other strategies adopted, and its .Storage half targeted a storage API that does not exist.

What changed

  • reconcileJob / reconcileJobRestore (internal/backupcontroller/jobstrategy_controller.go): replaces the standalone BackupJobStrategyReconciler with the two dispatch methods the BackupJob / RestoreJob reconcilers call. The Job strategy is the generic counterpart of the Altinity driver — same lifecycle (render PodTemplateSpec → run a one-shot batch/v1.Job → poll its terminal condition → emit a Backup), minus the ClickHouse-specific applicationRef gate, so it runs for any application. Shared package helpers (renderJobTemplate, jobConditionState, getApplicationUnstructured, status helpers, deterministic Job naming, controller refs) are reused.
  • Template context follows the in-tree convention used by the other drivers: .Application (live application object), .Release.Name / .Release.Namespace, .Mode (backup / restore), .Parameters (resolved BackupClass parameters), and .Backup (restore runs only). Strategy parameters round-trip through the Backup's driverMetadata so a later restore re-renders against the same values.
  • Restore re-renders the same template with .Mode == "restore". Cross-namespace restore is unsupported (the RestoreJob target reference is namespace-local), matching the Altinity driver.
  • JobSpec.Template doc + regenerated CRD: the field doc previously advertised .Values; it now documents the real context, and strategy.backups.cozystack.io_jobs.yaml is regenerated to match (the change is limited to that field's description).
  • Tests (jobstrategy_controller_test.go): backup-Job creation / args / labels / status, completion → Backup artifact with parameter round-trip, Job-failure surfacing, missing-strategy terminal failure, restore into the target namespace, restore completion, and the parameter-prefix unit. A deliberately generic application kind pins that the driver never gates on applicationRef.kind.

No RBAC changes are required: the controller already holds strategy.backups.cozystack.io/*, batch/jobs, apps.cozystack.io/*, and backups permissions.

Notes for reviewers

  • Backup and restore share one template, so a strategy author's PodTemplateSpec must branch on .Mode to do anything restore-specific. This matches the existing convention; worth confirming it's the intended contract for the Job strategy's docs/examples.

🤖 Generated with Claude Code

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Dec 12, 2025

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: d07bd4ac-e674-4d05-98c8-1ed100f75a52

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/jobdriver-implementation

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@gemini-code-assist
Copy link
Copy Markdown
Contributor

Summary of Changes

Hello @lllamnyp, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request lays the foundational groundwork for a robust backup system by implementing the BackupJobStrategyReconciler. This controller intelligently interprets BackupJob resources, dynamically retrieves related application and strategy definitions, and then leverages a newly introduced generic templating engine to construct and deploy Kubernetes batch/v1.Job resources. This design ensures a highly configurable and adaptable approach to managing application backups within the Kubernetes ecosystem.

Highlights

  • BackupJob Strategy Reconciler: Introduced the initial implementation of the BackupJobStrategyReconciler which processes BackupJob custom resources. This reconciler is responsible for orchestrating backup operations based on defined strategies.
  • Dynamic Resource Fetching: The reconciler now dynamically fetches associated application and strategy resources using Kubernetes' dynamic client and REST mapper, allowing for flexible interaction with various custom resource definitions.
  • Generic Templating Utility: A new generic templating utility has been added, enabling the dynamic population of fields within Kubernetes object specifications. This is crucial for creating flexible and configurable backup jobs.
  • Kubernetes Job Creation: The controller now creates Kubernetes batch/v1.Job resources. These jobs are constructed using a templated PodTemplateSpec derived from the JobStrategy object, allowing for customized backup execution.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a new JobStrategy implementation for BackupJob resources, allowing the creation of Kubernetes Jobs based on templated PodTemplateSpec definitions. It also includes a generic templating utility to process Go template expressions within Kubernetes objects. The changes are a good step towards a flexible backup strategy. However, there are a few areas that could be improved for robustness and error handling, particularly regarding the templating logic and handling of API group references.

Comment on lines +19 to +25
templateFunc := func(in string) string {
out, err := template(in, templateContext)
if err != nil {
return in
}
return out
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The templateFunc silently swallows errors from template(in, templateContext), returning the original string in instead. This can lead to subtle bugs where template expressions are malformed or refer to non-existent fields, but no error is propagated, and the resulting object contains untemplated strings. It would be more robust to propagate these errors or at least log them with a higher severity to make debugging easier.

Suggested change
templateFunc := func(in string) string {
out, err := template(in, templateContext)
if err != nil {
return in
}
return out
}
templateFunc := func(in string) string {
out, err := template(in, templateContext)
if err != nil {
// TODO: Log this error or return it from Template function for better error visibility
return in
}
return out
}

Comment on lines +56 to +64
if j.Spec.ApplicationRef.APIGroup != nil {
applicationRefAPIGroup = *j.Spec.ApplicationRef.APIGroup
}
if j.Spec.StrategyRef.APIGroup != nil {
strategyRefAPIGroup = *j.Spec.StrategyRef.APIGroup
}
if j.Spec.StorageRef.APIGroup != nil {
storageRefAPIGroup = *j.Spec.StorageRef.APIGroup
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The APIGroup fields in TypedLocalObjectReference are pointers to strings. If they are nil, the current logic initializes applicationRefAPIGroup, strategyRefAPIGroup, and storageRefAPIGroup to empty strings. While an empty string for APIGroup typically refers to the core Kubernetes API group, this behavior is implicit and not explicitly handled or documented. It might be clearer to explicitly check for nil and assign "" if that's the intended behavior for core API group resources, or handle it as an error if APIGroup is always expected to be set for non-core resources.

var applicationRefAPIGroup string
var strategyRefAPIGroup string
var storageRefAPIGroup string
if j.Spec.ApplicationRef.APIGroup != nil {
	applicationRefAPIGroup = *j.Spec.ApplicationRef.APIGroup
} else {
	// Default to core API group if not specified
	applicationRefAPIGroup = ""
}
if j.Spec.StrategyRef.APIGroup != nil {
	strategyRefAPIGroup = *j.Spec.StrategyRef.APIGroup
} else {
	strategyRefAPIGroup = ""
}
if j.Spec.StorageRef.APIGroup != nil {
	storageRefAPIGroup = *j.Spec.StorageRef.APIGroup
} else {
	storageRefAPIGroup = ""
}

Comment on lines +105 to +108
values, ok := app.Object["spec"].(map[string]any)
if !ok {
values = map[string]any{}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The type assertion app.Object["spec"].(map[string]any) and subsequent fallback to an empty map if the assertion fails could mask potential issues. If the application object's spec is expected to always be a map[string]any, a failure here might indicate a malformed resource or an unexpected API version. It would be safer to log a warning or return an error in such cases, rather than silently proceeding with an empty map, which could lead to incorrect template rendering.

values, ok := app.Object["spec"].(map[string]any)
if !ok {
	log.Info("Application object 'spec' field is not a map[string]any, using empty values", "application", client.ObjectKeyFromObject(app))
	values = map[string]any{}
}

@lllamnyp lllamnyp mentioned this pull request Dec 12, 2025
19 tasks
Base automatically changed from feat/backup-jobdriver-api to main December 15, 2025 06:53
@github-actions github-actions Bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress and removed do-not-merge labels Apr 27, 2026
@lllamnyp lllamnyp force-pushed the feat/jobdriver-implementation branch from 15ce380 to 89c72b0 Compare May 22, 2026 08:38
@github-actions github-actions Bot added area/uncategorized PR auto-labeler could not map title scope to a known area/*; please review size/XXL This PR changes 1000+ lines, ignoring generated files labels May 22, 2026
@lllamnyp lllamnyp changed the title [backups][WIP] Draft Job strategy implementation feat(backups): implement Job backup strategy May 22, 2026
@github-actions github-actions Bot added area/storage Issues or PRs related to storage (linstor, seaweedfs, bucket, velero, harbor) kind/feature Categorizes issue or PR as related to a new feature labels May 22, 2026
Complete the strategy.backups.cozystack.io/Job driver as the generic,
app-agnostic counterpart to the Altinity driver: render the strategy
PodTemplateSpec, run it as a one-shot batch/v1.Job, poll its terminal
condition, and emit a Backup. Wire it into the existing BackupClass
dispatch via reconcileJob/reconcileJobRestore rather than a standalone
reconciler, matching the post-BackupClass backups API (the old draft
read BackupJob.Spec.StrategyRef/StorageRef, which no longer exist).

Template context follows the in-tree convention (.Application, .Release,
.Mode, .Parameters, .Backup); the JobSpec.Template doc and regenerated
CRD are updated to match. Restore re-renders the same template with
.Mode=restore; cross-namespace restore is unsupported, mirroring
Altinity.

Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
@lllamnyp lllamnyp force-pushed the feat/jobdriver-implementation branch from 89c72b0 to 0291149 Compare May 22, 2026 13:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/storage Issues or PRs related to storage (linstor, seaweedfs, bucket, velero, harbor) area/uncategorized PR auto-labeler could not map title scope to a known area/*; please review do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress kind/feature Categorizes issue or PR as related to a new feature size/XXL This PR changes 1000+ lines, ignoring generated files

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant