docs: addon-dataprotection-cleanup-order-guide (N=4 reproducible KB DataProtection cleanup-order finalizer deadlock)#92

Merged
weicao merged 1 commit into main from william/dataprotection-cleanup-order-guide
May 7, 2026
@weicao commented May 7, 2026

Summary

Why

Across four successive MySQL line tasks (#11, #14, #15, #16), increasingly strict cleanup workarounds all hit the same finalizer deadlock.

The methodology is reusable by any addon that uses KubeBlocks DataProtection resources. The source-level root cause has not been confirmed yet and is marked as pending source-code reading.

Allen curator adjustments

  • Rebased the branch onto current main.
  • Removed the placeholder link to a case directory that does not exist yet.
  • Narrowed the root-cause language: observed layer is control_plane / KB DataProtection cleanup-order; exact controller path remains a hypothesis until source reading.
  • Added the callable dataprotection-cleanup-order skill and indexed it in skills/README.md.

Test plan

  • git diff --check origin/main...HEAD
  • guide/case intro metadata coverage check
  • docs/SKILL-INDEX.md coverage check
  • docs/skills local link check
  • skill frontmatter check
  • plugin manifest validation (passes with existing root-context warning)
  • /dataprotection-cleanup-order Say exactly: LOADED via fresh local skill smoke
  • PR body and commit attribution check clean; human coauthors William / Henry are allowed

…l finalizer deadlock

Documents N=4 reproducible KB DataProtection cleanup-order finalizer deadlock
pattern observed across MySQL Addon line task #11 / #14 / #15 / #16:

- task #11 (case A): deletionPolicy=Delete + ns Terminating + cleanup Job RoleBinding admission deny
- task #14 (case B): Retain prophylactic alone insufficient; finalizer still stuck
- task #15 (case C): Retain + sequential delete still insufficient
- task #16 (case D): Retain + sequential + user-managed MinIO server last still insufficient

Body covers:
- Dependency chain explanation (the namespace controller waits on BackupRepo,
  which waits on Backup, which waits on a DataProtection controller reconcile,
  which waits on ns-scoped operations that admission rejects while the ns is
  Terminating)
- Recommended sequential delete order with wait-NotFound at each layer
- Emergency force-remove finalizer pattern with evidence transparency
- Controller log filter discipline (bounded window, explicit scope)
- Anti-patterns
- 4-case appendix table linking back to MySQL line tasks
- Upstream fix hypotheses pending source-code reading
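
The "sequential delete order with wait-NotFound at each layer" item can be sketched as a small plan builder. This is a minimal illustration, not the guide's actual procedure: the function name `build_cleanup_plan`, the resource names, and the exact layer order passed in are assumptions, and the CRD names should be checked with `kubectl api-resources` on the target cluster. The sketch only builds argv lists; it does not run anything.

```python
import shlex
from typing import List, Sequence, Tuple

# (kind, name, namespaced) in the order the layers should be removed.
# The caller supplies the order; nothing here is taken from the guide itself.
Resource = Tuple[str, str, bool]


def build_cleanup_plan(
    resources: Sequence[Resource], namespace: str, timeout: str = "120s"
) -> List[List[str]]:
    """Return kubectl argv lists: delete one layer, wait until it is gone, repeat."""
    plan: List[List[str]] = []
    for kind, name, namespaced in resources:
        scope = ["-n", namespace] if namespaced else []
        target = f"{kind}/{name}"
        # Issue the delete without blocking, then wait explicitly with a timeout
        # so a stuck finalizer shows up as a failed step at a known layer.
        plan.append(["kubectl", "delete", target, *scope, "--wait=false"])
        plan.append(
            ["kubectl", "wait", "--for=delete", target, *scope, f"--timeout={timeout}"]
        )
    # The namespace goes last, after every dependent layer has been confirmed gone.
    plan.append(["kubectl", "delete", "namespace", namespace, "--wait=false"])
    plan.append(
        ["kubectl", "wait", "--for=delete", f"namespace/{namespace}", f"--timeout={timeout}"]
    )
    return plan


if __name__ == "__main__":
    # Hypothetical resource names, for illustration only.
    demo = [
        ("backups.dataprotection.kubeblocks.io", "demo-backup", True),
        ("backuprepos.dataprotection.kubeblocks.io", "demo-repo", False),
    ]
    for argv in build_cleanup_plan(demo, "demo-ns"):
        print(shlex.join(argv))
```

Keeping the plan as data makes it easy to log which layer timed out before deciding whether any further intervention is justified.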

Companion sibling doc to addon-narrow-scope-force-delete-guide.md (pod-level
finalizer deadlock during image pull); both document finalizer deadlock at
different scopes with the "evidence-first then targeted force-remove" doctrine.
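
The "evidence-first then targeted force-remove" doctrine can be illustrated with a two-step sketch: capture the stuck object's full state, then clear finalizers on that one object only. The helper name `force_remove_steps` and the resource names in the usage line are hypothetical; the merge patch that sets `metadata.finalizers` to null is the commonly used kubectl pattern, not something quoted from the guide.

```python
import json
from typing import List, Optional

# JSON merge patch that clears every finalizer on a single object.
FINALIZER_PATCH = json.dumps({"metadata": {"finalizers": None}})


def force_remove_steps(
    kind: str, name: str, namespace: Optional[str] = None
) -> List[List[str]]:
    """Return [evidence step, patch step] as kubectl argv lists for one object."""
    scope = ["-n", namespace] if namespace else []
    target = f"{kind}/{name}"
    return [
        # Step 1: record the object (finalizers, deletionTimestamp, status)
        # before touching it, so the intervention stays auditable.
        ["kubectl", "get", target, *scope, "-o", "yaml"],
        # Step 2: targeted removal on this object only, never a bulk selector.
        ["kubectl", "patch", target, *scope, "--type=merge", "-p", FINALIZER_PATCH],
    ]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for argv in force_remove_steps(
        "backups.dataprotection.kubeblocks.io", "demo-backup", "demo-ns"
    ):
        print(argv)
```

Saving the output of the first step alongside the task record keeps the force-remove transparent to later readers.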

Adds:
- New file: docs/addon-dataprotection-cleanup-order-guide.md
- SKILL-INDEX.md: entries in section 4 (Runtime / Troubleshooting, 运行期/排障) and in the full document list (文档全列表)
- addon-narrow-scope-force-delete-guide.md: reverse anchor in "Relationship to other documents" (与其他文档的关系)

Co-authored-by: William <william@apecloud.com>
Co-authored-by: Henry <henry@apecloud.com>
@weicao force-pushed the william/dataprotection-cleanup-order-guide branch from 7e2353b to 828986e on May 7, 2026 08:23
@weicao merged commit ff958bf into main on May 7, 2026
@weicao deleted the william/dataprotection-cleanup-order-guide branch on May 7, 2026 08:26