fix(storage): markSnapshotAsVolume before delete origin volume bits #4014
MatheMatrix wants to merge 1 commit into
ZSTAC-76704 / TIC-5745: in APIUndoSnapshotCreationMsg, after blockCommit succeeds on the data plane, a failure when deleting the origin volume bits aborted the undo flow and rolled back, leaving the management-plane install path pointing at the origin path that was already committed away on the data plane (control/data plane inconsistency).

Re-order the undo flows so MarkSnapshotAsVolume (update-db-install-path) runs before delete-origin-volume-bits, and on delete failure submit a PrimaryStorageDeleteBitGC job instead of failing the whole undo. Backport of 5.4.0 MR 8211 to 4.8.36.

Add UndoSnapshotCreationDeleteBitsFailureCase covering: undo still succeeds on delete failure, volume install path updated, flow ordering, and GC job submission.

Change-Id: Ia82c3126bf4ad5815584f6ceaaee42beb8d25b38
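The effect of the re-ordering can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions: `UndoFlowSketch`, `Result`, `oldOrder`, `newOrder`, and the example paths are hypothetical names invented for illustration, and the code does not use ZStack's real flow-chain or GC APIs. It assumes that update-db-install-path switches the install path from the origin path to some new path, and it models the GC job as a plain list entry.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the undo flow; not ZStack's actual FlowChain API.
final class UndoFlowSketch {
    /** Observable outcome of one simulated undo run. */
    static final class Result {
        boolean undoSucceeded;
        String installPath;                               // management-plane view of the volume
        final List<String> gcJobs = new ArrayList<>();    // paths queued for later cleanup
        final List<String> executed = new ArrayList<>();  // flow names in execution order
    }

    /** Old ordering: delete first; a delete failure aborts the whole undo. */
    static Result oldOrder(String originPath, String newPath, boolean deleteFails) {
        Result r = new Result();
        r.installPath = originPath;
        r.executed.add("delete-origin-volume-bits");
        if (deleteFails) {
            // Models trigger.fail: later flows never run, installPath stays stale.
            r.undoSucceeded = false;
            return r;
        }
        r.executed.add("update-db-install-path");
        r.installPath = newPath;
        r.undoSucceeded = true;
        return r;
    }

    /** New ordering: update the DB first; a delete failure only queues a GC job. */
    static Result newOrder(String originPath, String newPath, boolean deleteFails) {
        Result r = new Result();
        r.installPath = originPath;
        r.executed.add("update-db-install-path");
        r.installPath = newPath;
        r.executed.add("delete-origin-volume-bits");
        if (deleteFails) {
            // Stands in for submitting a PrimaryStorageDeleteBitGC job.
            r.gcJobs.add(originPath);
        }
        r.undoSucceeded = true; // models trigger.next()
        return r;
    }

    public static void main(String[] args) {
        // Example paths are made up for the demonstration.
        Result before = oldOrder("/vol/origin.qcow2", "/vol/new.qcow2", true);
        Result after = newOrder("/vol/origin.qcow2", "/vol/new.qcow2", true);
        System.out.println("old: ok=" + before.undoSucceeded + " installPath=" + before.installPath);
        System.out.println("new: ok=" + after.undoSucceeded + " installPath=" + after.installPath
                + " gc=" + after.gcJobs);
    }
}
```

In this model the old ordering leaves `installPath` at the origin path when the delete fails, while the new ordering updates it and defers cleanup of the origin path to the queued GC entry.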
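Change overview
In the snapshot undo flow, the order of the steps was adjusted: first execute
Change details
Resilient deletion and garbage collection in the snapshot undo flow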
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
🧹 Nitpick comments (1)
test/src/test/groovy/org/zstack/test/integration/storage/snapshot/UndoSnapshotCreationDeleteBitsFailureCase.groovy (1)
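83-97: ⚡ Quick win
Consider also asserting the target path of delete-bits.
At present the test only uses the installPath stored in the DB to prove that MarkSnapshotAsVolume ran first. It does not check that DeleteVolumeBitsOnPrimaryStorageMsg still deletes originPath. If a later change mistakenly passes the new path to delete-bits, this case could still pass. Consider recording the path carried by the delete request in the simulator as well, and asserting that it equals originPath.
🤖 Prompt for AI Agents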
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/src/test/groovy/org/zstack/test/integration/storage/snapshot/UndoSnapshotCreationDeleteBitsFailureCase.groovy` around lines 83 - 97, In the env.afterSimulator(LocalStorageKvmBackend.DELETE_BITS_PATH) handler (the DeleteBitsRsp block) record the delete request's target path from the incoming simulator message and assert it equals the expected originPath in addition to the existing DB installPath check; specifically, inside the afterSimulator closure capture the path field from the delete-bits request payload, store it (e.g. deleteRequestPath) and add an assertion that deleteRequestPath == originPath so the test fails if delete-bits is invoked on a different path than the original snapshot.
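The suggestion can be sketched in isolation as follows. The actual test is a Groovy integration case built on the project's simulator hooks. The Java below is only a hypothetical stand-in: `DeleteBitsPathCheck`, `RecordingSimulator`, `handleDelete`, and `deletedOnlyOrigin` are invented names, and requiring that every recorded path equals `originPath` is one possible reading of the review comment.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the simulator hook; not the project's test framework.
final class DeleteBitsPathCheck {
    /** Fake delete-bits endpoint: records each requested path, then reports failure. */
    static final class RecordingSimulator {
        final List<String> requestedPaths = new ArrayList<>();

        boolean handleDelete(String path) {
            requestedPaths.add(path);
            return false; // injected failure, as in the failure case under test
        }
    }

    /** True only if delete-bits was requested at least once and always on originPath. */
    static boolean deletedOnlyOrigin(RecordingSimulator sim, String originPath) {
        return !sim.requestedPaths.isEmpty()
                && sim.requestedPaths.stream().allMatch(originPath::equals);
    }

    public static void main(String[] args) {
        RecordingSimulator sim = new RecordingSimulator();
        String originPath = "/vol/origin.qcow2"; // made-up example path
        sim.handleDelete(originPath);            // what the undo flow is expected to request
        System.out.println("deleted only origin: " + deletedOnlyOrigin(sim, originPath));
    }
}
```

With a check of this kind, a regression that passes the updated install path to delete-bits would be caught even though the DB assertion on installPath still holds.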
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: e786214e-da75-47f2-92ce-04c14acdaad5
📒 Files selected for processing (2)
storage/src/main/java/org/zstack/storage/volume/VolumeBase.java
test/src/test/groovy/org/zstack/test/integration/storage/snapshot/UndoSnapshotCreationDeleteBitsFailureCase.groovy
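Problem
ZSTAC-76704 / TIC-5745: in APIUndoSnapshotCreationMsg, after blockCommit succeeds on the data plane, a failure to delete the origin volume's underlying data (delete-origin-volume-bits) triggers a control-plane rollback. The management-plane installPath is left pointing at the old path that has already been committed away, so the control plane and data plane are inconsistent, and later snapshot deletions fail with unable to find volume.
Root cause
In the undo flow, delete-origin-volume-bits is ordered before update-db-install-path (MarkSnapshotAsVolume). Deleting the underlying data fails → trigger.fail → MarkSnapshotAsVolume never runs → the volume install path is not updated.
Fix
Backport 5.4.0 MR 8211 to 4.8.36:
- Move MarkSnapshotAsVolume (update-db-install-path) ahead of delete-origin-volume-bits
- On delete failure, no longer call trigger.fail; submit a PrimaryStorageDeleteBitGC job and continue with trigger.next()
Testing
Added the UndoSnapshotCreationDeleteBitsFailureCase integration test, which injects a delete-bits failure and then asserts that the undo still succeeds, the volume install path is updated, the flows run in the new order, and a PrimaryStorageDeleteBitGC job is submitted.
./runMavenProfile premium: full build passes; mvn test -Dtest=UndoSnapshotCreationDeleteBitsFailureCase → Tests run: 1, Failures: 0, Errors: 0.
Closes ZSTAC-76704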
sync from gitlab !9907