<fix>[ha]: defer skip-trace list cleanup on MN departure to prevent split-brain #3757
MatheMatrix wants to merge 3 commits into 4.8.37
…plit-brain When a management node departs, its VM skip-trace entries were immediately removed. If VMs were still being started by kvmagent, the next VM sync would falsely detect them as Stopped and trigger HA, causing split-brain. Fix: transfer departed MN skip-trace entries to an orphaned set with 10-minute TTL instead of immediate deletion. VMs in the orphaned set remain skip-traced until the TTL expires or they are explicitly continued, preventing false HA triggers during MN restart scenarios. Resolves: ZSTAC-80821 Change-Id: I3222e260b2d7b33dc43aba0431ce59a788566b34 Conflicts: plugin/kvm/src/main/java/org/zstack/kvm/KvmVmSyncPingTask.java
…anup Resolves: ZSTAC-80821 Change-Id: I59284c4e69f5d2ee357b1836b7c243200e30949a
Resolves: ZSTAC-80821 Change-Id: Ia9a9597feceb96b3e6e22259e2d0be7bde8ae499
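Walkthrough: The KVM plugin gains a global-config timeout parameter for orphaned VM skip entries, and the VM sync task now tracks orphaned skip entries with a TTL so that skip markers left behind by management nodes that have gone offline are still managed.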
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@plugin/kvm/src/main/java/org/zstack/kvm/KvmVmSyncPingTask.java`:
- Around line 494-504: The explicit "continue" path currently only removes
entries from vmsToSkip but not from orphanedSkipVms, so VMs explicitly continued
still remain skipped until TTL expires; update the VM_CONTINUE_TRACE_PATH
handler (the callback that clears vmsToSkip) to also remove the VM UUID from
orphanedSkipVms (use orphanedSkipVms.remove(vmUuid)) and ensure any code paths
that mark a VM as resumed/continued clear both vmsToSkip and orphanedSkipVms so
explicit continue immediately resumes tracing.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 6cae9136-1be4-4162-b06a-b5a4f2a5dd9b
📒 Files selected for processing (2)
plugin/kvm/src/main/java/org/zstack/kvm/KVMGlobalConfig.java
plugin/kvm/src/main/java/org/zstack/kvm/KvmVmSyncPingTask.java
// ZSTAC-80821: Also check orphaned skip entries from departed MN nodes
Long orphanedAt = orphanedSkipVms.get(vmUuid);
if (orphanedAt != null) {
    if (System.currentTimeMillis() - orphanedAt < getOrphanTtlMs()) {
        logger.debug(String.format("VM[uuid:%s] is in orphaned skip set, skipping trace", vmUuid));
        return true;
    } else {
        // Expired, clean up
        orphanedSkipVms.remove(vmUuid, orphanedAt);
        logger.info(String.format("orphaned skip entry for VM[uuid:%s] expired after %d minutes, resuming trace",
                vmUuid, getOrphanTtlMs() / 60000));
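The TTL check in this hunk can be exercised in isolation. The following is a minimal sketch, not the PR's code: `OrphanedSkipSketch`, its constructor parameters, and the injected clock are hypothetical stand-ins for `KvmVmSyncPingTask`, `getOrphanTtlMs()`, and `System.currentTimeMillis()`. It shows why the two-argument `remove(key, value)` is a reasonable choice here: an expired entry is deleted only if it still carries the timestamp that was just read, so an entry re-added concurrently with a newer timestamp is left alone.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

// Hypothetical stand-in for the orphaned-skip bookkeeping; not the actual KvmVmSyncPingTask.
final class OrphanedSkipSketch {
    private final ConcurrentHashMap<String, Long> orphanedSkipVms = new ConcurrentHashMap<>();
    private final long ttlMs;          // assumed to come from a global config in the real code
    private final LongSupplier clock;  // injected so the TTL can be tested without sleeping

    OrphanedSkipSketch(long ttlMs, LongSupplier clock) {
        this.ttlMs = ttlMs;
        this.clock = clock;
    }

    // Record the moment a VM's skip entry became orphaned.
    void orphan(String vmUuid) {
        orphanedSkipVms.put(vmUuid, clock.getAsLong());
    }

    // True while the orphaned entry is younger than the TTL; expired entries are dropped.
    boolean shouldSkip(String vmUuid) {
        Long orphanedAt = orphanedSkipVms.get(vmUuid);
        if (orphanedAt == null) {
            return false;
        }
        if (clock.getAsLong() - orphanedAt < ttlMs) {
            return true;
        }
        // Conditional remove: only deletes if the entry still maps to the timestamp read above.
        orphanedSkipVms.remove(vmUuid, orphanedAt);
        return false;
    }

    boolean isTracked(String vmUuid) {
        return orphanedSkipVms.containsKey(vmUuid);
    }
}
```

With a 10-minute TTL (600000 ms), a VM orphaned at time 0 is still skipped at 599999 ms and is traced again from 600000 ms onward, at which point its entry is also removed.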
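An explicit continue does not yet end the orphaned skip.
The exit condition for an orphaned entry is implemented here as "TTL expiry only", but the PR's stated goal also includes "explicitly continued". The current VM_CONTINUE_TRACE_PATH callback only cleans up vmsToSkip and never touches orphanedSkipVms, so a VM that has actually finished starting or resuming keeps being skipped until the TTL runs out, which delays HA further.
🔧 Suggested fix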
if (data1.getApiId() != null && vmApis.containsKey(data1.getManagementNodeId()) && vmApis.get(data1.getManagementNodeId()).contains(data1.getApiId())) {
    String vmUuid = vmApis.get(data1.getManagementNodeId()).remove(data1.getApiId());
    logger.info("Continuing tracing VM: " + vmUuid);
    vmsToSkip.get(data1.getManagementNodeId()).remove(vmUuid);
+   orphanedSkipVms.remove(vmUuid);
    return;
}
if (data1.getVmUuid() != null) {
    logger.info("Continuing tracing VM: " + data1.getVmUuid());
    vmsToSkip.get(data1.getManagementNodeId()).remove(data1.getVmUuid());
+   orphanedSkipVms.remove(data1.getVmUuid());
}
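The review finding can be illustrated with a small model of the two collections. This is a sketch under assumptions: `ContinueTraceSketch` and its method names are hypothetical, the TTL is left out for brevity, and only the field names `vmsToSkip` and `orphanedSkipVms` are taken from the PR. The null check on the per-node set is an extra precaution in this sketch, covering the case where the node's entries were already moved out of `vmsToSkip`.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical model of the skip bookkeeping; not the actual KvmVmSyncPingTask.
final class ContinueTraceSketch {
    // Management node UUID -> VM UUIDs whose state tracing is currently skipped.
    final Map<String, Set<String>> vmsToSkip = new ConcurrentHashMap<>();
    // VM UUID -> time (ms) at which its skip entry was orphaned.
    final Map<String, Long> orphanedSkipVms = new ConcurrentHashMap<>();

    void skip(String mnUuid, String vmUuid) {
        vmsToSkip.computeIfAbsent(mnUuid, k -> ConcurrentHashMap.newKeySet()).add(vmUuid);
    }

    // Explicit continue: clear the VM from BOTH collections so tracing resumes at once.
    void continueTrace(String mnUuid, String vmUuid) {
        Set<String> perNode = vmsToSkip.get(mnUuid);
        if (perNode != null) {
            perNode.remove(vmUuid);
        }
        orphanedSkipVms.remove(vmUuid);
    }

    boolean isSkipped(String vmUuid) {
        if (orphanedSkipVms.containsKey(vmUuid)) {
            return true;
        }
        return vmsToSkip.values().stream().anyMatch(s -> s.contains(vmUuid));
    }
}
```

If `continueTrace` removed the VM only from `vmsToSkip`, `isSkipped` would keep returning true for an orphaned VM, which is the extra HA delay the comment describes.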
When a management node departs, its VM skip-trace entries were
immediately removed. If VMs were still being started by kvmagent,
the next VM sync would falsely detect them as Stopped and trigger
HA, causing split-brain.
Fix: transfer departed MN skip-trace entries to an orphaned set with
10-minute TTL instead of immediate deletion. VMs in the orphaned set
remain skip-traced until the TTL expires or they are explicitly
continued, preventing false HA triggers during MN restart scenarios.
Resolves: ZSTAC-80821
Change-Id: I3222e260b2d7b33dc43aba0431ce59a788566b34
Conflicts:
plugin/kvm/src/main/java/org/zstack/kvm/KvmVmSyncPingTask.java
sync from gitlab !9626
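
The transfer step described above can be sketched as follows. This is an illustration only: `MnDepartureSketch` and `onManagementNodeLeft` are hypothetical names, and the real handler in `KvmVmSyncPingTask` may differ; only the idea of moving a departed node's entries into a timestamped orphaned map comes from the description.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical model of the departure handling; not the actual KvmVmSyncPingTask.
final class MnDepartureSketch {
    // Management node UUID -> VM UUIDs whose state tracing is currently skipped.
    final Map<String, Set<String>> vmsToSkip = new ConcurrentHashMap<>();
    // VM UUID -> time (ms) at which its skip entry was orphaned.
    final Map<String, Long> orphanedSkipVms = new ConcurrentHashMap<>();

    // Instead of discarding the departed node's entries, move them to the orphaned map
    // stamped with the departure time. Returns how many entries were moved.
    int onManagementNodeLeft(String mnUuid, long nowMs) {
        Set<String> left = vmsToSkip.remove(mnUuid);
        if (left == null) {
            return 0;
        }
        for (String vmUuid : left) {
            orphanedSkipVms.put(vmUuid, nowMs);
        }
        return left.size();
    }
}
```

Stamping each entry with the departure time lets a later sync decide lazily whether the entry has outlived its TTL, so no separate cleanup timer is needed in this sketch.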