Handle non-OvsFetchInterfaceAnswer in OVS tunnel manager#12860

Open
dheeraj12347 wants to merge 1 commit into apache:4.22 from dheeraj12347:4.22

@dheeraj12347 dheeraj12347 commented Mar 19, 2026

When starting a GRE isolated network on XCP-ng, the management server can fail with a ClassCastException like UnsupportedAnswer cannot be cast to OvsFetchInterfaceAnswer in OvsTunnelManagerImpl.handleFetchInterfaceAnswer. This happens when the agent returns an UnsupportedAnswer (or another Answer type) for OvsFetchInterfaceCommand, but the code unconditionally casts the first Answer to OvsFetchInterfaceAnswer.

Root cause

handleFetchInterfaceAnswer assumed answers[0] is always an OvsFetchInterfaceAnswer and directly cast it without checking for null, array length, or actual runtime type. When the hypervisor agent responds with a different Answer implementation (e.g. UnsupportedAnswer), this results in a ClassCastException and the GRE tunnel setup fails.
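
The failure mode described above can be reproduced in isolation. This is a minimal sketch, not the CloudStack code: the nested `Answer`, `UnsupportedAnswer`, and `OvsFetchInterfaceAnswer` classes are hypothetical stand-ins with the same names as the real types, and `uncheckedIp` is an illustrative helper that only mirrors the unconditional cast.

```java
public class UncheckedCastSketch {
    // Hypothetical stand-ins for the agent answer types; not the real CloudStack classes.
    static class Answer {}

    static class UnsupportedAnswer extends Answer {}

    static class OvsFetchInterfaceAnswer extends Answer {
        String getIp() {
            return "10.0.0.5"; // placeholder value for illustration only
        }
    }

    // Mirrors the assumption in the root cause: answers[0] is cast with no type check.
    static String uncheckedIp(Answer[] answers) {
        OvsFetchInterfaceAnswer ans = (OvsFetchInterfaceAnswer) answers[0];
        return ans.getIp();
    }

    public static void main(String[] args) {
        try {
            uncheckedIp(new Answer[] { new UnsupportedAnswer() });
        } catch (ClassCastException e) {
            // The cast fails at runtime because the element is not an OvsFetchInterfaceAnswer.
            System.out.println("ClassCastException: " + e.getMessage());
        }
    }
}
```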

Solution

Add null and length checks for the Answer[] to handle missing or empty responses.

Check answers[0] with instanceof OvsFetchInterfaceAnswer before casting.

If the answer is not an OvsFetchInterfaceAnswer, log a clear warning with the actual type and details, and return null instead of throwing a ClassCastException.

Preserve the existing success path when a successful OvsFetchInterfaceAnswer with a non-empty IP address is returned.
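
The checks listed above could look roughly like the following. This is a self-contained sketch under stated assumptions, not the actual patch: the nested answer classes are hypothetical stubs, logging uses `java.util.logging` rather than CloudStack's logger, and the real method also creates an interface record and takes a host ID, which is omitted here so that only the guard logic is shown.

```java
import java.util.logging.Logger;

public class GuardedFetchSketch {
    private static final Logger LOG = Logger.getLogger(GuardedFetchSketch.class.getName());

    // Hypothetical stand-ins for the agent answer types; not the real CloudStack classes.
    static class Answer {
        private final boolean result;
        private final String details;

        Answer(boolean result, String details) {
            this.result = result;
            this.details = details;
        }

        boolean getResult() { return result; }
        String getDetails() { return details; }
    }

    static class UnsupportedAnswer extends Answer {
        UnsupportedAnswer(String details) { super(false, details); }
    }

    static class OvsFetchInterfaceAnswer extends Answer {
        private final String ip;

        OvsFetchInterfaceAnswer(boolean result, String details, String ip) {
            super(result, details);
            this.ip = ip;
        }

        String getIp() { return ip; }
    }

    // Simplified: returns the IP directly instead of creating an interface record.
    static String handleFetchInterfaceAnswer(Answer[] answers) {
        // Missing or empty response: nothing to cast.
        if (answers == null || answers.length == 0 || answers[0] == null) {
            LOG.warning("No answer returned for OvsFetchInterfaceCommand");
            return null;
        }
        Answer first = answers[0];
        // Unexpected runtime type: warn with the actual type and details, then return null.
        if (!(first instanceof OvsFetchInterfaceAnswer)) {
            LOG.warning("Unexpected answer type " + first.getClass().getSimpleName()
                    + ": " + first.getDetails());
            return null;
        }
        OvsFetchInterfaceAnswer ans = (OvsFetchInterfaceAnswer) first;
        // Existing success path: successful answer with a non-empty IP address.
        if (ans.getResult() && ans.getIp() != null && !ans.getIp().isEmpty()) {
            return ans.getIp();
        }
        return null;
    }

    public static void main(String[] args) {
        Answer[] unsupported = { new UnsupportedAnswer("command not supported") };
        Answer[] ok = { new OvsFetchInterfaceAnswer(true, "ok", "10.0.0.5") };
        System.out.println(handleFetchInterfaceAnswer(unsupported)); // prints "null"
        System.out.println(handleFetchInterfaceAnswer(ok));          // prints "10.0.0.5"
    }
}
```

Returning null keeps the caller's existing "no IP available" handling in play; it avoids the exception but does not by itself make the agent return the expected answer type.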

Testing

Local build on 4.22:

bash
mvn -pl api,server -am -DskipTests clean install

Addresses #12815

@weizhouapache
Member

@dheeraj12347
I believe this PR handles the error.
However, we should find out what caused the issue and fix the root cause. Otherwise, the feature is still non-functional.

@weizhouapache
Member

@dheeraj12347
Are you able to reproduce the issue and verify your fix?

codecov bot commented Mar 19, 2026

Codecov Report

❌ Patch coverage is 0% with 13 lines in your changes missing coverage. Please review.
✅ Project coverage is 17.61%. Comparing base (27bce46) to head (8486a89).
⚠️ Report is 4 commits behind head on 4.22.

Files with missing lines | Patch % | Lines
...va/com/cloud/network/ovs/OvsTunnelManagerImpl.java | 0.00% | 13 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               4.22   #12860      +/-   ##
============================================
- Coverage     17.61%   17.61%   -0.01%     
+ Complexity    15662    15661       -1     
============================================
  Files          5917     5917              
  Lines        531415   531438      +23     
  Branches      64973    64974       +1     
============================================
+ Hits          93588    93589       +1     
- Misses       427271   427293      +22     
  Partials      10556    10556              
Flag Coverage Δ
uitests 3.70% <ø> (ø)
unittests 18.68% <0.00%> (-0.01%) ⬇️

@DaanHoogland DaanHoogland added this to the 4.22.1 milestone Mar 19, 2026

Copilot AI (Contributor) left a comment

Pull request overview

This PR hardens the OVS tunnel manager’s handling of agent responses for OvsFetchInterfaceCommand to prevent management server failures (e.g., ClassCastException) when the agent returns a non-OvsFetchInterfaceAnswer (such as UnsupportedAnswer) during GRE isolated network setup on XCP-ng.

Changes:

  • Add null/empty checks for the Answer[] returned by the agent.
  • Guard the cast to OvsFetchInterfaceAnswer with an instanceof check and log a clear warning when the returned answer type is unexpected.
  • Minor cleanup of the IP string empty-check (isEmpty()).

OvsTunnelInterfaceVO ti = createInterfaceRecord(ans.getIp(),
        ans.getNetmask(), ans.getMac(), hostId, ans.getLabel());
return ti.getIp();
@dheeraj12347
Author

> @dheeraj12347 are you able to reproduce the issue and verify your fix?

I don’t currently have a full CloudStack 4.22 + XCP‑ng 8.3 + OVS/GRE environment to reproduce this end‑to‑end. If you think it’s important that I verify it myself, I can try to set up such a lab, but it may take some time given my hardware/resources.

@weizhouapache
Member

@dheeraj12347

Normally, PR authors test their changes before requesting peer review. However, I understand that in some cases it may not be possible to reproduce the issue due to differences in hardware or environment, or because the issue itself is difficult to reproduce.

That said, I don’t think it’s necessary to acquire new hardware to reproduce and verify the fix in this case. It would be helpful if @UAnton could test the changes.

However, as mentioned in my previous comment, I don’t think this PR will fix issue #12815. We need to understand why the expected OvsFetchInterfaceAnswer is not being returned, and why an UnsupportedAnswer is returned instead.

@DaanHoogland DaanHoogland linked an issue Mar 20, 2026 that may be closed by this pull request

UAnton commented Mar 20, 2026

@dheeraj12347 @DaanHoogland @weizhouapache
Hi,
If you show me what I should do, then I will test it, of course...
P.S. I figured it out! I'm building deb packages.

Development

Successfully merging this pull request may close these issues.

GRE isolation + XCP-NG