@daviftorres
Contributor

Description

This trivial change provides more detail to root administrators during troubleshooting.

See discussion #11980

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)
  • Build/CI
  • Test (unit or integration test code)

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • Major
  • Minor

Bug Severity

  • BLOCKER
  • Critical
  • Major
  • Minor
  • Trivial

Screenshots (if appropriate):

image

How Has This Been Tested?

How did you try to break this feature and the system with this change?

Contributor

@DaanHoogland DaanHoogland left a comment

clgtm

@codecov

codecov bot commented Nov 6, 2025

Codecov Report

❌ Patch coverage is 0% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 17.46%. Comparing base (f570e16) to head (8807fd9).
⚠️ Report is 188 commits behind head on main.

Files with missing lines | Patch % | Lines
...n/java/com/cloud/vm/VirtualMachineManagerImpl.java | 0.00% | 5 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #11981      +/-   ##
============================================
- Coverage     17.46%   17.46%   -0.01%     
+ Complexity    15516    15514       -2     
============================================
  Files          5913     5913              
  Lines        529385   529389       +4     
  Branches      64679    64680       +1     
============================================
  Hits          92450    92450              
- Misses       426517   426521       +4     
  Partials      10418    10418              
Flag | Coverage Δ
uitests | 3.58% <ø> (ø)
unittests | 18.52% <0.00%> (-0.01%) ⬇️

@DaanHoogland DaanHoogland self-requested a review November 6, 2025 16:12
@DaanHoogland
Contributor

Not sure if this makes sense, but the types break, as a Throwable is expected.

Contributor

@DaanHoogland DaanHoogland left a comment

clgtm

Contributor

@nvazquez nvazquez left a comment

Thanks @daviftorres - just a comment from my review

@DaanHoogland
Contributor

@blueorangutan package

@blueorangutan

@DaanHoogland a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan

Packaging result [SF]: ✖️ el8 ✖️ el9 ✖️ debian ✖️ suse15. SL-JID 15819

@DaanHoogland
Contributor

Sorry if I misled you, @daviftorres, but callingUser needs to be defined first. It is not currently available in the method :/ This might not be possible with the available context, hence the “needs testing” note.

@nvazquez, any idea how to implement your requirement from #11981 (comment) in this asynchronous context?

@nvazquez
Contributor

nvazquez commented Nov 24, 2025

Hi @daviftorres, I have pushed a fix to your branch/PR to address the comment. I have also reverted the 'Guest Network' text to just 'Network', since the VM being started may also be a system VM or a VR. cc @DaanHoogland

@blueorangutan package

@blueorangutan

@nvazquez a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 15820

@nvazquez
Contributor

@blueorangutan test

@blueorangutan

@nvazquez a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests

@blueorangutan

[SF] Trillian test result (tid-14861)
Environment: kvm-ol8 (x2), zone: Advanced Networking with Mgmt server ol8
Total time taken: 48530 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr11981-t14861-kvm-ol8.zip
Smoke tests completed. 150 look OK, 0 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test | Result | Time (s) | Test File

daviftorres and others added 4 commits December 15, 2025 09:07
This trivial change provides more detail to root administrators during troubleshooting.

See discussion apache#11980
…ManagerImpl.java

Co-authored-by: dahn <daan.hoogland@gmail.com>
Contributor

@nvazquez nvazquez left a comment

LGTM

@RosiKyu
Collaborator

RosiKyu commented Jan 27, 2026

@blueorangutan package

@blueorangutan

@RosiKyu a [SL] Jenkins job has been kicked to build packages. It will be bundled with no SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 16560

@RosiKyu
Collaborator

RosiKyu commented Jan 28, 2026

Sorry, verification failed. Details below.

The PR improves error messaging by:

  • Showing detailed failure info (e.getMessage()) for admin users

  • Showing generic "Please contact administrator" for regular users

  • Including VM UUID in the error message

Code Change

// Before
throw new CloudRuntimeException("Network is unavailable. Please contact administrator", e).add(VirtualMachine.class, vmUuid);

// After
Account callingAccount = CallContext.current().getCallingAccount();
String errorSuffix = (callingAccount != null && callingAccount.getType() == Account.Type.ADMIN) ?
    String.format("Failure: %s", e.getMessage()) :
    "Please contact administrator.";
throw new CloudRuntimeException(String.format("The Network for VM %s is unavailable. %s", vmUuid, errorSuffix), e).add(VirtualMachine.class, vmUuid);
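
To try the role-dependent message outside of CloudStack, a minimal sketch follows. It is not the PR's code: AccountType, buildMessage, and the sample UUID and cause strings are hypothetical stand-ins (AccountType loosely mirrors Account.Type), and only the two message formats are taken from the snippet above.

```java
// Standalone sketch; these are simplified, hypothetical types, not CloudStack classes.
final class NetworkErrorMessageSketch {

    // Hypothetical stand-in for Account.Type; only the values needed here.
    enum AccountType { ADMIN, NORMAL }

    // Builds the message text. callerType may be null when no calling account is known,
    // in which case the generic suffix is used, as in the null check of the snippet above.
    static String buildMessage(AccountType callerType, String vmUuid, String cause) {
        String errorSuffix = (callerType == AccountType.ADMIN)
                ? String.format("Failure: %s", cause)
                : "Please contact administrator.";
        return String.format("The Network for VM %s is unavailable. %s", vmUuid, errorSuffix);
    }

    public static void main(String[] args) {
        // Sample values are made up for illustration.
        String vmUuid = "vm-1234";
        String cause = "router version mismatch";
        System.out.println(buildMessage(AccountType.ADMIN, vmUuid, cause));
        System.out.println(buildMessage(AccountType.NORMAL, vmUuid, cause));
        System.out.println(buildMessage(null, vmUuid, cause));
    }
}
```

Only the admin call includes the underlying cause; the regular-user and null-account calls fall back to the generic suffix.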

Test Scenarios Attempted

Scenario 1: Stop VR and Deploy VM
Steps:

  1. Created isolated network test-network
  2. Deployed VM to trigger VR creation
  3. Stopped VR via CloudMonkey: stop router id=<uuid>
  4. Attempted to deploy another VM

Result: Could not reproduce
CloudStack automatically started the VR when deploying a new VM.


Scenario 2: Destroy VR at Hypervisor Level
Steps:

  1. Destroyed VR using virsh destroy r-4-VM on KVM host
  2. Attempted to deploy a new VM

Result: Code path not triggered
Error received:

Unable to start a VM [282c7323-bc57-4925-a80f-2d822346c3e0] due to [Unable to create a deployment for VM instance {...}]

Log analysis:

ResourceUnavailableException: Resource [DataCenter:1] is unreachable: Unable to apply dhcp entry on router

The exception scope was DataCenter, not VirtualRouter.class. The PR's condition e.getScope().equals(VirtualRouter.class) was not satisfied.
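
The scope mismatch can be illustrated with a small sketch. Everything here is a hypothetical stand-in: the marker interfaces, the ResourceUnavailable class, and takesRouterBranch are not CloudStack types, and the added null check is an assumption. Only the e.getScope().equals(VirtualRouter.class) comparison comes from the PR as described above.

```java
// Standalone sketch of the scope comparison; hypothetical types, not CloudStack classes.
final class ScopeCheckSketch {

    // Hypothetical marker types standing in for CloudStack's DataCenter and VirtualRouter.
    interface DataCenter {}
    interface VirtualRouter {}

    // Hypothetical stand-in for ResourceUnavailableException: an exception carrying a scope class.
    static final class ResourceUnavailable extends Exception {
        private final Class<?> scope;

        ResourceUnavailable(String message, Class<?> scope) {
            super(message);
            this.scope = scope;
        }

        Class<?> getScope() {
            return scope;
        }
    }

    // True only when the scope is exactly VirtualRouter.class, the condition the PR's branch needs.
    // The null guard is an assumption added for this sketch.
    static boolean takesRouterBranch(ResourceUnavailable e) {
        return e.getScope() != null && e.getScope().equals(VirtualRouter.class);
    }

    public static void main(String[] args) {
        ResourceUnavailable fromScenario2 =
                new ResourceUnavailable("Unable to apply dhcp entry on router", DataCenter.class);
        ResourceUnavailable routerScoped =
                new ResourceUnavailable("router unavailable", VirtualRouter.class);
        System.out.println(takesRouterBranch(fromScenario2)); // false: DataCenter scope
        System.out.println(takesRouterBranch(routerScoped));  // true: VirtualRouter scope
    }
}
```

An exception scoped to DataCenter, like the one logged in Scenario 2, never reaches the modified message, which matches the observed behavior.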


Scenario 3: Simulate VR Version Mismatch via DB
Steps:

  1. Updated database: UPDATE domain_router SET update_state = 'UPDATE_NEEDED' WHERE id = 4;
  2. Updated database: UPDATE domain_router SET software_version = '4.19.0.0' WHERE id = 4;
  3. Stopped VR
  4. Attempted to deploy a new VM

Result: Could not reproduce
CloudStack deployed the VM successfully, ignoring the UPDATE_NEEDED state.

Note: Global setting router.version.check = true was already enabled.


Observations

  1. Auto-start behavior: CloudStack automatically starts a stopped VR when a VM deployment requires it.

  2. Exception scope mismatch: The ResourceUnavailableException we triggered had scope DataCenter, not VirtualRouter.class. The PR's code path requires VirtualRouter.class scope.

  3. Version check bypass: Setting update_state = 'UPDATE_NEEDED' in the database did not prevent VM deployments or trigger the expected error path.

  4. Original issue context: The original issue (Improve message "Network is unavailable. Please contact administrator" #11980) describes a scenario with actual VR version mismatch (VR on 4.20.1, MS on 4.20.2) during a live upgrade. This specific condition could not be simulated in the test environment.

Conclusion

Status: Unable to reproduce the exact error path

The specific code path modified by PR #11981 requires a ResourceUnavailableException with scope VirtualRouter.class. Despite multiple attempts to simulate VR unavailability:

  • Stopping the VR
  • Destroying the VR at hypervisor level
  • Manipulating database version/state fields

I could not trigger the exact condition where this error message would be displayed.

Collaborator

@RosiKyu RosiKyu left a comment

LGTM

@borisstoyanov borisstoyanov merged commit ded975c into apache:main Jan 28, 2026
27 of 28 checks passed