-
Notifications
You must be signed in to change notification settings - Fork 162
Support in-house test igvm agent for VMM HyeprV tests #2519
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Ming-Wei Shih <mishih@microsoft.com>
Signed-off-by: Ming-Wei Shih <mishih@microsoft.com>
|
This PR modifies files containing For more on why we check whole files, instead of just diffs, check out the Rustonomicon |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR refactors the test IGVM agent code from the guest_emulation_device into a standalone library (test_igvm_agent_lib) and introduces a Windows RPC server (test_igvm_agent_rpc_server) to enable attestation flow testing in VMM Hyper-V tests. The RPC server implements the IGVM Agent RPC interface using Windows MIDL-generated stubs.
Key Changes:
- Extracted test IGVM agent logic into reusable
test_igvm_agent_libcrate - Created Windows RPC server for host-side attestation testing
- Updated VMM tests to spawn and manage the RPC server during CVM TPM tests
Reviewed changes
Copilot reviewed 24 out of 25 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
| vmm_tests/vmm_tests/tests/tests/x86_64/tpm.rs | Adds RPC server spawning and lifecycle management to CVM TPM tests |
| vmm_tests/petri_artifacts_vmm_test/src/lib.rs | Declares new test_igvm_agent_rpc_server artifact |
| vmm_tests/petri_artifact_resolver_openvmm_known_paths/src/lib.rs | Adds path resolution for RPC server executable |
| vm/devices/get/test_igvm_agent_rpc_server/src/rpc/server.rs | Implements Windows RPC server with console signal handling |
| vm/devices/get/test_igvm_agent_rpc_server/src/rpc/mod.rs | Module declaration for RPC server components |
| vm/devices/get/test_igvm_agent_rpc_server/src/rpc/igvm_agent.rs | Singleton facade exposing test agent through global instance |
| vm/devices/get/test_igvm_agent_rpc_server/src/rpc/handlers.rs | RPC method handlers and MIDL memory allocators |
| vm/devices/get/test_igvm_agent_rpc_server/src/main.rs | Entry point for standalone RPC server executable |
| vm/devices/get/test_igvm_agent_rpc_server/idl/IGVmAgentRpcApi.idl | MIDL interface definition for IGVM Agent RPC API |
| vm/devices/get/test_igvm_agent_rpc_server/build.rs | Build script for MIDL compilation and cross-compilation support |
| vm/devices/get/test_igvm_agent_rpc_server/Cargo.toml | Dependencies and build configuration for RPC server |
| vm/devices/get/test_igvm_agent_lib/src/test_crypto.rs | Minor import formatting change |
| vm/devices/get/test_igvm_agent_lib/src/lib.rs | Refactored library with public APIs and updated synchronization |
| vm/devices/get/guest_emulation_device/src/lib.rs | Removes inline agent code, now uses test_igvm_agent_lib |
| vm/devices/get/guest_emulation_device/Cargo.toml | Removes test_igvm_agent feature, adds lib dependency |
| openvmm/openvmm_resources/Cargo.toml | Removes test_igvm_agent feature flag |
| openhcl/underhill_attestation/Cargo.toml | Removes test_igvm_agent feature flag |
| flowey/flowey_lib_hvlite/src/lib.rs | Exports new build module |
| flowey/flowey_lib_hvlite/src/init_vmm_tests_env.rs | Registers RPC server in test environment |
| flowey/flowey_lib_hvlite/src/build_test_igvm_agent_rpc_server.rs | Build flow node for RPC server with static CRT linking |
| flowey/flowey_lib_hvlite/src/_jobs/local_build_and_run_nextest_vmm_tests.rs | Integrates RPC server into local test builds |
| flowey/flowey_lib_hvlite/src/_jobs/consume_and_test_nextest_vmm_tests_archive.rs | Adds RPC server to artifact dependencies |
| flowey/flowey_hvlite/src/pipelines/checkin_gates.rs | Publishes RPC server artifacts in CI pipeline |
| Cargo.toml | Adds workspace dependencies for new crates |
| Cargo.lock | Updates lock file with new dependencies |
Comments suppressed due to low confidence (4)
vm/devices/get/test_igvm_agent_lib/src/test_crypto.rs:16
- The import statement was split across two lines. While this is valid, the original single-line import
use sha2::digest::consts::{U20, U64};was more concise and easier to read. The split doesn't improve readability and adds an extra line without clear benefit.
vm/devices/get/test_igvm_agent_lib/src/lib.rs:108 - The new
plan_installedfield is added to track whether a plan has been installed, but the existing code had aOncesynchronization primitive (line removed) that ensured thread-safe one-time initialization across all instances. The new approach only prevents multiple installations per instance, not globally. If multipleTestIgvmAgentinstances are created and share state through the singleton pattern intest_igvm_agent_rpc_server/src/rpc/igvm_agent.rs, this could lead to race conditions or unexpected behavior. Consider whether thread-safety across instances is still required.
vm/devices/get/test_igvm_agent_lib/src/lib.rs:217 - The doc comment "Request handler." is too minimal for a public API function. It should describe what the function does, what the input bytes represent, what the return tuple contains (payload and expected length), and what errors can occur.
vm/devices/get/test_igvm_agent_lib/src/lib.rs:451 - The
initialize_keys,generate_mock_wrapped_key_response,generate_mock_key_release_response, andgenerate_jwt_with_rsa_keyfunctions were changed frompub(crate)tofn(private). This is correct since they are internal implementation details. However, this is inconsistent with the change description which says the code was "refactored...into a separate crate" - typically when refactoring into a library crate, you'd expect more items to become public to support different consumers, not fewer. Verify that all necessary APIs are exposed for the RPC server to function correctly.
Signed-off-by: Ming-Wei Shih <mishih@microsoft.com>
Signed-off-by: Ming-Wei Shih <mishih@microsoft.com>
Signed-off-by: Ming-Wei Shih <mishih@microsoft.com>
Signed-off-by: Ming-Wei Shih <mishih@microsoft.com>
chris-oo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
still need to review the rest of the flowey changes, but see below
Signed-off-by: Ming-Wei Shih <mishih@microsoft.com>
chris-oo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes otherwise LGTM but I've asked Justus to help review the flowey side in case I've missed something.
Do you know if this works with cross compilation from wsl, or no idea for the igvm agent? The build script is pretty involved it looks like.
justus-camp-microsoft
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM mostly, have a couple questions that I'm ok with opening issues for/discussion as to why not to do them.
| // Give the server a moment to start up and bind to the RPC endpoint. | ||
| // The server closes stdout when it's ready, but since we redirected | ||
| // stdout to a file, we can't detect that. Instead, we poll briefly. | ||
| std::thread::sleep(std::time::Duration::from_millis(500)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be better to add an endpoint to the server that we can poll and will indicate that it's ready? That would also possibly simplify paths where the server auto-starts etc. because you can just ping the endpoint and see if it's running or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did consider this option. But given that the PR uses the same idl file as the one in production, my preference is avoid modifying it.
Signed-off-by: Ming-Wei Shih <mishih@microsoft.com>
Yes, it works with cross-compilation from wsl. |
|
@chris-oo @justus-camp-microsoft can you take another look? |
justus-camp-microsoft
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have some "flowey best practices" comments. The gist of them is that you should check ctx.platform(), bail if nodes are running on an unsupported platform, and not request nodes on platforms that shouldn't run them or are unsupported.
| after_tests.claim(ctx); | ||
| done.claim(ctx); | ||
| move |_rt| { | ||
| #[cfg(not(windows))] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can check ctx.platform() to check what platform flowey is running on and branch based on that. Also, like i said in the other comment let's anyhow::bail if it tries to run on an unsupported platform rather than no-oping the node so that it enforces the node not being requested if it's an unsupported platform.
| if matches!( | ||
| target.operating_system, | ||
| target_lexicon::OperatingSystem::Windows | ||
| ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe that cross-compiling from WSL with a Windows target would lead to the run_test_igvm_agent_rpc_server request being sent, which would then be no-op'd by the node. Typically with flowey we would prefer not to no-op inside of nodes and instead not send the request at all to avoid emitting steps that don't do anything.
The better thing to do here is check ctx.platform() and only make the request if the node is executing on Windows, which removes the need to no-op inside the node and won't emit a request unless the platform is Windows.
| let rpc_server_stopped = if matches!( | ||
| target.operating_system, | ||
| target_lexicon::OperatingSystem::Windows | ||
| ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment about no-oping here as other places.
| if matches!( | ||
| target_triple.operating_system, | ||
| target_lexicon::OperatingSystem::Windows | ||
| ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment about checking ctx.platform()
| let rpc_server_stopped = if matches!( | ||
| target_triple.operating_system, | ||
| target_lexicon::OperatingSystem::Windows | ||
| ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ctx.platform()
| // Only run on Windows - the RPC server is Windows-only | ||
| #[cfg(not(windows))] | ||
| { | ||
| let _ = env; | ||
| log::info!("test_igvm_agent_rpc_server is Windows-only, skipping"); | ||
| Ok(()) | ||
| } | ||
|
|
||
| #[cfg(windows)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check ctx.platform() instead, and let's bail if it's not running on Windows instead of no-oping the node.
This PR refactors the test igvm agent code from the test GED into a separate crate and introduce Windows-style RPC server that is built on top of it. The RPC server allows us to test the attestation flow in the VMM HyperV tests.