swapped vm sometimes fails to start because PCIe device is still busy #1

@hyd3rs

Description

There is a race condition in the swapper request worker: the new QM's startup command can be issued before the old QM's shutdown sequence has finished. The PCIe device is then still held by the old QM, so the new QM fails to start. This is rare, but it can happen in some fringe cases (e.g. when swapper is hosted on an external machine and network latency throws the sequence of events out of order).

Potential mitigations/fixes

  • A quick-fix mitigation would be to add an arbitrary delay between the completion of shutdown and the start of the new QM. This only narrows the race window; it does not close it.
  • A "proper" (more robust) fix would be to add failsafes/retries to a QM's startup sequence. This could work much like the way swapper already detects the full shutdown of a QM: swapper would poll VE every few seconds to verify that the machine has started up, and retry the start if it has not.
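
A minimal sketch of the retry approach, just to make the idea concrete. Everything named here is hypothetical and not taken from swapper's code: `start_with_retry`, the `StartFailed` exception, and the `start` / `is_running` callables stand in for whatever swapper actually uses to issue the start command and to poll VE. The attempt count and poll interval are arbitrary placeholders.

```python
import time
from typing import Callable


class StartFailed(Exception):
    """Hypothetical error raised when the start command is rejected
    (for example because the PCIe device is still busy)."""


def start_with_retry(
    start: Callable[[], None],
    is_running: Callable[[], bool],
    *,
    attempts: int = 5,
    polls_per_attempt: int = 5,
    poll_interval: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Try to start a QM, then poll until it reports as running.

    `start` and `is_running` are placeholders for swapper's real calls.
    Returns True once the machine is confirmed running, False if every
    attempt is exhausted.
    """
    for _ in range(attempts):
        try:
            start()
        except StartFailed:
            # The old QM may still be shutting down; wait and try again.
            sleep(poll_interval)
            continue

        # The start command was accepted; poll until the QM is actually up.
        for _ in range(polls_per_attempt):
            if is_running():
                return True
            sleep(poll_interval)
        # Not confirmed running within this attempt: fall through and
        # re-issue the start command on the next iteration.

    return False
```

Passing `sleep` as a parameter keeps the function testable without real delays. If re-issuing the start command against a QM that is still booting is itself an error in practice, the inner polling loop would need a longer budget instead of falling through to another attempt.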

Labels: bug