Open
Labels
bug (Something isn't working)
Description
A race condition is present in the swapper request worker: the new QM's startup command can arrive before the old QM's shutdown sequence has completed, so the new QM fails to start. This is rare, but it can occur in some fringe cases (e.g. when swapper is hosted on an external machine and network latency puts the sequence of events out of order).
Potential mitigations/fixes
- A quick-fix mitigation would be to add a fixed delay between the completion of shutdown and the start of the new QM's startup.
- A "proper" (more robust) fix would be to add failsafes/retries to a QM's startup sequence. This could be done in a similar manner to how swapper detects the full shutdown of a QM: swapper would poll VE every few seconds to verify that the machine has started up.
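
A rough sketch of the retry/poll idea from the second bullet, in Python. This is not swapper's actual code: `start_qm` and `is_running` are hypothetical callables standing in for "send the startup command" and "ask VE whether the machine is up", and the attempt counts and interval are placeholder values.

```python
import time
from typing import Callable


def start_with_retries(
    start_qm: Callable[[], None],
    is_running: Callable[[], bool],
    attempts: int = 3,
    polls_per_attempt: int = 6,
    poll_interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Send the start command, then poll until the QM reports as running.

    Returns True as soon as is_running() succeeds, or False once every
    attempt has been used up without the QM coming up.
    """
    for _attempt in range(attempts):
        # Hypothetical: issue the startup command for the new QM.
        start_qm()
        for _poll in range(polls_per_attempt):
            # Hypothetical: query VE for the machine's status.
            if is_running():
                return True
            sleep(poll_interval)
        # The QM never came up during this attempt, so the start command
        # probably lost the race with the old QM's shutdown; send it again.
    return False
```

With this shape, a start command that loses the race is simply re-sent after the polling window expires, instead of relying on a fixed delay being long enough. The `sleep` parameter is only there so the wait can be swapped out in tests.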