ToilWESServerWorkflowTest::test_run_and_cancel_workflows is flaky #5448

@adamnovak

In the CI run for the 9.2.0 release, src/toil/test/server/serverTest.py::ToilWESServerWorkflowTest::test_run_and_cancel_workflows failed because the workflow did not actually cancel before the timeout, which the test is not supposed to have to wait for.

This might be just starvation somehow, or there might be a real problem with cancellation, at least in some cases.

I noticed that the workflow reaches CANCELED and then, right after that, COMPLETE:

[2026-02-03T16:40:43+0000] [WSGI_0] [I] [toil.server.wes.tasks] Stopping process for task run-054d4c9aef76456daa79e328d265b594
[2026-02-03T16:40:44+0000] [MainThread] [I] [toil.test.server.serverTest] Waiting on workflow in state CANCELING
[2026-02-03T16:40:46+0000] [MainThread] [I] [toil.test.server.serverTest] Waiting on workflow in state CANCELING
[2026-02-03T16:40:48+0000] [MainThread] [I] [toil.test.server.serverTest] Waiting on workflow in state CANCELING
[2026-02-03T16:40:50+0000] [MainThread] [I] [toil.test.server.serverTest] Waiting on workflow in state CANCELING
[2026-02-03T16:40:52+0000] [MainThread] [I] [toil.test.server.serverTest] Waiting on workflow in state CANCELING
[2026-02-03T16:40:55+0000] [MainThread] [I] [toil.test.server.serverTest] Waiting on workflow in state CANCELING
[2026-02-03T16:40:57+0000] [MainThread] [I] [toil.test.server.serverTest] Waiting on workflow in state CANCELING
[2026-02-03T16:40:59+0000] [MainThread] [I] [toil.test.server.serverTest] Waiting on workflow in state CANCELING
[2026-02-03T16:41:01+0000] [MainThread] [I] [toil.test.server.serverTest] Waiting on workflow in state CANCELING
[2026-02-03T16:41:03+0000] [MainThread] [I] [toil.test.server.serverTest] Waiting on workflow in state CANCELING
[2026-02-03T16:41:05+0000] [MainThread] [I] [toil.test.server.serverTest] Waiting on workflow in state CANCELING
[2026-02-03T16:41:07+0000] [MainThread] [I] [toil.test.server.serverTest] Waiting on workflow in state CANCELING
[2026-02-03T16:41:09+0000] [MainThread] [I] [toil.test.server.serverTest] Waiting on workflow in state CANCELING
[2026-02-03T16:41:11+0000] [MainThread] [I] [toil.test.server.serverTest] Waiting on workflow in state CANCELING
[2026-02-03T16:41:13+0000] [MainThread] [I] [toil.test.server.serverTest] Waiting on workflow in state CANCELING
[2026-02-03T16:41:15+0000] [MainThread] [I] [toil.test.server.serverTest] Workflow reached state CANCELED
[2026-02-03T16:41:15+0000] [MainThread] [I] [toil.test.server.serverTest] Workflow reached state COMPLETE

That seems suspicious.
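
To make that kind of transition easy to catch in a test, here is a minimal sketch of a check over the sequence of states a poller observed. It is not Toil code: the function name `find_post_terminal_transitions` is hypothetical, and treating COMPLETE, EXECUTOR_ERROR, SYSTEM_ERROR, and CANCELED as the terminal WES states is an assumption on my part.

```python
from typing import Iterable, List, Optional, Tuple

# Assumption: these are the WES run states that should never be left once reached.
TERMINAL_STATES = {"COMPLETE", "EXECUTOR_ERROR", "SYSTEM_ERROR", "CANCELED"}


def find_post_terminal_transitions(states: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (terminal_state, next_state) pairs where a run left a terminal state.

    `states` is the ordered list of states observed while polling a run.
    Hypothetical helper for illustration; not part of toil.server.
    """
    violations: List[Tuple[str, str]] = []
    previous: Optional[str] = None
    for state in states:
        # A terminal state followed by a different state is suspicious.
        if previous in TERMINAL_STATES and state != previous:
            violations.append((previous, state))
        previous = state
    return violations


# The sequence from the log above, with the repeated CANCELING polls shortened.
observed = ["CANCELING", "CANCELING", "CANCELED", "COMPLETE"]
print(find_post_terminal_transitions(observed))
```

If the test recorded every state it saw while waiting, it could assert that this list is empty, which would flag the CANCELED to COMPLETE flip directly instead of only showing up as a timeout.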

┆Issue is synchronized with this Jira Story
┆Issue Number: TOIL-1805
