Description
The tests in test_mean_estimator_boosting.py are flaky: changes to the file or its supporting code can trigger non-deterministic failures. Changes that have triggered them include seeds, nufft libs, basis, etc.
Several tests show similar flakiness: test_boost_flag, test_mse, and test_weighted_volumes.
test_weighted_volumes seems to fail the most online but the least when I try to reproduce it manually. I think all three share the same underlying issue.
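For manual reproduction, one option is to sweep over many seeds and record which ones fail. This is only a rough sketch: `failing_seeds` and `toy_check` are hypothetical names, and `toy_check` is a stand-in with made-up noise and tolerance, not one of the actual tests.

```python
import random
from typing import Callable, Iterable, List


def failing_seeds(check: Callable[[int], None], seeds: Iterable[int]) -> List[int]:
    """Run check(seed) for each seed; return the seeds that raised AssertionError."""
    failures = []
    for seed in seeds:
        try:
            check(seed)
        except AssertionError:
            failures.append(seed)
    return failures


def toy_check(seed: int) -> None:
    # Hypothetical stand-in for a flaky test body: a noisy estimate
    # compared against a reference value with a fixed tolerance.
    rng = random.Random(seed)
    estimate = 1.0 + rng.gauss(0.0, 0.05)
    assert abs(estimate - 1.0) < 0.1


bad = failing_seeds(toy_check, range(200))
print(f"{len(bad)} of 200 seeds failed; first few: {bad[:10]}")
```

Swapping `toy_check` for a wrapper around the real test body would give a rough failure rate per test, which might help confirm whether the three tests fail for the same reason.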
I captured a few failures and attached them as a tarball. After manually reviewing the volumes, I believe the code may be functioning okay for the problem, but the test expectations look questionable.
Example graphics from the tarball for a "failing" C1 are below. Estimates on top, reference on bottom.
If it continues to come up, I'd suggest just xfailing the suite until it can be patched up.
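As a sketch of what xfailing the whole file could look like, assuming the suite runs under pytest, a module-level `pytestmark` near the top of test_mean_estimator_boosting.py would cover every test in it. The reason string is just a placeholder.

```python
import pytest

# Module-level mark: pytest applies it to every test collected from this file.
# strict=False means a test that happens to pass is reported as XPASS
# instead of being turned into a failure.
pytestmark = pytest.mark.xfail(
    reason="Flaky: non-deterministic failures, test expectations under review",
    strict=False,
)
```

Using `strict=False` matters here because the tests pass most of the time; with `strict=True` every passing run would be reported as a failure.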