Add Retry module for rescheduling failed builds on a different distribution #47
ntyni wants to merge 1 commit into debomatic:master
dktrkranz left a comment
I haven't checked yet, but I think providing an unsigned command file would cause its rejection if GPG support is enabled.

    with open('%s/%s' % (incoming, commands_file), 'w') as fd:
        command = 'rebuild %s %s' % (args.package, retry_dist)
        fd.write(command + '\n')

Indeed. Obviously we don't have GPG support enabled. Do you have any suggestions on how to tackle this? I suppose the module could sign the file if it has a secret key in the keyring.
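One way the signing idea could look is sketched below. It only assumes that a clear-signed commands file would be accepted when GPG support is enabled, which this thread does not confirm. The function names and the optional `key_id` parameter are hypothetical; the `gpg` options used (`--batch`, `--yes`, `--local-user`, `--output`, `--clearsign`) are standard GnuPG ones.

```python
import subprocess


def clearsign_argv(commands_path, signed_path, key_id=None):
    """Build a gpg command line that clear-signs commands_path into signed_path."""
    argv = ['gpg', '--batch', '--yes']
    if key_id is not None:
        # --local-user selects which secret key in the keyring signs the file
        argv += ['--local-user', key_id]
    argv += ['--output', signed_path, '--clearsign', commands_path]
    return argv


def sign_commands_file(commands_path, signed_path, key_id=None):
    """Run gpg; raises CalledProcessError if signing fails (e.g. no secret key)."""
    subprocess.run(clearsign_argv(commands_path, signed_path, key_id), check=True)
```

Keeping the argument list in its own function makes it possible to check the command line without a keyring being present.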

        fd.write(command + '\n')

    # work around a race condition with sbuild
    sleep(10)

Ideally, if DoM refused to enqueue a build while the same package is already building, this call would be unnecessary.
To me that seems a bit like a workaround for the lack of separation between build workers. In general, being able to build the same package in parallel for two different distributions seems like a useful thing to have. But that's not needed here as the builds are sequential, and I agree it would remove the race.
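For illustration, the "refuse to enqueue while the same package is building" idea might be sketched as a lock-protected set of package names. This is not Deb-o-Matic code; the `BuildRegistry` class and its methods are hypothetical.

```python
import threading


class BuildRegistry:
    """Tracks packages that currently have a build in progress."""

    def __init__(self):
        self._lock = threading.Lock()
        self._active = set()

    def try_start(self, package):
        """Record the build and return True, or return False if one is running."""
        with self._lock:
            if package in self._active:
                return False
            self._active.add(package)
            return True

    def finish(self, package):
        """Mark the build as done; call this only after all hooks have run."""
        with self._lock:
            self._active.discard(package)
```

Keying on the package name alone matches the suggestion and removes the race, at the cost of ruling out parallel builds of one package for two distributions. Keying on a (package, distribution) pair would allow those parallel builds but would not stop the retry from starting while the original build's hooks are still cleaning up.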
(There's a trivial conflict with #46 in the docs, happy to rebase on that if you like. Any suggestions for better fixing the race condition described below would be very welcome. Thanks!)
The primary use case for this is regression testing, where we want
to see if a set of packages causes build failures that do not happen
without them.
Having two distributions that differ only in these packages, we
can automatically trigger a retry on the 'base' distribution if
there's a failure on the 'modified' one.
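A minimal sketch of that retry step, under stated assumptions: the mapping `RETRY_MAP`, the distribution names in it, the function name, and the commands file naming are all hypothetical. Only the `rebuild <package> <distribution>` command line is taken from the diff in this PR.

```python
import os

# Hypothetical mapping from a 'modified' distribution to its 'base' one.
RETRY_MAP = {'unstable-modified': 'unstable'}


def write_retry_command(incoming, package, distribution, retry_map=RETRY_MAP):
    """Write a commands file asking for a rebuild on the base distribution.

    Returns the path written, or None when no retry is configured.
    """
    retry_dist = retry_map.get(distribution)
    if retry_dist is None:
        return None
    # The file name is illustrative; the module's real naming is not shown here.
    commands_file = '%s_%s.commands' % (package, retry_dist)
    path = os.path.join(incoming, commands_file)
    with open(path, 'w') as fd:
        fd.write('rebuild %s %s\n' % (package, retry_dist))
    return path
```

A failure on the base distribution itself has no entry in the mapping, so it does not trigger a further retry.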
The module is trivial otherwise, but there's an unfortunate race condition:
when we requeue a build from a post-build hook, another thread starts
sbuild while the original one proceeds to clean up its files, including
the .dsc in the incoming directory. This results in sbuild failing in a
very confusing way.
This is currently worked around by inserting a sleep call after requeuing
so the new sbuild can get its act together before the hooks are finished.
A better fix would probably be to have a distribution-specific place
where sbuild downloads its files.
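As a rough sketch of what a distribution-specific location could look like, assuming a layout of `<base_dir>/<distribution>/<package>` (the layout and the `build_workdir` helper are hypothetical, not how Deb-o-Matic or sbuild is configured today):

```python
import os


def build_workdir(base_dir, distribution, package):
    """Return (and create if needed) a directory private to one distribution's build."""
    path = os.path.join(base_dir, distribution, package)
    os.makedirs(path, exist_ok=True)
    return path
```

With separate directories, cleaning up after the failed build on the 'modified' distribution would not remove the .dsc that the retry on the 'base' distribution is reading.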