Add Retry module for rescheduling failed builds on a different distribution #47

Open
ntyni wants to merge 1 commit into debomatic:master from ntyni:retry

@ntyni ntyni commented Jun 10, 2018

(There's a trivial conflict with #46 in the docs, happy to rebase on that if you like. Any suggestions for better fixing the race condition described below would be very welcome. Thanks!)

The primary use case for this is regression testing, where we want
to see if a set of packages causes build failures that do not happen
without them.

Having two distributions that differ only in these packages, we
can automatically trigger a retry on the 'base' distribution if
there's a failure on the 'modified' one.

The module is otherwise trivial, but there's an unfortunate race condition:
when we requeue a build from a post-build hook, another thread
starts sbuild while the original one proceeds to clean up its files,
including the .dsc in the incoming directory. This results in sbuild
failing in a very confusing way.

This is currently worked around by inserting a sleep call after requeuing
so the new sbuild can get its act together before the hooks are finished.
A better fix would probably be to have a distribution-specific place
where sbuild downloads its files.
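
For readers who want to see the requeue step in isolation, here is a minimal sketch of it, assuming the request is a one-line commands file dropped into the incoming directory. The `rebuild <package> <distribution>` line is taken from the diff under review; the function name `write_rebuild_command`, the file name `retry.commands`, and the package and distribution names are hypothetical.

```python
import os
import tempfile


def write_rebuild_command(incoming, package, retry_dist, commands_file):
    """Write a one-line 'rebuild' request into the incoming directory.

    Hypothetical helper: only the 'rebuild <package> <dist>' line format
    comes from the patch; everything else is illustrative.
    """
    command = 'rebuild %s %s' % (package, retry_dist)
    path = os.path.join(incoming, commands_file)
    with open(path, 'w') as fd:
        fd.write(command + '\n')
    return path


if __name__ == '__main__':
    # Use a throwaway directory instead of a real incoming directory.
    with tempfile.TemporaryDirectory() as incoming:
        path = write_rebuild_command(
            incoming, 'hello_2.10-1', 'base', 'retry.commands')
        with open(path) as fd:
            print(fd.read(), end='')
```

Note that this sketch does nothing about the race itself: the original build's cleanup can still remove files the requeued build needs.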

@dktrkranz dktrkranz left a comment

I haven't checked yet, but I think providing an unsigned command file would cause its rejection if GPG support is enabled.


with open('%s/%s' % (incoming, commands_file), 'w') as fd:
    command = 'rebuild %s %s' % (args.package, retry_dist)
    fd.write(command + '\n')

ntyni (Contributor, Author) replied:

Indeed. Obviously we don't have gpg support enabled. Do you have any suggestions on how to tackle this? I suppose it could sign the file if it has an encryption key in the keyring.
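
One way the signing idea could look, as a rough sketch only: clear-sign the commands file with the `gpg` command-line tool before it is picked up. Whether debomatic would accept a file signed this way is an assumption that this thread does not confirm. The helper names `clearsign_argv` and `sign_commands_file` and the `key_id` parameter are hypothetical; the `gpg` options used (`--batch`, `--yes`, `--local-user`, `--output`, `--clearsign`) are standard GnuPG options.

```python
import os
import subprocess


def clearsign_argv(path, key_id, output):
    """Build a gpg command line that clear-signs `path` into `output`."""
    return [
        'gpg', '--batch', '--yes',
        '--local-user', key_id,   # secret key to sign with (assumed to exist)
        '--output', output,
        '--clearsign', path,
    ]


def sign_commands_file(path, key_id):
    """Replace `path` with a clear-signed copy; raises if gpg fails."""
    signed = path + '.asc'
    subprocess.run(clearsign_argv(path, key_id, signed), check=True)
    # Swap the signed file into place in a single rename.
    os.replace(signed, path)
```

For this to run unattended, the secret key would have to be usable without an interactive passphrase prompt (for example via an agent), which is a deployment decision outside the scope of this patch.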

fd.write(command + '\n')

# work around a race condition with sbuild
sleep(10)

dktrkranz (Contributor) commented:

Ideally, DoM would refuse to enqueue a build while the same package is already building, which would make this call unnecessary.

ntyni (Contributor, Author) replied:

To me that seems a bit like a workaround for the lack of separation between build workers. In general, being able to build the same package in parallel for two different distributions seems like a useful thing to have. But that's not needed here as the builds are sequential, and I agree it would remove the race.
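
The "refuse to enqueue while the same package is building" idea discussed above could be sketched roughly as follows. This is not debomatic's code: the `BuildRegistry` class and its method names are hypothetical, and it assumes builds run in threads of a single process so that a `threading.Lock` is enough to guard the bookkeeping.

```python
import threading


class BuildRegistry:
    """Track which packages are currently building (hypothetical helper)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._building = set()

    def try_start(self, package):
        """Return True and mark `package` as building, or False if it already is."""
        with self._lock:
            if package in self._building:
                return False
            self._building.add(package)
            return True

    def finish(self, package):
        """Mark `package` as no longer building; call after all hooks have run."""
        with self._lock:
            self._building.discard(package)
```

Keying the set on the package name alone matches the suggestion in the review comment; keying it on a `(package, distribution)` pair instead would allow parallel builds for different distributions, but would then not prevent the race by itself. A refused request would also have to be deferred or resubmitted somehow, which the thread leaves open.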
