Abstract - Continuous Integration (CI) is a process for automatically checking patch sets for errors. CI periodically fails due to non-deterministic (a.k.a., "flaky") behaviour. Since a patch set may not be the cause of a flaky failure, developers can issue a "recheck" command to request retesting a patch set. Developers waste time considering whether or not to issue a recheck after a CI failure. Prior work also shows that rechecks are issued liberally, wasting up to 187.4 compute years when CI continues to fail. To save developer time and avoid wasteful rechecks, we fit and analyze statistical models that discriminate between successful and failing rechecks, i.e., those rechecks that will change a failing CI run into a successful one and those that will fail again. Through an empirical study of 314,947 recheck requests from OpenStack, we find that our model can differentiate successful and failed rechecks well, outperforming baseline approaches by 23.6 percentage points in terms of AUROC (0.736).
Analysis of our model suggests that, in terms of explanatory power, the past behaviour of jobs, bots, and users dominates static characteristics of patch sets. Applying our model to automatically request rechecks for those predicted to succeed would have saved roughly 247 years of elapsed developer time for OpenStack. Applying our model to skip recheck requests when they are predicted to fail would avoid 86.49% of wasted rechecks, saving roughly 262 years of compute time.
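For readers unfamiliar with the AUROC figure quoted above, the following is a minimal sketch of how the metric can be computed for a recheck classifier. It is not the authors' implementation: the function name `auroc`, the toy labels, and the toy scores are all hypothetical and chosen only for illustration. AUROC is the probability that a randomly chosen successful recheck receives a higher predicted score than a randomly chosen failed one, so a predictor that cannot separate the two classes scores 0.5. The reported 0.736 and the 23.6 percentage point margin are arithmetically consistent with a baseline at that chance level, although the abstract does not state which baselines were used.

```python
def auroc(labels, scores):
    """Probability that a random positive outscores a random negative.

    labels: 1 for a recheck that succeeded, 0 for one that failed again.
    scores: predicted likelihood of success (higher means more likely).
    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the OpenStack study.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.7, 0.2, 0.6]

model_auroc = auroc(labels, scores)                  # 7 of 9 pairs ranked correctly
baseline_auroc = auroc(labels, [0.5] * len(labels))  # constant score: all ties

print(f"model AUROC:    {model_auroc:.3f}")
print(f"baseline AUROC: {baseline_auroc:.3f}")
print(f"gain: {100 * (model_auroc - baseline_auroc):.1f} percentage points")
```

The pairwise loop is quadratic in the number of rechecks, which is fine for a toy example. At the scale of hundreds of thousands of recheck requests, a rank-based formulation or a library routine would be the practical choice.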
Preprint - PDF
Bibtex
@inproceedings{brus2025ase,
Author = {Yelizaveta Brus and Rungroj Maipradit and Earl T. Barr and Shane McIntosh},
Title = {{Rechecking Recheck Requests in Continuous Integration: An Empirical Study of OpenStack}},
Year = {2025},
Booktitle = {Proc. of the International Conference on Automated Software Engineering (ASE)},
Pages = {To appear}
}