Improving the Robustness and Efficiency of Continuous Integration and Deployment

Authors - Keheliya Gallaba
Venue - McGill University, pp. 1-149, 2021

Related Tags - Theses, 2021, continuous integration, build performance

Abstract - Modern software is developed at a rapid pace. To sustain that pace, organizations rely heavily on automated build, test, and release steps. To that end, Continuous Integration and Continuous Deployment (CI/CD) services take the incremental codebase changes that developers produce; compile, link, and package them into software deliverables; verify their functionality; and deliver them to end users.

While CI/CD processes provide mission-critical features, if they are misconfigured or poorly operated, the pace of development may be slowed or even halted. To prevent such issues, in this thesis, we set out to study and improve the robustness and efficiency of CI/CD processes.

First, we present two empirical studies that focus on robust configuration of CI/CD processes. To understand the ways in which CI/CD features are being used, we analyze a curated sample of 9,312 open source projects that are hosted on GitHub and have adopted the popular Travis CI service. We find that explicit deployment code is rare. Then, to analyze feature misuse, we propose Hansel, an anti-pattern detection tool for Travis CI specifications. We define four anti-patterns, and Hansel detects instances of them in the Travis CI specifications of 894 projects (10%) in the corpus. Furthermore, we propose Gretel, an anti-pattern removal tool for Travis CI specifications, which can automatically remove 70% of the instances of the most frequently occurring anti-pattern.

Our third empirical study focuses on robust CI/CD outcome data. In this work, we use openly available project metadata and CI/CD results of 1,276 GitHub projects that use Travis CI to better understand the extent to which noise and heterogeneity are present in CI/CD outcome data. We find that: (1) 12% of passing builds have an actively ignored failure; (2) 9% of builds have a misleading or incorrect outcome on average; and (3) in at least 44% of the broken builds, the breakage is local to a subset of build variants.

Our fourth empirical study focuses on improving the efficiency of CI/CD services. We propose a programming language-agnostic approach to infer data from which build acceleration decisions can be made without relying upon build specifications. After inferring this data, our approach accelerates CI builds by caching the build environment and skipping unaffected build steps. To evaluate our approach, we mine 14,364 historical CI build records spanning three proprietary and seven open-source software projects. We find that accelerated builds achieve a substantial speed-up (two-fold in 74% of accelerated builds) with minimal resource overhead (i.e., < 1% median CPU usage, 2 MB – 2.2 GB median memory usage, and 0.4 GB – 5.2 GB median storage overhead).

Our final empirical study identifies opportunities for service providers to improve the robustness and efficiency of CI/CD processes by analyzing signal-generating builds (i.e., builds that pass or fail due to project factors) and non-signal-generating builds (e.g., incomplete builds due to provider infrastructure issues). In this study, we analyze 23.3 million builds spanning 7,795 open source projects that used the CircleCI service from 2012 to 2020. Our observations demonstrate the ways in which existing research breakthroughs (e.g., build acceleration, automated program repair) may benefit CI/CD providers, as well as the ways in which these approaches should be tailored to generate the most value. For example, since the heaviest users account for a growing proportion of the build activity and resources over the studied time period (measures of inequality like the Gini coefficient growing from 14% to 98%), approaches that focus on optimizing these projects will likely generate more value for service providers than blanket solutions. Furthermore, efficiency in CI pipelines can be improved by reducing bottlenecks in the compilation and testing stages of signal-generating builds. Addressing configuration and resource allocation issues will reduce the number of non-signal-generating builds, increasing the robustness of CI pipelines.

Preprint - PDF


  Author = {Keheliya Gallaba},
  Title = {{Improving the Robustness and Efficiency of Continuous Integration and Deployment}},
  Year = {2021},
  School = {McGill University},
  Address = {3480 Rue University, Montréal, QC, Canada},
  Month = {November}