Abstract - Although decentralized Version Control Systems (VCSs) like Git support several organizational structures, development activity typically coalesces in a central copy of the repository, which is also where official software releases are cut. Popular practices like trunk-based development and organization-wide monolithic repositories (a.k.a. "monorepos") strain central repositories. Remedial actions, such as running garbage collection routines, can backfire: they are computationally expensive and, if run at an inopportune moment, may degrade repository performance or cause the host to crash.
In this paper, we propose a reinforcement learning agent that can take remedial actions to sustain VCS performance. Since large volumes of VCS activity are needed to train the agent, we first augment the VCS to enable greater throughput, observing that the augmented VCS outperforms the stock VCS to a large, statistically significant degree. Then, we compare the performance that a VCS can sustain under the agent with the performance it can sustain under a schedule-based garbage collection policy and under a no-action baseline, observing 64- to 82-fold improvements in the Area Under the Curve (AUC) of repository performance plotted over time. This paper takes a promising first step towards automatically sustaining VCS performance under heavy workloads.
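To illustrate the AUC metric mentioned above, the following is a minimal sketch of how a fold improvement between two policies could be computed from sampled performance curves. It assumes trapezoidal integration and uses made-up throughput samples; the function names `auc_trapezoid` and `fold_improvement` are hypothetical, and the numbers are not results from the paper.

```python
def auc_trapezoid(times, values):
    """Area under a sampled performance curve, via the trapezoidal rule."""
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need at least two (time, value) samples of equal length")
    area = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        if dt <= 0:
            raise ValueError("times must be strictly increasing")
        # Each segment contributes the area of one trapezoid.
        area += 0.5 * (values[i] + values[i - 1]) * dt
    return area


def fold_improvement(policy_auc, baseline_auc):
    """Ratio of a policy's AUC to the baseline's AUC."""
    if baseline_auc <= 0:
        raise ValueError("baseline AUC must be positive")
    return policy_auc / baseline_auc


# Illustrative samples only (assumed units: operations per second over time).
times = [0, 1, 2, 3]
sustained = [10, 10, 10, 10]  # performance stays flat
degrading = [10, 4, 1, 0]     # performance decays with no remedial action

sustained_auc = auc_trapezoid(times, sustained)
degrading_auc = auc_trapezoid(times, degrading)
print(fold_improvement(sustained_auc, degrading_auc))
```

A policy that keeps performance from decaying accumulates more area over the same time window, so the ratio of the two areas summarizes how much longer and how much better the repository performs.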