Enriching Code Coverage with Test Characteristics

Authors - Shivashree Vysali
Venue - McGill University, pp. 1-81, 2020

Abstract - Code coverage measures the degree to which source code elements (e.g., statements, branches) are invoked during testing. Despite growing evidence that coverage is a problematic measurement, it is often used to make decisions about where testing effort should be invested. For example, using coverage as a guide, tests should be written to invoke the non-covered program elements. At their core, coverage measurements assume that invocation of a program element during any test is equally valuable and only provide a binary covered-or-not classification of program elements. Yet in reality, tests have varied characteristics and coverage can be enriched by incorporating these test characteristics. In this thesis, we expand code coverage classification by adding scope (e.g., unit, functional) and reliability (flaky vs. robust) characteristics of the tests to the coverage report. We perform an empirical study of three large software systems from the OpenStack community, namely, Nova, Neutron, and Cinder.
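To make the idea of an enriched coverage report concrete, the following is a minimal sketch and not the thesis's implementation. It assumes per-test coverage is already available as a mapping from test name to covered statements, and it assumes a statement counts as flakily covered when every test that reaches it is flaky; the abstract does not give the exact definition. The function name enrich_coverage and all parameter names are hypothetical.

```python
from collections import defaultdict


def enrich_coverage(test_coverage, test_scope, flaky_tests):
    """Label each covered statement with test scope and reliability.

    test_coverage: dict of test name -> set of (file, line) statements
    test_scope:    dict of test name -> "unit" or "functional"
    flaky_tests:   set of test names considered flaky

    Assumption (not from the thesis): a statement is "flaky" only if
    no robust test covers it.
    """
    scopes = defaultdict(set)        # statement -> scopes of covering tests
    robust_hit = defaultdict(bool)   # statement -> covered by a robust test?

    for test, statements in test_coverage.items():
        for stmt in statements:
            scopes[stmt].add(test_scope[test])
            if test not in flaky_tests:
                robust_hit[stmt] = True

    report = {}
    for stmt, scope_set in scopes.items():
        report[stmt] = {
            # "both" when unit and functional tests reach the statement
            "scope": "both" if len(scope_set) > 1 else next(iter(scope_set)),
            "reliability": "robust" if robust_hit[stmt] else "flaky",
        }
    return report
```

Statements that never appear in any test's coverage set are simply absent from the report, which corresponds to the usual "not covered" classification.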

We generate an enriched statement coverage report and glean additional insights. We observe that 60.94% and 63.33% of statements are covered by both unit and functional tests in Neutron and Nova, respectively, while only 30% are covered by both types of tests in Cinder. We find that systems are disproportionately impacted by flakily covered statements, with 5% and 10% of the covered statements in Nova and Neutron being flakily covered, respectively, while less than 1% of Cinder statements are flakily covered. We also find that the incidence of flakily covered statements could not be explained well using code characteristics alone, such as dispersion, ownership, and development activity. To understand the cost effectiveness of enriching code coverage, we propose GreedyFlake, a test effort prioritization algorithm that maximizes return on investment when tackling the problem of flakily covered program elements. We find that GreedyFlake outperforms baseline approaches by at least eight percentage points in Area Under the Cost Effectiveness Curve.
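The abstract does not describe how GreedyFlake works internally, so the following is only a generic sketch of greedy cost-benefit prioritization and of one common way to compute an area under a cost effectiveness curve. It is not the thesis's algorithm. The names greedy_order and aucec, the (name, cost, benefit) tuple layout, and the trapezoidal normalization are all assumptions made for illustration.

```python
def greedy_order(items):
    """Order candidates by benefit per unit of cost, highest first.

    items: list of (name, cost, benefit) tuples with cost > 0.
    Hypothetical reading: benefit could be the number of flakily covered
    statements addressed, and cost the effort needed to address them.
    """
    return sorted(items, key=lambda it: it[2] / it[1], reverse=True)


def aucec(ordered):
    """Normalized area under the cumulative-benefit vs. cumulative-cost curve.

    Uses the trapezoidal rule on both axes scaled to [0, 1].
    Assumes total cost and total benefit are both positive.
    """
    total_cost = sum(cost for _, cost, _ in ordered)
    total_benefit = sum(benefit for _, _, benefit in ordered)
    area = 0.0
    x = y = 0.0  # cumulative cost and benefit fractions so far
    for _, cost, benefit in ordered:
        next_x = x + cost / total_cost
        next_y = y + benefit / total_benefit
        area += (next_x - x) * (y + next_y) / 2
        x, y = next_x, next_y
    return area
```

Under these assumptions, an ordering that delivers more benefit early for the same effort yields a larger area, which is the sense in which one prioritization can outperform another by some number of percentage points.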

Preprint - PDF

Bibtex

@mastersthesis{vysali2020masters,
  Author = {Shivashree Vysali},
  Title = {{Enriching Code Coverage with Test Characteristics}},
  Year = {2020},
  School = {McGill University},
  Address = {3480 Rue University, Montréal, QC, Canada},
  Month = {December}
}