Abstract - Software repository mining techniques can provide insights about software systems and their development processes through metrics that aim to capture a construct of interest. However, linking development history metrics to high-level constructs is fraught with threats to validity. We conducted a case study in which we critically reviewed the underlying artifacts used to compute a previously proposed metric of knowledge at risk in software projects. The case study revealed eight major threats to validity that may generalize to other software process metrics derived from repository data. In addition to a detailed description of each threat, we contribute a questionnaire to facilitate the assessment of these threats in past and future studies.