Estimating Software Project Performance Using Factor Analysis and Sequential Equation Modelling
Abstract
Project performance is a critical aspect of software development, as it is of any professional work cycle. However, evaluating software projects from source code metrics has long been a challenging task for project managers. In this paper we propose a statistics-based pipeline that converts low-level source code metrics of a software project, such as commit count, into human-readable, interpretable latent factors, such as code quality, which can then be used to evaluate performance. The study uses a dynamic dataset of 1,000 completed and ongoing software projects from GitHub. Structural equation modeling (SEM) and factor analysis (FA) are used to extract meaning from the project metrics. Our SEM analysis yielded a $p$-value of less than $10^{-7}$, and the scree plot of eigenvalues against the number of factors in the FA provided supporting evidence for the selection of underlying factors. The performance indicators identified by the proposed approach are code quality, developer motivation, and repository maintainability. Finally, we show that the results of the proposed method are easy to interpret: a software team leader can use them to review ongoing projects, derive insight into them, and decide on appropriate next steps.