"High-Dimensional Software Engineering Data and Feature Selection" by Huanjing Wang, Taghi M. Khoshgoftaar et al.

Computer Science Faculty Publications

Title

High-Dimensional Software Engineering Data and Feature Selection

Authors

Huanjing Wang, Western Kentucky University
Taghi M. Khoshgoftaar, Florida Atlantic University
Kehan Gao, Eastern Connecticut State University

Abstract

Software metrics collected during project development play a critical role in software quality assurance. A software practitioner is very keen on learning which software metrics to focus on for software quality prediction. While a concise set of software metrics is often desired, a typical project collects a very large number of metrics. Minimal attention has been devoted to finding the minimum set of software metrics that has the same predictive capability as a larger set of metrics – we strive to answer that question in this paper. We present a comprehensive comparison between seven commonly used filter-based feature ranking techniques (FRT) and our proposed hybrid feature selection (HFS) technique. Our case study consists of a very high-dimensional (42 software attributes) software measurement data set obtained from a large telecommunications system. The empirical analysis indicates that HFS performs better than FRT; however, the Kolmogorov-Smirnov feature ranking technique demonstrates competitive performance. For the telecommunications system, it is found that only 10% of the software attributes are sufficient for effective software quality prediction.
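To make the idea of filter-based feature ranking concrete, the following is a minimal sketch of a Kolmogorov-Smirnov ranker. It is an illustration only and not the authors' implementation: the abstract does not describe their procedure, so the scoring rule (two-sample KS statistic between the fault-prone and not-fault-prone classes), the function names `ks_statistic` and `rank_features_by_ks`, the feature names, and the toy data are all assumptions.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    # The largest gap is always attained at one of the observed values.
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )


def rank_features_by_ks(rows, labels, names):
    """Return (name, score) pairs sorted from most to least class-separating.

    rows   -- list of numeric feature vectors, one per software module
    labels -- parallel list of binary labels (1 = fault-prone, 0 = not)
    names  -- feature names, one per column
    """
    scores = []
    for j, name in enumerate(names):
        pos = [r[j] for r, y in zip(rows, labels) if y == 1]
        neg = [r[j] for r, y in zip(rows, labels) if y == 0]
        scores.append((name, ks_statistic(pos, neg)))
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Toy data, made up for illustration (not taken from the paper).
names = ["loc", "churn", "noise"]
rows = [
    [120, 9, 3], [150, 8, 1], [130, 7, 4], [160, 9, 2],  # fault-prone
    [40, 2, 3], [55, 8, 1], [35, 1, 4], [60, 3, 2],      # not fault-prone
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

ranking = rank_features_by_ks(rows, labels, names)
top_k = max(1, round(0.10 * len(names)))  # keep roughly the top 10% of features
selected = [name for name, _ in ranking[:top_k]]
```

A filter of this kind scores each attribute independently of any classifier, which is what makes it cheap to apply to a large metric set. Under the same 10% cut-off, a 42-attribute data set would keep roughly four attributes.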

Disciplines

Artificial Intelligence and Robotics | Databases and Information Systems | Other Computer Sciences

Recommended Repository Citation

Wang, Huanjing; Khoshgoftaar, Taghi M.; and Gao, Kehan. (2009). High-Dimensional Software Engineering Data and Feature Selection. 2009 21st IEEE International Conference on Tools with Artificial Intelligence, 83-90.
Available at: https://digitalcommons.wku.edu/comp_sci/3



