"A Comparative Study of Threshold-based Feature Selection Techniques" by Huanjing Wang, Taghi M. Khoshgoftaar et al.

Computer Science Faculty Publications

Title

A Comparative Study of Threshold-based Feature Selection Techniques

Authors

Huanjing Wang, Western Kentucky University
Taghi M. Khoshgoftaar, Florida Atlantic University
Jason Van Hulse, Florida Atlantic University

Abstract

Given high-dimensional software measurement data, researchers and practitioners often use feature (metric) selection techniques to improve the performance of software quality classification models. This paper presents our newly proposed threshold-based feature selection techniques and compares their performance by building classification models with five commonly used classifiers. To assess the effectiveness of the different feature selection techniques, the models are evaluated separately with eight performance metrics, since a given performance metric usually captures only one aspect of classification performance. All experiments are conducted on three Eclipse data sets with different levels of class imbalance. The experiments demonstrate that the choice of performance metric may significantly influence the results. In this study, we found four distinct patterns when using the eight performance metrics to order the 11 threshold-based feature selection techniques. Moreover, the performance of the software quality models either improves or remains unchanged despite the removal of over 96% of the software metrics (attributes).

Disciplines

Artificial Intelligence and Robotics | Databases and Information Systems | Other Computer Sciences

Recommended Repository Citation

Wang, Huanjing; Khoshgoftaar, Taghi M.; and Van Hulse, Jason. (2010). A Comparative Study of Threshold-based Feature Selection Techniques. Proceedings of 2010 IEEE International Conference on Granular Computing (GrC 2010).
Available at: https://digitalcommons.wku.edu/comp_sci/6
