Abstract
Feature selection has been applied in many domains, such as text mining and software engineering. Ideally, a feature selection technique should produce consistent outputs regardless of minor variations in the input data. Researchers have recently begun to examine the stability (robustness) of feature selection techniques. The stability of a feature selection method is defined as the degree of agreement among its outputs on randomly selected subsets of the same input data. This study evaluated the stability of 11 threshold-based feature ranking techniques (rankers) when applied to 16 real-world software measurement datasets of different sizes. Experimental results demonstrate that AUC (Area Under the Receiver Operating Characteristic Curve) and PRC (Area Under the Precision-Recall Curve) performed best among the 11 rankers.
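The following is a minimal sketch of the kind of stability measurement described above. It assumes mean pairwise Kendall's tau over rankings obtained from random subsamples as the agreement measure, and uses a single-feature AUC scorer as one illustrative threshold-based ranker; the paper's exact stability metric and subsampling scheme are not specified in this abstract.

    # Sketch: stability of a threshold-based ranker via random subsampling.
    # Assumptions (not from the paper): agreement = mean pairwise Kendall's tau,
    # ranker = per-feature AUC of using that feature alone to separate the classes.
    from itertools import combinations

    import numpy as np
    from scipy.stats import kendalltau
    from sklearn.metrics import roc_auc_score

    def auc_scores(X, y):
        """Score each feature by the AUC of that feature used alone as a classifier."""
        return np.array([roc_auc_score(y, X[:, j]) for j in range(X.shape[1])])

    def stability(X, y, score_fn=auc_scores, n_subsets=30, frac=0.9, seed=0):
        """Mean pairwise Kendall's tau between feature scores from random subsamples."""
        rng = np.random.default_rng(seed)
        n = X.shape[0]
        score_vectors = []
        for _ in range(n_subsets):
            idx = rng.choice(n, size=int(frac * n), replace=False)
            score_vectors.append(score_fn(X[idx], y[idx]))
        taus = [kendalltau(a, b)[0] for a, b in combinations(score_vectors, 2)]
        return float(np.mean(taus))

A value near 1 indicates that the ranker orders features almost identically across subsamples (high stability), while a value near 0 indicates little agreement.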
Disciplines
Computer Engineering | Computer Sciences | Engineering | Physical Sciences and Mathematics
Recommended Repository Citation
Wang, Huanjing and Khoshgoftaar, Taghi. (2011). Measuring Stability of Threshold-based Feature Selection Techniques. The 23rd IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2011).
Available at:
https://digitalcommons.wku.edu/comp_sci/9