Publication Date
Spring 2022
Advisor(s) - Committee Chair
Qui Li (Director), Guangming Xing, Zhonghang Xia
Degree Program
School of Engineering and Applied Sciences
Degree Type
Master of Science
Abstract
Clustering is an important topic in data modeling. K-means clustering is a well-known partitional clustering algorithm that separates a dataset into groups whose members share similar properties. Clustering an unbalanced dataset, in which some groups contain far more data points than others, is a challenging problem. When K-means with Euclidean distance is applied to such data, the algorithm fails to form good clusters. Standard K-means tends to divide the data evenly during clustering, splitting it into smaller clusters.
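The behavior described above can be explored with a minimal sketch of standard K-means (Lloyd's algorithm) using Euclidean distance. This is not code from the thesis: the function name `kmeans_euclidean`, the synthetic data, the group sizes (1000 and 30), and the random seeds are all assumptions chosen for illustration.

```python
import numpy as np

def kmeans_euclidean(X, k, n_iter=100, seed=0):
    """Lloyd's algorithm with squared Euclidean distance (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct data points (a simple, common choice).
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Squared distance from every point to every centroid, shape (n, k).
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new_centroids = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members):  # keep the old centroid if a cluster becomes empty
                new_centroids[j] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break  # centroids stopped moving
        centroids = new_centroids
    return labels, centroids

# Hypothetical unbalanced data: one large, spread-out group and one small, tight group.
rng = np.random.default_rng(42)
large = rng.normal(loc=[0.0, 0.0], scale=2.0, size=(1000, 2))
small = rng.normal(loc=[6.0, 0.0], scale=0.3, size=(30, 2))
X = np.vstack([large, small])

labels, centroids = kmeans_euclidean(X, k=2)
print(np.bincount(labels, minlength=2))  # cluster sizes found by K-means
```

Comparing the printed cluster sizes with the true 1000/30 split gives a quick indication of how far the Euclidean assignment departs from the underlying groups on this particular sample.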
We propose a new K-means clustering algorithm that overcomes this disadvantage by introducing a different distance metric. The new metric is inspired by Newton's law of universal gravitation, under which an object of smaller mass is drawn toward an object of larger mass. Experimental results show the effectiveness of the new metric through visual comparison with Euclidean distance. Quantitative comparisons using the Davies-Bouldin index also show the superiority of the new metric.
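For readers unfamiliar with the Davies-Bouldin index, the sketch below computes it from its standard definition: for each cluster, take the worst ratio of combined within-cluster scatter to centroid separation, then average over clusters. Lower values indicate tighter, better separated clusters. The function name `davies_bouldin` and the four-point toy dataset are assumptions for illustration. The abstract does not give the formula for the gravity-based metric, so it is not reproduced here.

```python
import numpy as np

def davies_bouldin(X, labels):
    """Davies-Bouldin index for a labelling with at least two non-empty clusters."""
    ids = np.unique(labels)
    centroids = np.array([X[labels == k].mean(axis=0) for k in ids])
    # Scatter: mean Euclidean distance from a cluster's points to its centroid.
    scatter = np.array([
        np.linalg.norm(X[labels == k] - c, axis=1).mean()
        for k, c in zip(ids, centroids)
    ])
    worst = []
    for i in range(len(ids)):
        # Similarity of cluster i to every other cluster; keep the worst case.
        ratios = [
            (scatter[i] + scatter[j]) / np.linalg.norm(centroids[i] - centroids[j])
            for j in range(len(ids)) if j != i
        ]
        worst.append(max(ratios))
    return float(np.mean(worst))

# Toy data (hypothetical): two tight pairs of points, 10 units apart.
X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]])
good = np.array([0, 0, 1, 1])  # groups the nearby points together
bad = np.array([0, 1, 0, 1])   # mixes the two pairs

print(davies_bouldin(X, good))  # about 0.1
print(davies_bouldin(X, bad))   # about 10.0
```

In practice, scikit-learn's `sklearn.metrics.davies_bouldin_score` computes the same quantity and can be used to compare the labels produced by two different distance metrics on the same dataset.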
Disciplines
Computer Engineering | Computer Sciences
Recommended Citation
Indulkar, Ajinkya Vishwas, "K-Means Clustering Using Gravity Distance" (2022). Masters Theses & Specialist Projects. Paper 3580.
https://digitalcommons.wku.edu/theses/3580