Evaluation of data analytics based clustering algorithms for knowledge mining in a student engagement data

No Thumbnail Available

Date

Journal Title

Journal ISSN

Volume Title

Publisher

Abstract

Description

The application of algorithms based on data analytics for the task of knowledge mining in a student dataset is an important strategy for improving learning outcomes, student success and supporting strategic decision making in higher educa�tional institutions of learning. However, the widely used data analytics based clustering algorithms are highly data dependent, making it pertinent to find the most effective algorithm for knowledge mining in a dataset associated with student engage�ment. In this study, performances of five famous clustering algorithms are evaluated for this purpose. The k-means algorithm was benchmarked with 22 distance functions based on the Silhouette index, Dunn’s index and partition entropy internal valid�ity metrics. The hierarchical clustering algorithm was benchmarked with the Cophenetic correlation coefficient computed for different combinations of distance and linkage functions. The Fuzzy c-means algorithm was benchmarked with the partition entropy, partition coefficient, Silhouette index and modified partition coefficient. The k-nearest neighbor algorithm was applied to determine the optimum epsilon value for the density-based spatial clustering of applications with noise. The default param�eter settings were accepted for the expectation-maximization algorithm. The overall ranking of the clustering algorithms was based on cluster potentiality using the median deviation statistics. The results of the evaluation show the well-known k-means algorithm to have the highest cluster potentiality, demonstrating its effectiveness for the task of knowledge mining in a student engagement dataset

Keywords

Q Science (General)

Citation

Collections

Endorsement

Review

Supplemented By

Referenced By