Finding the optimal number of clusters has remained to be a challenging problem in data mining research community. Several approaches have been suggested which include evolutionary computation techniques like genetic ...Finding the optimal number of clusters has remained to be a challenging problem in data mining research community. Several approaches have been suggested which include evolutionary computation techniques like genetic algorithm, particle swarm optimization, differential evolution etc. for addressing this issue. Many variants of the hybridization of these approaches also have been tried by researchers. However, the number of optimal clusters and the computational efficiency has still remained open for further research. In this paper, a new optimization technique known as “Teaching-Learning-Based Optimization” (TLBO) is implemented for automatic clustering of large unlabeled data sets. In contrast to most of the existing clustering techniques, the proposed algorithm requires no prior knowledge of the data to be classified rather it determines the optimal number of partitions of the data “on the run”. The new AUTO-TLBO algorithms are evaluated on benchmark datasets (collected from UCI machine repository) and performance comparisons are made with some well-known clustering algorithms. Results show that AUTO-TLBO clustering techniques have much potential in terms of comparative results and time of computations.展开更多
文摘Finding the optimal number of clusters has remained to be a challenging problem in data mining research community. Several approaches have been suggested which include evolutionary computation techniques like genetic algorithm, particle swarm optimization, differential evolution etc. for addressing this issue. Many variants of the hybridization of these approaches also have been tried by researchers. However, the number of optimal clusters and the computational efficiency has still remained open for further research. In this paper, a new optimization technique known as “Teaching-Learning-Based Optimization” (TLBO) is implemented for automatic clustering of large unlabeled data sets. In contrast to most of the existing clustering techniques, the proposed algorithm requires no prior knowledge of the data to be classified rather it determines the optimal number of partitions of the data “on the run”. The new AUTO-TLBO algorithms are evaluated on benchmark datasets (collected from UCI machine repository) and performance comparisons are made with some well-known clustering algorithms. Results show that AUTO-TLBO clustering techniques have much potential in terms of comparative results and time of computations.