One of the most important problems of clustering is to define the number of classes. In fact, it is not easy to find an appropriate method to measure whether the cluster configuration is acceptable or not. In this pap...One of the most important problems of clustering is to define the number of classes. In fact, it is not easy to find an appropriate method to measure whether the cluster configuration is acceptable or not. In this paper we propose a possible and non-automatic solution considering different criteria of clustering and comparing their results. In this way robust structures of an analyzed dataset can be often caught (or established) and an optimal cluster configuration, which presents a meaningful association, may be defined. In particular, we also focus on the variables which may be used in cluster analysis. In fact, variables which contain little clustering information can cause misleading and not-robustness results. Therefore, three algorithms are employed in this study: K-means partitioning methods, Partitioning Around Medoids (PAM) and the Heuristic Identification of Noisy Variables (HINoV). The results are compared with robust methods ones.展开更多
文摘One of the most important problems of clustering is to define the number of classes. In fact, it is not easy to find an appropriate method to measure whether the cluster configuration is acceptable or not. In this paper we propose a possible and non-automatic solution considering different criteria of clustering and comparing their results. In this way robust structures of an analyzed dataset can be often caught (or established) and an optimal cluster configuration, which presents a meaningful association, may be defined. In particular, we also focus on the variables which may be used in cluster analysis. In fact, variables which contain little clustering information can cause misleading and not-robustness results. Therefore, three algorithms are employed in this study: K-means partitioning methods, Partitioning Around Medoids (PAM) and the Heuristic Identification of Noisy Variables (HINoV). The results are compared with robust methods ones.