Abstract
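Sequential clustering algorithms are direct and fast, and they do not require the number of clusters to be fixed in advance. When processing massive data, however, their time efficiency still leaves room for improvement. TTSAS is a sequential clustering algorithm that uses two thresholds. Building on it, this paper applies the triangle inequality to propose the TI_TTSAS algorithm, which avoids redundant distance calculations. Experimental results show that TI_TTSAS is considerably faster than TTSAS, that the improvement grows with the size of the data, and that the clustering results retain the accuracy of TTSAS.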
The sequential algorithm is a straightforward clustering algorithm that does not require the number of clusters to be provided in advance. However, when faced with large-scale data, the efficiency of the algorithm still needs to be improved. Based on the two-threshold sequential algorithmic scheme (TTSAS), this article presents a new sequential algorithm, TI_TTSAS, which avoids unnecessary distance calculations by applying the triangle inequality. Experiments show that the new algorithm is more effective for datasets of more dimensions, and becomes more and more effective as the number of clusters increases. The results keep the accuracy of the TTSAS algorithm.
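The abstract does not spell out the pruning rule used by TI_TTSAS, so the following is only a minimal sketch of how the triangle inequality can skip distance calculations when searching for the nearest cluster representative. It assumes Euclidean distance and a precomputed table of distances between representatives; the names `dist`, `nearest_center`, `centers`, and `center_dists` are hypothetical and do not come from the paper. The test used here is the common one: if d(c_best, c_j) >= 2 * d(x, c_best), then d(x, c_j) >= d(x, c_best), so c_j cannot be closer and its distance need not be computed.

```python
import math


def dist(a, b):
    """Euclidean distance between two points given as sequences of floats."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_center(x, centers, center_dists):
    """Find the representative nearest to x, skipping provably farther ones.

    Assumption (not taken from the paper): center_dists[i][j] holds the
    precomputed distance between centers[i] and centers[j].
    Returns (index, distance, number_of_distances_computed).
    """
    best = 0
    best_d = dist(x, centers[0])
    computed = 1
    for j in range(1, len(centers)):
        # Triangle inequality: d(x, c_j) >= d(c_best, c_j) - d(x, c_best).
        # If d(c_best, c_j) >= 2 * d(x, c_best), c_j cannot beat c_best.
        if center_dists[best][j] >= 2 * best_d:
            continue
        d = dist(x, centers[j])
        computed += 1
        if d < best_d:
            best, best_d = j, d
    return best, best_d, computed


# Hypothetical toy data: three well-separated representatives.
centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
center_dists = [[dist(a, b) for b in centers] for a in centers]

idx, d, computed = nearest_center((1.0, 1.0), centers, center_dists)
print(idx, round(d, 3), computed)
```

In a two-threshold scheme, the returned distance would then be compared with the two thresholds to decide whether the point joins the nearest cluster, starts a new one, or is deferred to a later pass; that step is omitted here because the abstract gives no details about it.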
Source
《计算机工程》
EI
CAS
CSCD
PKU Core (Peking University core journal list)
2006, No. 17, pp. 97-99, 125 (4 pages in total)
Computer Engineering
Funding
Natural Science Foundation of Gansu Province (3ZS051-A25-035)
Innovation Fund of the Gansu Provincial Meteorological Bureau (2005)