Abstract
Extracting effective feature data from samples that are large, structurally complex, and highly dispersed is a central and difficult problem in pattern recognition. The ISODATA algorithm is one of the commonly used algorithms for clustering large-sample data, but its drawback is that the initial clustering parameters must be determined in advance. A method based on the golden section method is proposed to measure the validity of clustering; it dynamically calculates the clustering metric parameters and achieves effective clustering of large-sample data. Experiments show that the method can select, from the original features, the most representative features with the best classification performance.
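The abstract names the golden section method but does not say how it is coupled to ISODATA. As a rough illustration only, the sketch below shows a generic golden-section search that minimizes a unimodal function of one scalar parameter. The function `toy_validity`, the parameter name `theta`, and the search interval are hypothetical stand-ins, not the validity measure or parameters used in the paper.

```python
import math

# Inverse golden ratio, approximately 0.618.
INV_PHI = (math.sqrt(5) - 1) / 2


def golden_section_minimize(f, lo, hi, tol=1e-6):
    """Approximate the minimizer of a unimodal function f on [lo, hi]."""
    a, b = lo, hi
    # Two interior probe points placed at golden-ratio positions.
    c = b - INV_PHI * (b - a)
    d = a + INV_PHI * (b - a)
    fc, fd = f(c), f(d)
    while (b - a) > tol:
        if fc < fd:
            # Minimum lies in [a, d]; the old c becomes the new d.
            b, d, fd = d, c, fc
            c = b - INV_PHI * (b - a)
            fc = f(c)
        else:
            # Minimum lies in [c, b]; the old d becomes the new c.
            a, c, fc = c, d, fd
            d = a + INV_PHI * (b - a)
            fd = f(d)
    return (a + b) / 2


def toy_validity(theta):
    """Hypothetical stand-in for a clustering validity measure (assumption)."""
    return (theta - 2.5) ** 2 + 1.0


# Search a hypothetical parameter range for the value that minimizes the measure.
best = golden_section_minimize(toy_validity, 0.0, 10.0)
```

Each iteration reuses one of the two previous function evaluations and shrinks the interval by a factor of about 0.618. That matters when a single evaluation is expensive, as it would be if each evaluation required a clustering run over a large sample.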
Source
Journal of Inner Mongolia University (Natural Science Edition) (《内蒙古大学学报(自然科学版)》)
CAS
CSCD
PKU Core (北大核心)
2013, No. 1, pp. 93-96 (4 pages)
Funding
National Natural Science Foundation of China (Grant No. 50901039)
Keywords
ISODATA
large sample data
golden section method
feature extraction