Abstract
To address feature selection in clustering, this paper proposes a data clustering method based on feature semantic weights. The user specifies a required feature set (the Must-Link set); the method computes the semantic relatedness between features and selects the features related to the required set as a supplement. Semantic relatedness is then used to determine the semantic weight of each feature, and on this basis the traditional K-Means clustering algorithm is modified to give the FSW-KMeans algorithm, which incorporates feature semantic weights. Experimental results show that FSW-KMeans considerably improves clustering accuracy and efficiency.
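The abstract does not give the FSW-KMeans formulas, so the following is only a minimal sketch of the general idea of K-Means with per-feature weights. It assumes the weights are already available as non-negative numbers (how they are derived from semantic relatedness is not shown here) and that they enter through a weighted squared Euclidean distance. The names `weighted_sq_dist` and `weighted_kmeans` are hypothetical and do not come from the paper.

```python
import random
from typing import List, Sequence, Tuple


def weighted_sq_dist(x: Sequence[float], c: Sequence[float],
                     w: Sequence[float]) -> float:
    """Squared Euclidean distance with one non-negative weight per feature."""
    return sum(wj * (xj - cj) ** 2 for xj, cj, wj in zip(x, c, w))


def weighted_kmeans(points: Sequence[Sequence[float]],
                    weights: Sequence[float],
                    k: int,
                    iters: int = 100,
                    seed: int = 0) -> Tuple[List[int], List[List[float]]]:
    """Lloyd-style K-Means where features contribute according to `weights`.

    Assumption: `weights` stands in for the feature semantic weights; a
    weight of 0 effectively removes a feature from the distance.
    """
    rng = random.Random(seed)
    # Initialise centers from k distinct input points.
    centers = [list(p) for p in rng.sample(list(points), k)]
    labels = [-1] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center under the weighted distance.
        new_labels = [
            min(range(k),
                key=lambda j: weighted_sq_dist(p, centers[j], weights))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centers are too
        labels = new_labels
        # Update step: each center becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = [sum(col) / len(members)
                              for col in zip(*members)]
    return labels, centers


# Feature 0 carries the cluster structure; feature 1 is large-scale noise
# that a weight of 0 suppresses.
data = [(0.0, 100.0), (0.1, -100.0), (10.0, 100.0), (10.1, -100.0)]
labels, centers = weighted_kmeans(data, weights=[1.0, 0.0], k=2)
```

With equal weights on both features, the noisy second feature would dominate the distance in this toy data; setting its weight to 0 lets the grouping follow the first feature only, which is the effect feature weighting is meant to achieve.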
Source
《计算机工程》
CAS
CSCD
Peking University Core Journal
2011, Issue 4, pp. 64-66 (3 pages)
Computer Engineering
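Funding
National Natural Science Foundation of China (50674086)
Jiangsu Province Social Development Science and Technology Program (BS2006002)
Specialized Research Fund for the Doctoral Program of Higher Education (20060290508)
China University of Mining and Technology University Fund (0D090229)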