Clustering-based Improved K-means Text Feature Selection (cited by: 12)
Abstract: Text feature dimensionality reduction is a core technique in automatic text categorization. K-means is a widely used partition-based clustering method, but it is overly sensitive to the initial cluster centers and to isolated points (outliers). To address this, an improved K-means algorithm is proposed for text feature selection. It improves the quality of text feature clustering by optimizing how the initial cluster centers are chosen and by eliminating isolated points. Subsequent text classification experiments show that the proposed algorithm selects features well and that the resulting text classification is efficient.
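Source: Computer Science (《计算机科学》), CSCD, Peking University Core Journal, 2011, No. 1, pp. 195-197 (3 pages)
Funding: Supported by the National Natural Science Foundation of China (Grant No. 70571087)
Keywords: feature selection; clustering; K-means; text categorization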
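
The abstract names two changes to K-means (better initial centers and removal of isolated points) but does not say how either is done. The following is a minimal sketch of the general idea under assumptions of my own, not the paper's method: isolated points are taken to be terms whose mean distance to all other terms exceeds `factor` times the overall mean, initial centers are chosen by a farthest-first rule, and one representative term per cluster is kept as the selected feature. All function names, the `factor` parameter, and the toy term vectors are hypothetical.

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def drop_outliers(points, factor=2.0):
    """Assumed rule: keep indices of points whose mean distance to the
    other points is at most `factor` times the overall mean distance."""
    n = len(points)
    mean_d = [sum(dist(p, q) for q in points) / (n - 1) for p in points]
    overall = sum(mean_d) / n
    return [i for i in range(n) if mean_d[i] <= factor * overall]

def farthest_first_centers(points, k):
    """Assumed initialization: start at the point nearest the global mean,
    then repeatedly add the point farthest from the centers chosen so far."""
    mean = [sum(col) / len(points) for col in zip(*points)]
    centers = [min(points, key=lambda p: dist(p, mean))]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    return centers

def kmeans(points, k, iters=50):
    """Standard Lloyd iterations starting from the farthest-first centers."""
    centers = farthest_first_centers(points, k)
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: dist(p, centers[j])) for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # leave an empty cluster's center unchanged
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

def select_features(term_vectors, k):
    """Cluster term vectors and keep the term nearest each cluster center."""
    terms = list(term_vectors)
    vecs = [term_vectors[t] for t in terms]
    keep = drop_outliers(vecs)
    terms = [terms[i] for i in keep]
    vecs = [vecs[i] for i in keep]
    labels, centers = kmeans(vecs, k)
    selected = []
    for j in range(k):
        members = [(dist(v, centers[j]), t) for t, v, l in zip(terms, vecs, labels) if l == j]
        if members:
            selected.append(min(members)[1])
    return selected

# Hypothetical toy data: each term is described by its weight in two classes.
term_vectors = {
    "goal": [0.9, 0.1], "match": [0.8, 0.2], "team": [0.85, 0.15],
    "stock": [0.1, 0.9], "market": [0.2, 0.8], "bank": [0.15, 0.85],
    "zzz": [5.0, 5.0],  # far from every other term, so treated as isolated
}
selected = select_features(term_vectors, k=2)
print(sorted(selected))
```

In this toy run the isolated term is discarded before clustering, so it cannot pull a center away from the two real groups, and the farthest-first start places one initial center in each group. A real text pipeline would build the term vectors from document or class statistics, which the abstract does not specify.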
