Unsupervised Text Feature Selection Method Based on Domain Division (cited by: 2)
Abstract: Because class information is unavailable, unsupervised text feature selection has not yet been solved satisfactorily. This paper studies the problem and proposes an unsupervised text feature selection method based on domain division, which brings the idea of partitioning the universe of discourse into unsupervised text feature selection. The method first applies a new unsupervised document frequency measure for preliminary feature selection, filtering out low-frequency noise words, and then applies the proposed domain-division-based attribute reduction to refine the remaining features. Experimental results show that the method can compensate for the lack of prior class knowledge in text clustering and handles unsupervised text feature selection well.
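Source: Science Technology and Engineering (《科学技术与工程》), Peking University Core Journal, 2013, No. 7, pp. 1836-1839 (4 pages)
Funding: National Natural Science Foundation of China (61201447); Henan Province Basic and Frontier Technology Research Program (102300410266, 122300410287); Zhengzhou Science and Technology Program (121PPTGG362-12); Doctoral Research Fund of Zhengzhou University of Light Industry (2010BSJJ038)
Keywords: text clustering; feature selection; document frequency; domain division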
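
The two-stage procedure outlined in the abstract can be illustrated with a minimal sketch. The abstract does not define the paper's document frequency variant or its reduction algorithm, so this sketch makes two assumptions: stage one uses a plain document frequency threshold, and stage two uses a rough-set-style reduction that drops any term whose removal leaves the partition of the document set unchanged. All function names, the `min_df` parameter, and the toy corpus are hypothetical.

```python
from collections import Counter, defaultdict


def document_frequency(docs):
    """Count, for each term, the number of documents that contain it."""
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))
    return df


def df_prefilter(docs, min_df=2):
    """Stage 1 (assumed): keep terms that occur in at least min_df documents."""
    df = document_frequency(docs)
    return sorted(t for t, n in df.items() if n >= min_df)


def partition(docs, terms):
    """Partition document indices by their presence/absence pattern over terms."""
    blocks = defaultdict(list)
    for i, tokens in enumerate(docs):
        present = set(tokens)
        key = tuple(t in present for t in terms)
        blocks[key].append(i)
    return {frozenset(block) for block in blocks.values()}


def reduce_terms(docs, terms):
    """Stage 2 (assumed): greedily drop terms that do not change the partition."""
    target = partition(docs, terms)
    kept = list(terms)
    for t in list(terms):
        trial = [k for k in kept if k != t]
        if partition(docs, trial) == target:
            kept = trial
    return kept


# Toy corpus: each document is a list of tokens (illustrative data only).
docs = [
    ["cluster", "text", "feature"],
    ["cluster", "text", "noise1"],
    ["feature", "selection", "text"],
    ["selection", "feature", "noise2"],
]
candidates = df_prefilter(docs, min_df=2)
selected = reduce_terms(docs, candidates)
print(candidates)
print(selected)
```

The greedy pass depends on the order in which terms are visited, so a different ordering can yield a different reduced set that still distinguishes the same documents.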
References (9)

  • 1 Liu Wei, Wong Wilson. Web service clustering using text mining techniques. International Journal of Agent-Oriented Software Engineering, 2009, 3(1): 6-26.
  • 2 Gheyas I A, Smith L S. Feature subset selection in large dimensionality domains. Pattern Recognition, 2010, 43(1): 5-13.
  • 3 Zhu Hao-Dong, Zhao Xiang-Hui, Zhong Yong. Feature selection method combined optimized document frequency with improved RBF network. Proc of 5th International Conference on ADMA 2009, Beijing, China, 2009: 796-803.
  • 4 Wilbur J W, Sirotkin K. The automatic identification of stop words. Journal of Information Science, 1992, 18(1): 45-55.
  • 5 Dash M, Liu H. Feature selection for clustering. Proc of PAKDD, New York, USA, 2000: 110-121.
  • 6 Tao L, Shengping L. An evaluation on feature selection for text clustering. The ICML03. Washington, 2003, 20(3): 53-58.
  • 7 Zhu Haodong, Li Hongchan, Zhong Yong. Novel unsupervised feature selection method [J]. Journal of University of Electronic Science and Technology of China, 2010, 39(3): 412-415. Cited by: 4
  • 8 Ni Ziwei, Cai Jingqiu. Discrete Mathematics. Beijing: Science Press, 2002.10.
  • 9 Zhu Haodong, Zhong Yong. An attribute reduction algorithm for information systems without decision attributes [J]. Journal of Chinese Computer Systems, 2010, 31(2): 360-362. Cited by: 2

