
Text feature automatic selection algorithm based on cloud model
Cited by: 4
Abstract: Taking into account both the overall and the local distribution of features across categories, a high-performance algorithm for automatic text feature selection (named FAS) is proposed. The algorithm introduces the concept of cloud membership degree to correct the feature distribution. It requires no prior knowledge and, based on the characteristics of the feature distribution, automatically obtains the set of features with high cloud membership degree. Experimental results show that the selected feature set contains fewer features and yields higher classification accuracy, clearly outperforming the main feature selection methods in current use.
Source: Journal of Central South University: Science and Technology (indexed in EI, CAS, CSCD, Peking University Core Journals), 2011, Issue 3, pp. 714-720 (7 pages)
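Funding: Sub-project of the National Major Science and Technology Special Project (2008ZX07315-001); Chongqing Major Science and Technology Special Project (2008AB5038); Fundamental Research Funds for the Central Universities (CDJXS11181160)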
Keywords: text classification; feature selection; cloud model; membership degree; dynamic clustering
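
The abstract refers to a cloud membership degree but gives no formulas, so the following is only a minimal sketch of the standard normal cloud model, not the FAS algorithm itself. It assumes the usual three numerical characteristics (expectation Ex, entropy En, hyper-entropy He); the function names `cloud_membership` and `backward_cloud`, and the sample values, are hypothetical and chosen for illustration.

```python
import math
import random


def cloud_membership(x, ex, en, he=0.0, rng=None):
    """Membership degree of x in a normal cloud (Ex, En, He).

    Standard normal cloud model, used here as an assumption; the paper's
    own correction of feature distributions is not reproduced.
    """
    rng = rng or random
    # Perturbed entropy En' ~ N(En, He^2); with He == 0 it is simply En.
    en_prime = rng.gauss(en, he) if he > 0 else en
    if en_prime == 0:
        return 1.0 if x == ex else 0.0
    return math.exp(-((x - ex) ** 2) / (2 * en_prime ** 2))


def backward_cloud(samples):
    """Estimate (Ex, En, He) from samples alone (no prior knowledge).

    Common backward cloud generator without certainty degrees:
    Ex = mean, En = sqrt(pi/2) * mean absolute deviation,
    He = sqrt(max(S^2 - En^2, 0)), with S^2 the sample variance.
    """
    n = len(samples)
    ex = sum(samples) / n
    en = math.sqrt(math.pi / 2) * sum(abs(s - ex) for s in samples) / n
    var = sum((s - ex) ** 2 for s in samples) / (n - 1)
    he = math.sqrt(max(var - en ** 2, 0.0))
    return ex, en, he


# Hypothetical per-category frequencies of one feature (illustrative data).
freqs = [1.0, 2.0, 3.0, 4.0, 5.0]
ex, en, he = backward_cloud(freqs)
# Deterministic membership (He ignored) of each value in the fitted cloud.
degrees = [cloud_membership(f, ex, en) for f in freqs]
```

Values near Ex receive membership close to 1 and values far from it decay toward 0, which is the sense in which a feature set "with high cloud membership degree" could be selected by thresholding; the threshold and the selection rule would have to come from the paper.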
