
Feature Selection Based on Mutual Information Maximization and Feature Clustering (cited by: 1)
Abstract: This paper proposes a feature selection method that combines mutual information maximization with feature clustering, and applies it to mail recognition. A suboptimal feature subset is first selected from the original feature space by mutual information maximization. Redundant features are then removed by clustering the feature space, which reduces the dimensionality of the feature space a second time. Experimental results show that the method is an effective approach to feature selection.
Authors: 张成彬 (Zhang Chengbin), 唐建 (Tang Jian)
Source: 《现代计算机》 (Modern Computer), 2009, No. 8, pp. 31-33 (3 pages)
Keywords: Mutual Information Maximization; Feature Clustering; Mail Recognition
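
The two-stage procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract does not say which clustering algorithm or similarity measure the paper uses, so the greedy, threshold-based grouping below (with mutual information between features as the similarity) is an assumption, as are the function names, the parameters `k` and `threshold`, and the toy data.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Return I(X; Y) in bits for two equal-length sequences of discrete values."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (count / n) * math.log2(count * n / (px[x] * py[y]))
    return mi


def select_features(docs, labels, k, threshold):
    """Hypothetical two-stage selection.

    Stage 1: keep the k features with the highest mutual information with the label.
    Stage 2: walk the kept features in rank order; a feature whose mutual
             information with an already chosen representative reaches
             `threshold` is treated as belonging to that representative's
             cluster and is dropped as redundant (an assumed stand-in for
             the paper's feature clustering step).
    """
    columns = list(zip(*docs))  # one tuple of values per feature
    scores = [mutual_information(col, labels) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)[:k]

    representatives = []
    for j in ranked:
        if all(mutual_information(columns[j], columns[r]) < threshold
               for r in representatives):
            representatives.append(j)
    return representatives


# Toy data invented for illustration: 8 messages, 4 binary term features,
# label 1 = unwanted mail, label 0 = normal mail.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
f0 = [1, 1, 1, 1, 0, 0, 0, 0]  # perfectly predictive term
f1 = [1, 1, 1, 1, 0, 0, 0, 0]  # exact duplicate of f0, hence redundant
f2 = [1, 1, 1, 0, 0, 0, 0, 1]  # weakly predictive term
f3 = [1, 0, 1, 0, 1, 0, 1, 0]  # term independent of the label
docs = list(zip(f0, f1, f2, f3))

selected = select_features(docs, labels, k=3, threshold=0.5)
print(selected)  # [0, 2]
```

In this toy run, stage 1 discards the uninformative feature 3, and stage 2 discards feature 1 because it duplicates feature 0. Stage 2 only ever shrinks the subset produced by stage 1, which matches the abstract's description of a second dimensionality reduction.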

