
Text Classification Method Combining CTM and SVM (CTM与SVM相结合的文本分类方法)

Cited by: 7
Abstract: This paper studies a text classification method that combines the Correlated Topic Model (CTM) with a Support Vector Machine (SVM). The method models the dataset with CTM to reduce the dimensionality of the data, then classifies the reduced text data with an SVM. So that the CTM can model the dataset well, the data are first clustered with the DBSCAN clustering method, and the number of cluster centers obtained is used as the topic parameter of the CTM. Experimental results show that the method speeds up classification and improves classification accuracy.
Authors: Wang Yanxia (王燕霞), Deng Wei (邓伟)
Source: Computer Engineering (《计算机工程》), 2010, No. 22, pp. 203-205 (3 pages). Indexed in: CAS, CSCD, PKU Core (北大核心).
Keywords: text classification; Correlated Topic Model (CTM); clustering; Support Vector Machine (SVM)
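
The topic-number selection step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the distance measure, the DBSCAN parameters, or how documents are vectorized, so the Euclidean distance, the `eps` and `min_pts` values, the toy two-dimensional document vectors, and the helper name `estimate_num_topics` are all assumptions made here for illustration. Because DBSCAN does not itself produce cluster centers, the sketch counts the clusters found, excluding noise. The CTM and SVM stages are not shown.

```python
from math import dist  # Euclidean distance (assumed metric; Python 3.8+)

NOISE = -1
UNVISITED = None


def region_query(points, i, eps):
    """Indices of all points within eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: 0..k-1 for clusters, -1 for noise."""
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1  # points[i] is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, does not expand
                continue
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also a core point
    return labels


def estimate_num_topics(points, eps, min_pts):
    """Hypothetical helper: number of DBSCAN clusters, ignoring noise."""
    labels = dbscan(points, eps, min_pts)
    return len({label for label in labels if label != NOISE})


# Toy document vectors (assumed): two dense groups plus one outlier.
docs = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
        (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
        (10.0, 0.0)]
num_topics = estimate_num_topics(docs, eps=0.5, min_pts=3)
print(num_topics)
```

In the pipeline the abstract describes, a value obtained this way would be passed to the CTM as its number of topics, and the resulting per-document topic proportions would serve as the low-dimensional features for the SVM.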


