
Research on Automatic Classification Based on Term Context Relations (original title: 基于词语上下文关系的文本自动分类方法研究)
Abstract: A term context vector is used to represent the contextual relations between a term and the other terms in a document collection. From these term context vectors, the method generates a class feature vector for every class in the classifier, as well as a feature vector for the document to be classified; the classifier then assigns the document to a class. Experiments show that incorporating term context relations into the class feature vectors and the document vector helps improve text classification performance.
Author: Guo Shaoyou (郭少友)
Source: New Technology of Library and Information Service (《现代图书情报技术》), indexed in CSSCI and the PKU Core Journals list, 2008, No. 5, pp. 44-49 (6 pages)
Keywords: automatic text classification; context; term context vector
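
The abstract names the pipeline's stages (term context vectors, class feature vectors, a document vector, then classification) but does not give the formulas. The following is only a minimal sketch of that general idea under stated assumptions: co-occurrence counts within a fixed window stand in for the term context vector, class vectors are sums of document vectors, and cosine similarity picks the class. The function names, the `window` parameter, and the toy data are all hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict
import math


def term_context_vectors(docs, window=2):
    """Map each term to co-occurrence counts with terms within +/- window positions.

    Assumption: a simple windowed count; the paper's actual weighting is not
    stated in the abstract.
    """
    ctx = defaultdict(Counter)
    for tokens in docs:
        for i, term in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    ctx[term][tokens[j]] += 1
    return ctx


def document_vector(tokens, ctx):
    """Count each term once per occurrence and add its context vector."""
    vec = Counter()
    for term in tokens:
        vec[term] += 1
        for other, weight in ctx.get(term, {}).items():
            vec[other] += weight
    return vec


def class_vectors(labeled_docs, ctx):
    """Sum the document vectors of each class's training documents."""
    sums = defaultdict(Counter)
    for tokens, label in labeled_docs:
        sums[label].update(document_vector(tokens, ctx))
    return sums


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(tokens, class_vecs, ctx):
    """Return the class whose feature vector is most similar to the document."""
    doc = document_vector(tokens, ctx)
    return max(class_vecs, key=lambda c: cosine(doc, class_vecs[c]))


# Hypothetical toy training set: (tokenized document, class label).
train = [
    (["stock", "market", "price", "rise"], "finance"),
    (["bank", "loan", "interest", "rate"], "finance"),
    (["team", "win", "match", "goal"], "sports"),
    (["player", "score", "goal", "match"], "sports"),
]
ctx = term_context_vectors([tokens for tokens, _ in train])
class_vecs = class_vectors(train, ctx)
predicted = classify(["goal", "team"], class_vecs, ctx)
print(predicted)  # sports
```

In this sketch a document that shares no terms with a class can still match it through context terms, which is the kind of effect the abstract attributes to adding term context relations to both vectors.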

