
Text filtering based on support vector machine of semantic space

Cited by: 3
Abstract: Traditional text filtering based on support vector machines (SVMs) uses the vector space model to represent texts and user profiles. The vector space model assumes that feature terms are independent of one another; this assumption lets variation in word choice introduce lexical noise, which degrades filtering performance. This paper proposes SVM-based text filtering in a semantic space, in which texts and user profiles are represented by their semantics. The method uses singular value decomposition to extract the latent semantic space of the texts and trains an SVM in that space to obtain the user profile and the filtering threshold. Each text in the incoming stream is mapped into the semantic space, where its similarity to the user profile is computed with the cosine measure. Experiments show that the filtering performance of the method reaches 98.67%.
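The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy term counts, the function names, the choice of k = 2, and the small subgradient-descent linear SVM (a stand-in for whatever SVM solver the paper used) are all hypothetical. The abstract does not say how the cosine threshold is derived, so the sketch takes the SVM weight vector as the user profile, uses the SVM decision rule as the filter, and reports the cosine similarity alongside it.

```python
import numpy as np

def fit_semantic_space(X, k):
    """Top-k right singular vectors (terms x k) of a documents x terms matrix."""
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt[:k].T

def to_semantic(X, Vk):
    """Map one or more term vectors into the k-dimensional semantic space."""
    return np.atleast_2d(X) @ Vk

def train_linear_svm(Z, y, lam=0.01, lr=0.1, epochs=500):
    """Soft-margin linear SVM fitted by full-batch subgradient descent
    on the regularized hinge loss (assumed stand-in solver)."""
    w = np.zeros(Z.shape[1])
    b = 0.0
    n = len(y)
    for _ in range(epochs):
        active = y * (Z @ w + b) < 1          # points violating the margin
        grad_w = lam * w - (y[active, None] * Z[active]).sum(axis=0) / n
        grad_b = -y[active].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is zero."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def filter_text(x, Vk, w, b):
    """Return (accepted, cosine similarity to the profile) for a new text."""
    z = to_semantic(x, Vk)[0]
    return bool(z @ w + b > 0), cosine(w, z)

# Hypothetical toy corpus; columns: svm, kernel, margin, goal, match, team
X_train = np.array([
    [2, 1, 1, 0, 0, 0],
    [1, 2, 0, 0, 0, 0],
    [1, 0, 2, 0, 0, 0],
    [0, 0, 0, 3, 1, 1],
    [0, 0, 0, 1, 2, 0],
    [0, 0, 0, 1, 1, 2],
], dtype=float)
y_train = np.array([1, 1, 1, -1, -1, -1], dtype=float)  # +1 = relevant to the user

Vk = fit_semantic_space(X_train, k=2)       # latent semantic space via SVD
Z_train = to_semantic(X_train, Vk)          # training texts in that space
profile, bias = train_linear_svm(Z_train, y_train)

new_text = np.array([1, 1, 0, 0, 0, 0], dtype=float)
accepted, similarity = filter_text(new_text, Vk, profile, bias)
print(accepted, round(similarity, 3))
```

New texts are folded into the space by multiplying with the same singular vectors used for training, so the sign ambiguity of the SVD does not affect the decision. Other fold-in scalings (for example, dividing by the singular values) are common in latent semantic analysis and would be an equally reasonable choice here.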
Source: Journal of Computer Applications (《计算机应用》; indexed in CSCD and the Peking University Core Journals list), 2005, No. 3, pp. 664-665 (2 pages)
Funding: Key Funded Project of the Fujian Provincial Science and Technology Program (001J005)
Keywords: text filtering; singular value decomposition; support vector machine; semantic space

