
Semantic Incremental Improvement on Vector Space Model for Text Modeling

Original title: 向量空间模型文本建模的语义增量化改进研究
Cited by: 6
Abstract
[Objective] This paper uses semantic increment to improve text classification based on the vector space model (VSM) and verifies the improvement experimentally.
[Methods] After reviewing existing work on introducing and improving semantic vectors in text representation, the paper proposes an implementation framework for the semantic vector representation of texts. Based on the mappings from subject terms and from words to concepts in a domain ontology, it builds a concept hierarchy tree and locates words in it, computes the semantic similarity of concepts, and combines this with semantic increment to construct the semantic vector of a text.
[Results] Comparative text classification experiments show that the proposed method is feasible and effective, and that it outperforms the other methods in macro-averaged precision, macro-averaged recall, and macro-averaged F1.
[Limitations] Because the method is still an improvement built on VSM, it does not fully express semantic information, and genuinely semantic approaches to text modeling remain to be explored. Experiments on multiple types of data are also needed to improve the method's applicability.
[Conclusions] Exploring the semantic enhancement of the original VSM has practical significance for current research on text classification and semantic association.
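Authors: Hu Jiming (胡吉明), Xiao Lu (肖璐)
Source: New Technology of Library and Information Service (《现代图书情报技术》), CSSCI, Peking University Core Journal, 2014, No. 10, pp. 49-55 (7 pages)
Funding: One of the research outputs of the National Natural Science Foundation of China Youth Project "Research on Information Recommendation Based on User-Resource Association in the Social Network Environment" (No. 71303178) and the Wuhan University Humanities and Social Sciences Research Project "Research on User Modeling Based on Relational Community Discovery in the Social Network Environment" (No. 274013)
Keywords: Text modeling; Semantic Vector Space Model; Semantic increment; Semantic similarity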
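
The abstract reports results in macro-averaged precision, recall, and F1. As a reminder of what those metrics measure, here is a minimal Python sketch. It is not the authors' evaluation code: the function name `macro_scores` and the toy labels are hypothetical, and it assumes macro F1 is the unweighted mean of the per-class F1 scores (some authors instead take the harmonic mean of macro precision and macro recall).

```python
from collections import Counter


def macro_scores(y_true, y_pred):
    """Return (macro precision, macro recall, macro F1) over all classes seen.

    Each class contributes equally, regardless of how many documents it has.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    precisions, recalls, f1s = [], [], []
    for c in labels:
        p = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        r = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


if __name__ == "__main__":
    # Hypothetical toy labels, for illustration only.
    truth = ["sports", "sports", "finance", "finance"]
    preds = ["sports", "finance", "finance", "finance"]
    print(macro_scores(truth, preds))
```

Because every class is weighted equally, macro averaging penalizes a classifier that does well only on the largest categories, which is why it is a common choice for multi-class text classification comparisons.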
