Short Text Classification Based on Structure-Semantic Graph
Abstract: With the rapid development of Internet information technology, new media applications such as Twitter, Weibo, and question-answering systems continuously generate massive amounts of text data. Most of this data consists of short texts, which are characterized by sparse features, varied vocabulary, colloquial language, and strong dependence on context. Most text classification methods in common use today are based on the vector space model, which assumes that words are independent of one another and therefore cannot exploit the internal structure of a text. To address these shortcomings of existing short text classification algorithms, this paper proposes a classification method based on a structure-semantic graph. The method maps the word-order structure of a text onto a graph structure and, based on Probase, accounts for the influence of words with different parts of speech on the semantics of nouns, combining an external concept knowledge base to improve short text classification performance.
Author: 胡代艳 (HU Dai-yan), College of Computer Science, Sichuan University, Chengdu 610065
Source: Modern Computer (《现代计算机》), 2019, No. 5, pp. 18-21, 26 (5 pages)
Keywords: short text; Probase; semantics; graph structure
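
The abstract's central contrast, that a vector space model discards word order while a graph can retain it, can be illustrated with a minimal sketch. This is not the paper's algorithm: the function names, the sliding-window edge rule, and the toy sentences are all assumptions made for illustration, and the Probase-based semantic step is not modeled here.

```python
from collections import Counter


def bag_of_words(tokens):
    """Vector space view: term counts only, word order is discarded."""
    return Counter(tokens)


def word_order_graph(tokens, window=1):
    """Directed graph stored as {(source, target): weight}.

    Assumed edge rule (illustrative only): each word points to the words
    that follow it within `window` positions.
    """
    edges = Counter()
    for i, source in enumerate(tokens):
        for target in tokens[i + 1 : i + 1 + window]:
            edges[(source, target)] += 1
    return edges


# Two toy short texts with the same words in a different order.
a = "dog bites man".split()
b = "man bites dog".split()

# The bag-of-words vectors are identical, the word-order graphs are not.
print(bag_of_words(a) == bag_of_words(b))
print(word_order_graph(a) == word_order_graph(b))
```

Under these assumptions the two texts are indistinguishable to a bag-of-words representation, while their edge sets differ, which is the kind of structural information the abstract says the graph mapping is meant to preserve.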