
面向社区问答的中文短文本分类算法研究 (cited by: 3)

Research on Chinese Short Text Classification Algorithm for Community-based Q&A
Abstract: Question texts in community-based Q&A systems are short, so they contain few feature words and carry weak descriptive information. To address this, the paper uses Wikipedia for feature extension to assist the classification of short Chinese question texts. First, a set of related concepts is extracted for each word using Wikipedia concepts, links, and similar information, and the relatedness between concepts is computed by combining the link structure with the category hierarchy. Second, features are extended on the basis of the related-concept sets in order to supplement the semantic information of the text features. Experimental results show that the proposed feature-extension-based short text classification algorithm effectively improves the classification of short question texts.
Authors: 赵辉 (Zhao Hui), 刘怀亮 (Liu Huailiang)
Source: Journal of Modern Information (《现代情报》), CSSCI, 2013, No. 10, pp. 70-74 (5 pages)
Keywords: community-based Q&A; Wikipedia; feature extension; short text classification
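
The abstract names two steps, concept relatedness from link structure plus category information, and feature extension from related-concept sets, but gives no formulas. The following is only a minimal sketch of that general idea, not the authors' method: the toy `LINKS` and `CATEGORIES` tables, the Jaccard overlap measure, the blending weight `alpha`, and the `top_k` and `threshold` parameters are all assumptions made for illustration.

```python
from typing import Dict, List, Set

# Hypothetical toy data standing in for Wikipedia pages; not taken from the paper.
LINKS: Dict[str, Set[str]] = {
    "python": {"programming", "guido", "interpreter"},
    "java": {"programming", "jvm", "interpreter"},
    "snake": {"reptile", "venom"},
}
CATEGORIES: Dict[str, Set[str]] = {
    "python": {"languages", "software"},
    "java": {"languages", "software"},
    "snake": {"animals"},
}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def relatedness(c1: str, c2: str, alpha: float = 0.5) -> float:
    """Blend link overlap and category overlap (alpha is an assumed weight)."""
    link_sim = jaccard(LINKS.get(c1, set()), LINKS.get(c2, set()))
    cat_sim = jaccard(CATEGORIES.get(c1, set()), CATEGORIES.get(c2, set()))
    return alpha * link_sim + (1 - alpha) * cat_sim


def expand_features(
    tokens: List[str], top_k: int = 1, threshold: float = 0.3
) -> Dict[str, float]:
    """Add the top_k most related concepts for each known token.

    Original tokens keep weight 1.0; added concepts are weighted by
    their relatedness score (an assumed weighting scheme).
    """
    features = {t: 1.0 for t in tokens}
    for t in tokens:
        if t not in LINKS:
            continue  # token has no concept page in the toy data
        scored = sorted(
            ((relatedness(t, c), c) for c in LINKS if c != t), reverse=True
        )
        for score, concept in scored[:top_k]:
            if score >= threshold:
                features[concept] = max(features.get(concept, 0.0), score)
    return features


print(expand_features(["python", "install"]))
```

In this toy setting the question token `python` gains the related concept `java`, while `install` has no concept entry and is left unexpanded. The expanded feature dictionary could then be passed to any standard text classifier.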
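Related literature: references 12; secondary references 70; co-citing documents 135; co-cited documents 62; citing documents 3; secondary citing documents 22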
