Cross-language Knowledge Linking Based on Bilingual Topic Model and Bilingual Embedding (cited by: 6)
Abstract Cross-language knowledge linking (CLKL) refers to establishing links between online encyclopedia articles in different languages that describe the same content. CLKL can be divided into two parts: candidate selection and candidate ranking. First, this paper formulates candidate selection as a cross-language information retrieval problem and proposes a method that generates queries by combining titles with keywords, which raises the recall of candidate selection to 93.8%. For candidate ranking, it proposes a ranking model that combines a bilingual topic model with bilingual word embeddings, and uses it to link military-domain articles between English Wikipedia and Chinese Baidu Baike. Experimental results show that the model achieves an accuracy of 75%, significantly improving CLKL performance. Because the proposed method does not depend on language-specific or domain-specific characteristics, it can be readily extended to CLKL in other languages and other domains.
Authors YU Yuan-yuan; CHAO Wen-han; HE Yue-ying; LI Zhou-jun (School of Computer Science and Engineering, Beihang University, Beijing 100191, China; National Computer Network Emergency Response Technical Team/Coordination Center, Beijing 100029, China)
Source Computer Science (《计算机科学》), indexed in CSCD and the Peking University Core Journals list, 2019, No. 1, pp. 238-244 (7 pages)
Keywords Cross-language knowledge linking; Cross-language information retrieval; Bilingual topic model; Bilingual embedding
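
The two-stage pipeline described in the abstract (query generation from title plus keywords, then ranking candidates with topic-model and embedding signals) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the function names, the query format, the `alpha` weight, and the use of a simple linear interpolation of cosine similarities are hypothetical and are not taken from the paper, which trains its own ranking model. Query translation for the cross-language retrieval step is omitted.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def build_query(title: str, keywords: List[str], max_keywords: int = 5) -> str:
    """Join the article title with its leading keywords (hypothetical format)."""
    return " ".join([title] + keywords[:max_keywords])


def rank_candidates(
    source: Dict[str, Sequence[float]],
    candidates: Dict[str, Dict[str, Sequence[float]]],
    alpha: float = 0.5,
) -> List[Tuple[str, float]]:
    """Score each candidate by an assumed linear mix of two similarities.

    "topics" holds a document's distribution over shared bilingual topics;
    "embedding" holds a document vector in a shared bilingual embedding space.
    This interpolation is a stand-in, not the paper's trained ranking model.
    """
    scored = []
    for name, feats in candidates.items():
        topic_sim = cosine(source["topics"], feats["topics"])
        embed_sim = cosine(source["embedding"], feats["embedding"])
        scored.append((name, alpha * topic_sim + (1.0 - alpha) * embed_sim))
    # Highest combined score first.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

In this sketch, `build_query` would feed a retrieval system to obtain the candidate set, and `rank_candidates` would then order those candidates; the top-ranked candidate is taken as the proposed link.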