Unsupervised bilingual lexicon induction based on word substitution
Abstract Bilingual lexicon induction is an important task in natural language processing. This paper retrains word embeddings with a substitution method so that the embeddings acquire cross-lingual properties. It mainly studies how to obtain the training dictionary and how to co-train the word embeddings, and it reports experiments on Chinese and English Wikipedia corpora. The experimental results show that, when the training dictionary is selected by confidence, the word embeddings produced by the substitution method have good cross-lingual properties and the lexicon finally extracted has high accuracy.
Authors GUO Jinpeng (郭晋鹏); CAO Hailong (曹海龙) (School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China)
Source Intelligent Computer and Applications (《智能计算机与应用》), 2021, No. 3, pp. 217-219 (3 pages)
Keywords bilingual lexicon induction; unsupervised learning; substitution method
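
The abstract names two steps without giving their details: selecting a training dictionary by confidence, and substituting words in the corpus before retraining the embeddings. The following is a minimal Python sketch of how such a pipeline might look, not the authors' implementation. The function names `select_by_confidence` and `substitute`, the threshold, the substitution probability, and the toy data are all hypothetical, and how the confidence scores are computed is left out because the abstract does not say.

```python
import random


def select_by_confidence(candidates, threshold):
    """Keep candidate translation pairs whose confidence meets the threshold.

    `candidates` maps a source word to (translation, confidence).
    The scoring method is an assumption; it is not given in the abstract.
    """
    return {
        src: tgt
        for src, (tgt, conf) in candidates.items()
        if conf >= threshold
    }


def substitute(sentences, dictionary, prob, rng):
    """Replace each dictionary word with its translation with probability `prob`.

    The result is a mixed-language corpus in which translation pairs share
    contexts, so embeddings trained on it can land in one common space.
    """
    mixed = []
    for sent in sentences:
        mixed.append([
            dictionary[w] if w in dictionary and rng.random() < prob else w
            for w in sent
        ])
    return mixed


# Hypothetical candidate pairs with made-up confidence scores.
candidates = {
    "cat": ("猫", 0.92),
    "dog": ("狗", 0.88),
    "bank": ("河岸", 0.35),  # low confidence, so it is filtered out
}
seed_dict = select_by_confidence(candidates, threshold=0.5)

corpus = [["the", "cat", "sat"], ["a", "dog", "ran", "to", "the", "bank"]]
mixed_corpus = substitute(corpus, seed_dict, prob=1.0, rng=random.Random(0))
print(mixed_corpus)
```

In a full pipeline the mixed corpus would then be passed to a word embedding trainer, and the lexicon would be extracted by nearest-neighbor search between source and target words. Those steps are omitted here because the abstract gives no specifics about them.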