Unsupervised Bilingual Lexicon Induction Based on Word Substitution (基于替换方法的无监督双语词典抽取)


Abstract: Bilingual lexicon induction is an important task in natural language processing. This paper retrains word embeddings with a substitution-based method so that the embeddings acquire cross-lingual properties. It studies how to obtain the training dictionary and how to co-train the word embeddings, with experiments on Chinese and English Wikipedia corpora. The results show that when the training dictionary is selected by confidence, the substitution-based embeddings have good cross-lingual properties and the lexicon finally induced is highly accurate.
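The abstract names two steps, confidence-based selection of the training dictionary and word substitution, without giving their details. The following is a minimal sketch of how such steps could look, not the authors' implementation. The function names `select_by_confidence` and `substitute`, the `threshold` and `rate` parameters, and the toy confidence scores are all hypothetical; how the paper actually scores candidate pairs is not stated here.

```python
import random


def select_by_confidence(candidates, threshold):
    """Keep candidate pairs whose confidence is at least `threshold`.

    `candidates` maps a source word to (translation, confidence). The score
    itself is assumed to be given; the abstract does not say how it is computed.
    """
    return {src: tgt for src, (tgt, conf) in candidates.items() if conf >= threshold}


def substitute(tokens, dictionary, rate, rng):
    """Replace each token that has a dictionary entry with probability `rate`.

    The resulting mixed-language sentences could then be fed to any word
    embedding trainer so that both languages share one vector space.
    """
    return [
        dictionary[tok] if tok in dictionary and rng.random() < rate else tok
        for tok in tokens
    ]


# Toy data (hypothetical): the third pair is a deliberately wrong, low-confidence guess.
candidates = {"猫": ("cat", 0.92), "狗": ("dog", 0.88), "跑": ("bank", 0.15)}
seed_dict = select_by_confidence(candidates, threshold=0.5)

sentence = ["猫", "和", "狗", "在", "跑"]
mixed = substitute(sentence, seed_dict, rate=1.0, rng=random.Random(0))
print(mixed)  # ['cat', '和', 'dog', '在', '跑']
```

With `rate=1.0` every token that has a dictionary entry is replaced, which makes the example deterministic. A lower rate would keep some source words in place so that both a word and its translation appear in similar contexts.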

Authors: GUO Jinpeng (郭晋鹏); CAO Hailong (曹海龙) (School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China)

Affiliation: School of Computer Science and Technology, Harbin Institute of Technology

Source: Intelligent Computer and Applications (《智能计算机与应用》), 2021, No. 3, pp. 217-219 (3 pages)

Keywords: bilingual lexicon induction; unsupervised learning; substitution method

Classification number: TP391.1 [Automation and Computer Technology: Computer Application Technology]

