An Algorithm of Extraction of General Words for Specific Purposes Based on word2vec and Its Application
(Original Chinese title: 基于word2vec模型的专业通用词提取算法及应用举例)
Abstract: General words for specific purposes (GWSP) are general-vocabulary words used within a particular specialized field, and they are often difficult to translate accurately. At present, GWSP are extracted mainly by hand, which places high demands on the analyst's language proficiency and familiarity with the corpus, and which is also inefficient. Building on word2vec, a neural network machine learning model released by Google, this paper proposes an algorithm for extracting GWSP automatically and implements it as scripts written in Python 2.7. The application of the algorithm is illustrated using the International Financial Reporting Standards corpus.
Authors: TIAN Yan (田艳); WANG Tian-qi (王天奇) (School of Foreign Languages, Shanghai Jiao Tong University, Shanghai 200240, China)
Source: Journal of Cangzhou Normal University (《沧州师范学院学报》), 2018, No. 3, pp. 68-72 (5 pages)
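Funding: National Social Science Fund of China project "Construction and Evaluation of an Online System for Dynamic Translation Learning" (No. 16BYY081); Ministry of Education Humanities and Social Sciences Research General Project "A Corpus-Based Study of the Chinese Translations of Marx's Capital" (No. 15YJA740009)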
Keywords: word2vec; extraction of general words for specific purposes; corpus translation