
A Method for Implementing Word2vec to Construct Word Embedding Model (Cited by: 6)
Abstract: Word2vec is a natural language processing method based on a simple neural network; it is a word embedding technique that can be used to construct high-dimensional word vectors. This study builds and analyzes a model of the Word2vec word-vector representation: trained on the NLPCC2014 corpus, words are mapped into a high-dimensional vector space, and the Word2vec functionality is implemented together with visualized output. The experiment further compares the CBOW and Skip-gram models, the two key models within Word2vec. The results show that when Chinese word vectors are trained on a large corpus, the Skip-gram model has a clear advantage in recognizing new words and, weighing both accuracy and time performance, is the more reliable choice overall.
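Note: the abstract does not state which toolkit the authors used; the sketch below is only a minimal illustration of the workflow it describes, using the gensim library, with a hypothetical pre-tokenized corpus file standing in for the NLPCC2014 data and a hypothetical query word. It trains a CBOW model (sg=0) and a Skip-gram model (sg=1), compares their nearest neighbours, and projects the Skip-gram vectors to 2-D with t-SNE for visualization.

# A minimal sketch (not the paper's code) of the workflow described in the abstract.
# Assumptions: gensim>=4.0, scikit-learn and matplotlib are installed, and
# "corpus_tokenized.txt" is a hypothetical file with one pre-segmented
# Chinese sentence per line.
from gensim.models import Word2Vec
from gensim.models.word2vec import LineSentence
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

corpus = LineSentence("corpus_tokenized.txt")  # hypothetical corpus path

# sg=0 selects CBOW, sg=1 selects Skip-gram; other hyperparameters are illustrative.
cbow = Word2Vec(corpus, vector_size=200, window=5, min_count=5, sg=0, workers=4)
skipgram = Word2Vec(corpus, vector_size=200, window=5, min_count=5, sg=1, workers=4)

# Compare the nearest neighbours of a query word under both models.
query = "教育"  # hypothetical query word
for name, model in [("CBOW", cbow), ("Skip-gram", skipgram)]:
    if query in model.wv:
        print(name, model.wv.most_similar(query, topn=5))

# Project the 200 most frequent Skip-gram vectors to 2-D for a scatter plot.
words = skipgram.wv.index_to_key[:200]
coords = TSNE(n_components=2, random_state=0).fit_transform(skipgram.wv[words])
plt.scatter(coords[:, 0], coords[:, 1], s=5)
for (x, y), w in zip(coords, words):
    plt.annotate(w, (x, y), fontsize=6)
plt.savefig("word2vec_tsne.png", dpi=200)

The sg flag is the only switch separating the two architectures in gensim, which makes this kind of CBOW/Skip-gram comparison straightforward to reproduce on any segmented Chinese corpus.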
Authors: XI Ning-li, ZHU Li-jia, WANG Lu-tong, CHEN Jun, WANG Xiao-rong (School of Education, Guizhou Normal University, Guiyang 550025, China; School of Foreign Language, Guizhou Normal University, Guiyang 550025, China; School of Teacher Education, Guangxi Modern Vocational and Technical College, Hechi 547099, China)
Source: Computer and Information Technology, 2023, Issue 1, pp. 43-46 (4 pages)
Funding: Humanities and Social Sciences Research Project for Universities of the Guizhou Provincial Department of Education (Project No. 2020GH015).
Keywords: word embedding; Word2vec; CBOW; Skip-gram; NLP