
Word Embedding: Continuous Space Representation for Natural Language (Cited by: 10)
Abstract: Word embedding refers to a machine learning technique that maps each word in a high-dimensional discrete space (whose dimension equals the vocabulary size) to a real-valued vector in a low-dimensional continuous space. In many text processing tasks, word embeddings provide better semantic-level word representations and thus greatly benefit those tasks. Meanwhile, the huge amount of unlabeled text available in the big-data era, together with advances in machine learning techniques such as deep learning, has made it feasible to obtain high-quality word embeddings efficiently. This paper gives the definition and practical value of word embedding and reviews several typical methods for obtaining it, including neural network based methods, restricted Boltzmann machine based methods, and methods based on factorizing the word-context co-occurrence matrix. For each model, its mathematical definition, physical meaning, and training procedure are introduced in detail, and the methods are compared in these three aspects.
Source: Journal of Data Acquisition and Processing (《数据采集与处理》; CSCD; Peking University Core Journal), 2014, No. 1, pp. 19-29 (11 pages).
Keywords: machine learning; natural language; word embedding; text processing
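As a concrete illustration of the third method family named in the abstract, below is a minimal sketch (in Python with NumPy; it is not the paper's own implementation) of deriving word embeddings by truncated SVD of a word-context co-occurrence matrix. The toy corpus, window size, and embedding dimension are all illustrative assumptions.

import numpy as np

# Illustrative toy corpus (assumed, not from the paper).
corpus = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
]

# Build the vocabulary: each word is one axis of the high-dimensional
# discrete space the abstract describes.
vocab = sorted({w for sent in corpus for w in sent})
index = {w: i for i, w in enumerate(vocab)}

# Count co-occurrences within a symmetric context window (assumed size 2).
window = 2
C = np.zeros((len(vocab), len(vocab)))
for sent in corpus:
    for i, w in enumerate(sent):
        for j in range(max(0, i - window), min(len(sent), i + window + 1)):
            if j != i:
                C[index[w], index[sent[j]]] += 1.0

# Truncated SVD maps each word to a low-dimensional continuous vector:
# the rows of U_k * S_k serve as the embeddings.
k = 3  # embedding dimension, chosen arbitrarily for the toy corpus
U, S, _ = np.linalg.svd(C, full_matrices=False)
embeddings = U[:, :k] * S[:k]

def similarity(w1, w2):
    # Cosine similarity between two word vectors.
    a, b = embeddings[index[w1]], embeddings[index[w2]]
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(similarity("cat", "dog"))  # words in similar contexts score high

Words that appear in similar contexts ("cat" and "dog" here) end up with nearby vectors, which is the semantic-level representation the abstract attributes to word embeddings.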