
Perceptron for Language Modeling (感知器在语言模型训练中的应用)
Abstract: The perceptron is a type of neural network (NN) that can acquire pattern-recognition ability through supervised learning. In this paper, two perceptron training rules for language modeling (LM) are introduced as an alternative to traditional training methods such as maximum likelihood estimation (MLE). Variants of the perceptron learning algorithm and several feature-weight computation methods are implemented, and the impact of different training parameters on performance is discussed. Since there is a strict restriction on language model size, feature selection is conducted before modeling using an algorithm based on the empirical risk minimization (ERM) principle. Model performance is evaluated on the task of Japanese kana-kanji conversion, which converts phonetic strings into the appropriate word strings. An empirical study of the variant perceptron learning algorithms is conducted under the two training rules, and the results show that perceptron-trained models substantially outperform the traditional N-gram models, reducing the relative error rate by 15%-20%.
Published in: Journal of Computer Research and Development (《计算机研究与发展》), indexed in EI, CSCD, and the Peking University Core Journal list, 2006, No. 2, pp. 260-267.
Funding: Zhejiang Province Key Science and Technology Research Program (2003C11009); Shanghai Municipal Science and Technology Commission Development Fund (025111051).
Keywords: perceptron; language model; empirical risk minimization
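The abstract does not spell out the paper's two training rules. As a hedged illustration of the general approach, the sketch below shows only the standard structured-perceptron update for a linear model over sparse features, of the kind used to rerank candidate conversions; all function names and the toy features are illustrative assumptions, not taken from the paper.

```python
from collections import defaultdict

def score(weights, features):
    """Dot product of the weight vector and a sparse feature-count dict."""
    return sum(weights[f] * v for f, v in features.items())

def perceptron_train(data, epochs=5):
    """Structured perceptron: for each training instance, pick the
    highest-scoring candidate; if it differs from the reference, move
    weights toward the reference's features and away from the
    prediction's features."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for candidates, gold in data:
            # Decode: choose the candidate the current model prefers.
            pred = max(candidates, key=lambda c: score(weights, c))
            if pred != gold:
                # Additive update toward the gold candidate.
                for f, v in gold.items():
                    weights[f] += v
                for f, v in pred.items():
                    weights[f] -= v
    return weights
```

In a kana-kanji setting, each `candidates` list would hold the feature vectors of alternative kanji strings for one phonetic input, with `gold` the reference conversion; averaging the weights over updates is a common stabilizing variant.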
