
Perceptron for Language Modeling (感知器在语言模型训练中的应用)
Abstract: The perceptron is a type of neural network (NN) that can acquire pattern-recognition ability through supervised learning. In this paper, two perceptron training rules for language modeling (LM) are introduced as an alternative to traditional training methods such as maximum likelihood estimation (MLE). Variants of the perceptron learning algorithm and several feature-weight computation methods are implemented, and the impact of different training parameters on performance is discussed. Since there is a strict restriction on language model size, feature selection is conducted before modeling using an algorithm based on the empirical risk minimization (ERM) principle. Model performance is evaluated on the task of Japanese kana-kanji conversion, which converts phonetic strings into the appropriate word strings. An empirical study of the variant perceptron learning algorithms is conducted under the two training rules, and the results show that perceptron-trained models substantially outperform the traditional N-gram models, reducing the relative error rate by 15%-20%.
Published in: Journal of Computer Research and Development (《计算机研究与发展》), indexed in EI, CSCD, and the Peking University Core Journal list, 2006, No. 2, pp. 260-267.
Funding: Zhejiang Province Key Science and Technology Research Program (2003C11009); Shanghai Municipal Science and Technology Commission Development Fund (025111051).
Keywords: perceptron; language model; empirical risk minimization
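The abstract does not spell out the paper's two training rules. As a hedged illustration of the general approach, the sketch below shows only the standard structured-perceptron update for a linear model over sparse features, of the kind used to rerank candidate conversions; all function names and the toy features are illustrative assumptions, not taken from the paper.

```python
from collections import defaultdict

def score(weights, features):
    """Dot product of the weight vector and a sparse feature-count dict."""
    return sum(weights[f] * v for f, v in features.items())

def perceptron_train(data, epochs=5):
    """Structured perceptron: for each training instance, pick the
    highest-scoring candidate; if it differs from the reference, move
    weights toward the reference's features and away from the
    prediction's features."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for candidates, gold in data:
            # Decode: choose the candidate the current model prefers.
            pred = max(candidates, key=lambda c: score(weights, c))
            if pred != gold:
                # Additive update toward the gold candidate.
                for f, v in gold.items():
                    weights[f] += v
                for f, v in pred.items():
                    weights[f] -= v
    return weights
```

In a kana-kanji setting, each `candidates` list would hold the feature vectors of alternative kanji strings for one phonetic input, with `gold` the reference conversion; averaging the weights over updates is a common stabilizing variant.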
