Research of pixel based feature extraction in lip-reading

Cited by: 3


Abstract: This paper addresses pixel-based feature extraction for visual-only lip-reading and proposes a three-stage cascaded visual front end. The first stage applies an image transform, the second stage reduces the dimensionality of the transformed result, and the third stage normalizes all feature vectors to a uniform scale. Based on a comparison and analysis of several transforms, PCA is applied to reduce the dimensionality of DCT and Gabor wavelet transform outputs, giving the DCT-PCA and Gabor-PCA methods. Compared with the traditional approach of manually selecting transform coefficients, these methods improve recognition accuracy by about 10%.
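The three-stage cascade described in the abstract (transform, dimensionality reduction, normalization) can be sketched roughly as follows for the DCT-PCA case. This is a minimal illustration, not the authors' implementation: the function names, the mouth-region image size, the number of retained components, and the choice of per-dimension zero-mean, unit-variance normalization are all assumptions, since the abstract does not specify them.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n)[:, None]   # frequency index (rows)
    i = np.arange(n)[None, :]   # sample index (columns)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)  # DC row uses a different scale factor
    return m

def dct2(images):
    """Stage 1: 2-D DCT of every image in an (N, H, W) stack."""
    _, h, w = images.shape
    return dct_matrix(h) @ images @ dct_matrix(w).T

def fit_pca(x, n_components):
    """Return the data mean and the top principal directions (as rows)."""
    mean = x.mean(axis=0)
    _, _, vt = np.linalg.svd(x - mean, full_matrices=False)
    return mean, vt[:n_components]

def dct_pca_features(images, n_components):
    """Hypothetical DCT-PCA front end: transform, reduce, normalize."""
    coeffs = dct2(images).reshape(len(images), -1)      # stage 1
    mean, basis = fit_pca(coeffs, n_components)
    reduced = (coeffs - mean) @ basis.T                 # stage 2
    std = reduced.std(axis=0)
    std[std == 0] = 1.0                                 # avoid division by zero
    # Stage 3 (assumed scheme): zero mean, unit variance per dimension.
    return (reduced - reduced.mean(axis=0)) / std

# Example with random stand-ins for mouth-region images (assumed 8 x 8).
rng = np.random.default_rng(0)
frames = rng.random((20, 8, 8))
features = dct_pca_features(frames, n_components=5)
```

Here the PCA basis is fitted on the same frames it projects; in a recognition experiment the basis and normalization statistics would normally be estimated on training data only and then reused for test data. A Gabor-PCA variant would replace `dct2` with a bank of Gabor filter responses and keep the remaining two stages unchanged.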

Authors: Wan Yuqi, Yao Hongxun, Hong Xiaopeng

Affiliation: Department of Computer Science, Harbin Institute of Technology

Source: Computer Engineering and Applications (《计算机工程与应用》), indexed in CSCD and the Peking University Core Journals list, 2007, No. 20, pp. 197-199, 221 (4 pages)

Funding: Program for New Century Excellent Talents in University (No. NCET-05-0334); Natural Science Foundation of Heilongjiang Province of China (Grant No. E2005-09)

Keywords: lip-reading; feature extraction; PCA; DCT; Gabor

Classification: TP39 [Automation and Computer Technology: Computer Application Technology]


