
Review of Extracting Methods for Lip Visual Features (唇语识别的视觉特征提取方法综述) Cited by: 2
Abstract: Current research on lip reading focuses mostly on improving recognition accuracy and on studying multimodal input features; little attention has been paid to improving the effectiveness of lip visual features. Yet lip visual information plays a key role in visual speech recognition and lip reading, and it is especially important when the audio is corrupted or unavailable. Obtaining accurate and effective lip visual features is therefore one of the current difficulties in lip reading. This paper reviews recent research on lip reading from three aspects: lip-reading datasets, traditional visual feature extraction methods, and deep learning methods for visual feature extraction. First, it surveys lip-reading datasets, divides them into frontal-view and multi-view types, and summarizes the characteristics, limitations, and download addresses of both types. Second, it introduces traditional methods of lip visual feature extraction from the perspectives of pixel-based, shape-based, and hybrid features, focusing on the basic idea, network structure, and characteristics of each method. Third, it introduces deep learning methods for lip visual feature extraction, focusing on the network structures, advantages, and disadvantages of four kinds of methods (2D CNN, 3D CNN, 2D CNN combined with 3D CNN, and other neural networks), and compares their performance on public datasets. Finally, it discusses the challenges facing lip visual feature extraction methods and outlines future research trends.
Authors: MA Jinlin (马金林); GONG Yuanwen (巩元文); MA Ziping (马自萍); CHEN Deguang (陈德光); ZHU Yanbin (朱艳彬); LIU Yuhao (刘宇灏). Affiliations: School of Computer Science and Engineering, North Minzu University, Yinchuan 750021, China; Key Laboratory for Intelligent Processing of Computer Images and Graphics of National Ethnic Affairs Commission of the PRC, Yinchuan 750021, China; School of Mathematics and Information Science, North Minzu University, Yinchuan 750021, China
Source: Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》), indexed in CSCD and the Peking University Core Journals list, 2021, No. 12, pp. 2256-2275 (20 pages)
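Funding: Fundamental Research Funds for the Central Universities, North Minzu University (2021KJCX09, ZDZX201801); Natural Science Foundation of Ningxia (2020AAC3215); North Minzu University "Computer Vision and Virtual Reality" Innovation Team Project; National Natural Science Foundation of China (61462002); North Minzu University Graduate Innovation Project (YCX21081).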
Keywords: lip recognition; visual feature; deep learning
