Keyword Spotting Based on Deep Neural Network Bottleneck Features (采用深层神经网络中间层特征的关键词识别)

Cited by: 2

Abstract: For template-matching keyword spotting, this paper proposes using bottleneck (BN) features extracted from a deep neural network (DNN) in place of traditional acoustic features to generate posteriorgrams. First, a DNN with a narrow bottleneck layer is trained following the conventional speech recognition procedure, and all acoustic features are passed through this network to obtain robust BN features. A Gaussian mixture model (GMM) then converts the BN features into Gaussian posteriorgrams. At detection time, the posteriorgrams serve as the feature input to a simplified segmental dynamic time warping (DTW) algorithm that performs keyword matching. In language-independent keyword spotting experiments on the TIMIT corpus, the BN-feature system outperforms a system using conventional perceptual linear prediction (PLP) features, with an absolute improvement in recognition accuracy of 30%.
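The last two stages of the pipeline in the abstract (GMM posteriorgrams, then DTW matching) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the record does not describe the paper's "simplified" segmental DTW, so a band-constrained DTW over sliding windows stands in for it. The negative-log inner-product frame distance is a common choice for posteriorgrams. The GMM parameters are invented, and toy 1-D values replace real BN features (DNN training is not shown). The function names `gaussian_posteriorgram`, `banded_dtw`, and `spot_keyword` are hypothetical.

```python
import math


def gaussian_posteriorgram(frames, weights, means, variances):
    """Map each feature frame to posterior probabilities over GMM components.

    Assumes a diagonal-covariance GMM whose parameters are already trained.
    """
    posteriors = []
    for x in frames:
        log_scores = []
        for w, mu, var in zip(weights, means, variances):
            # Log density of one diagonal Gaussian component at x.
            log_pdf = sum(
                -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
                for xi, m, v in zip(x, mu, var)
            )
            log_scores.append(math.log(w) + log_pdf)
        # Normalise in the log domain for numerical stability.
        peak = max(log_scores)
        exps = [math.exp(s - peak) for s in log_scores]
        total = sum(exps)
        posteriors.append([e / total for e in exps])
    return posteriors


def frame_distance(p, q, floor=1e-10):
    """Negative log inner product between two posterior vectors."""
    return -math.log(max(sum(a * b for a, b in zip(p, q)), floor))


def banded_dtw(query, segment, band):
    """DTW cost of the cheapest path with |i - j| <= band, divided by its length."""
    n, m = len(query), len(segment)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    steps = [[0] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - band), min(m, i + band) + 1):
            # Cheapest predecessor among diagonal, vertical, and horizontal moves.
            prev_cost, prev_steps = min(
                (cost[i - 1][j - 1], steps[i - 1][j - 1]),
                (cost[i - 1][j], steps[i - 1][j]),
                (cost[i][j - 1], steps[i][j - 1]),
            )
            if prev_cost == inf:
                continue
            cost[i][j] = prev_cost + frame_distance(query[i - 1], segment[j - 1])
            steps[i][j] = prev_steps + 1
    if cost[n][m] == inf:
        return inf  # the band does not allow any complete alignment
    return cost[n][m] / steps[n][m]


def spot_keyword(query, utterance, band=2, hop=1):
    """Slide a query-length window over the utterance; return (best_cost, start)."""
    n = len(query)
    best = (float("inf"), -1)
    for start in range(0, len(utterance) - n + 1, hop):
        c = banded_dtw(query, utterance[start:start + n], band)
        best = min(best, (c, start))
    return best


# Toy demo: 1-D values stand in for BN features; the GMM parameters are made up.
weights, means, variances = [0.5, 0.5], [[0.0], [4.0]], [[1.0], [1.0]]
query = gaussian_posteriorgram([[0.0], [4.0], [0.0]], weights, means, variances)
utterance = gaussian_posteriorgram(
    [[4.1], [3.9], [0.1], [4.0], [-0.1], [4.2], [4.0]], weights, means, variances
)
best_cost, best_start = spot_keyword(query, utterance)
print(best_start, round(best_cost, 4))
```

In the toy demo the query follows a low-high-low pattern, and the lowest-cost window starts at frame index 2 of the utterance, where the same pattern occurs. A real system would replace the toy values with BN features from the trained DNN and fit the GMM on those features rather than hard-coding its parameters.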

Authors: Liu Xue (刘学), Wang Niansong (王年松), Guo Wu (郭武)

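Affiliations: Physical Evidence Identification Center, Anhui Provincial Public Security Department (安徽省公安厅物证鉴定中心); Department of Electronic Engineering and Information Science, University of Science and Technology of China (中国科学技术大学电子工程与信息科学系); National Engineering Laboratory for Speech and Language Information Processing (语音及语言信息处理国家工程实验室)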

Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), indexed in CSCD and the Peking University Core Journals list, 2015, No. 7, pp. 1540-1544 (5 pages)

Funding: Supported by the Natural Science Foundation of Anhui Province (1408085MNL78)

Keywords: keyword spotting; segmental dynamic time warping; deep neural networks; bottleneck

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]


