Abstract
This paper introduces a new continuous audio recognition algorithm based on Mel-frequency cepstral coefficient (MFCC) extraction and optimized with dynamic time warping (DTW). First, the mathematical principles and algorithm steps are designed. A large-scale audio database is then preprocessed, and the corresponding features are extracted through time-domain and frequency-domain analysis. Next, the double-threshold method segments the continuous audio into separate audio blocks, and each block is recognized by matching it against the templates in the time-frequency domain database. The approach achieves good continuous audio recognition, with accuracy reaching 89% in both time-domain and frequency-domain recognition. The results can be applied to the development of piano teaching systems and have broad application prospects, particularly in helping learners play a score correctly.
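The segmentation and matching steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the energy-only double-threshold rule, and the threshold values are assumptions made for illustration, and the feature vectors passed to the DTW step are assumed to be MFCC frames computed elsewhere.

```python
import math


def short_time_energy(signal, frame_len, hop):
    """Return the energy of each frame of `signal` (a list of floats)."""
    energies = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        energies.append(sum(x * x for x in frame))
    return energies


def double_threshold_segments(energies, high, low):
    """Return (start, end) frame ranges, with `end` exclusive.

    A segment is seeded where the energy exceeds `high`, then extended
    backward and forward while the energy stays above `low`.
    Assumes low <= high; both thresholds are illustrative parameters.
    """
    segments = []
    i, n = 0, len(energies)
    while i < n:
        if energies[i] > high:
            start = i
            while start > 0 and energies[start - 1] > low:
                start -= 1
            end = i
            while end < n and energies[end] > low:
                end += 1
            segments.append((start, end))
            i = end
        else:
            i += 1
    return segments


def dtw_distance(a, b):
    """Classic DTW distance between two sequences of feature vectors."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance
            cost[i][j] = d + min(cost[i - 1][j],      # step in a only
                                 cost[i][j - 1],      # step in b only
                                 cost[i - 1][j - 1])  # step in both
    return cost[n][m]


def best_template(features, templates):
    """Return the label of the template with the smallest DTW distance."""
    return min(templates, key=lambda label: dtw_distance(features, templates[label]))
```

In this sketch each segment found by `double_threshold_segments` would be converted to a feature sequence and passed to `best_template`, where `templates` is a hypothetical dictionary mapping note labels to reference feature sequences. A practical endpoint detector often adds a zero-crossing-rate threshold alongside energy; that is omitted here for brevity.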
Authors
王鸿瑞
张玉辰
陈鹭
高博韬
高昕悦
Wang Hongrui; Zhang Yuchen; Chen Lu; Gao Botao; Gao Xinyue (Xi'an Jiaotong University, Xi'an 710049, China)
Source
《中国现代教育装备》
2024, No. 17, pp. 41-45, 52 (6 pages)
China Modern Educational Equipment
Keywords
speech recognition
endpoint detection
Mel-frequency cepstral coefficients
dynamic time warping algorithm
time-frequency domain analysis