395 articles found
Analysis of Smooth Cepstral Peak Prominence in Hypokinetic Dysarthria Associated With Parkinson’s Disease
1
Authors: Qiang LI, Abigail WALLACE, Wesley DAVIS, Beau ROTH, Laura LANGHOFER, Shalini NARAYANA, Michael CANNITO. Source: 《Chinese Journal of Applied Linguistics》, 2024, No. 4, pp. 657-669, 688 (14 pages).
Smoothed cepstral peak prominence (CPPs) measures the distance from the prominent cepstral peak to the linear regression line directly beneath it. Variations in CPPs data acquisition and analysis complicate the choice of clinical cut-off values, and there are no agreed values for a specific voice disorder such as hypokinetic dysarthria associated with Parkinson's disease (PD). This study examined CPPs in people with hypokinetic dysarthria associated with PD compared with healthy participants. Results demonstrated significant differences in the sustained vowel and connected speech tasks, with CPPs of connected speech more sensitive to dysphonia and to gender differences among PD participants. Male PD participants showed higher CPPs for sustained vowels and lower CPPs for connected speech than female PD participants. The findings imply that a consistent clinical application protocol is necessary and that multiple acoustic measures are needed to ensure the accuracy of clinical decisions.
Keywords: cepstral peak prominence; hypokinetic dysarthria; voice; Parkinson's disease; motor speech disorders
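The definition in the abstract above (peak height above a regression line) can be illustrated with a minimal sketch. It is not the study's protocol: the naive DFT, the natural-log spectrum, the search range `q_min`/`q_max`, and the synthetic impulse-train frame are all assumptions made for illustration, and the temporal and quefrency smoothing that gives CPPs its "s" is omitted.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (adequate for short frames)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
           for k in range(n)]
    return [v / n for v in out] if inverse else out

def real_cepstrum(frame, eps=1e-10):
    """Inverse DFT of the log magnitude spectrum (eps avoids log(0))."""
    log_mag = [math.log(abs(v) + eps) for v in dft(frame)]
    return [v.real for v in dft(log_mag, inverse=True)]

def cepstral_peak_prominence(frame, q_min, q_max):
    """Return (peak quefrency in samples, peak height above the regression line)."""
    ceps = real_cepstrum(frame)
    qs = list(range(q_min, q_max + 1))
    ys = [ceps[q] for q in qs]
    # Least-squares line y = slope * q + intercept over the search range.
    q_mean = sum(qs) / len(qs)
    y_mean = sum(ys) / len(ys)
    slope = (sum((q - q_mean) * (y - y_mean) for q, y in zip(qs, ys))
             / sum((q - q_mean) ** 2 for q in qs))
    intercept = y_mean - slope * q_mean
    peak_q = max(qs, key=lambda q: ceps[q])
    return peak_q, ceps[peak_q] - (slope * peak_q + intercept)

# Hypothetical "voiced" frame: an impulse train with a period of 16 samples.
frame = [1.0 if n % 16 == 0 else 0.0 for n in range(256)]
peak_q, cpp = cepstral_peak_prominence(frame, q_min=8, q_max=24)
```

Here `peak_q` lands on the 16-sample period of the synthetic frame. Clinical tools usually report CPP in dB and fit the line over a wider quefrency range, so values from this sketch are not comparable with published cut-offs.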
Modified Cepstral Feature for Speech Anti-spoofing
2
Authors: 何明瑞, ZAIDI Syed Faham Ali, 田娩鑫, 单志勇, 江政儒, 徐珑婷. Source: 《Journal of Donghua University(English Edition)》 (indexed in: CAS), 2023, No. 2, pp. 193-201 (9 pages).
The hidden danger to automatic speaker verification (ASV) systems is the variety of spoofed speech. These threats fall into two categories, logical access (LA) and physical access (PA). To improve the identification capability of spoofed speech detection, this paper focuses on features. First, following the idea of modifying constant-Q-based features, this work considered adding the variance or mean to the constant-Q-based cepstral domain to obtain good performance. Second, linear frequency cepstral coefficients (LFCCs) performed comparably with constant-Q-based features. Finally, we proposed linear frequency variance-based cepstral coefficients (LVCCs) and linear frequency mean-based cepstral coefficients (LMCCs) for identifying spoofed speech. LVCCs and LMCCs are obtained by adding the frame variance or the frame mean to the log magnitude spectrum based on LFCC features. The proposed features were evaluated on the ASVspoof 2019 dataset. The experimental results show that, compared with known hand-crafted features, LVCCs and LMCCs are more effective in resisting spoofed speech attacks.
Keywords: spoofed speech detection; log magnitude spectrum; linear frequency cepstral coefficient (LFCC); hand-crafted feature
Multisource Target Classification Based on Underwater Channel Cepstral Features
3
Authors: LI Xiukun, JIA Hongjian, DONG Jianwei, QIN Jixing. Source: 《Journal of Ocean University of China》 (indexed in: SCIE, CAS, CSCD), 2022, No. 4, pp. 917-925 (9 pages).
Passive target detection through shipping-radiated noise is a key technology in current underwater operations and is of great research value in civil and military fields. In this study, the stable spectral line component of shipping-radiated noise is used as the research object, and the classification of multisource targets is studied from the perspective of underwater channels. We use the channel impulse response function as the basis for classifying different targets. First, the underwater channel is estimated by the cepstrum. Then, the channel cepstral features carried by different spectral line components are extracted in turn. Finally, the spectral line components belonging to the same target are clustered by cepstral feature distance to classify the different targets. Simulation and experimental results verify the effectiveness of the proposed method.
Keywords: shipping-radiated noise; underwater channel; cepstral features; target classification
Speech Intelligibility Enhancement Algorithm Based on Multi-Resolution Power-Normalized Cepstral Coefficients(MRPNCC)for Digital Hearing Aids
4
Authors: Xia Wang, Xing Deng, Hongming Shen, Guodong Zhang, Shibing Zhang. Source: 《Computer Modeling in Engineering & Sciences》 (indexed in: SCIE, EI), 2021, No. 2, pp. 693-710 (18 pages).
Speech intelligibility enhancement in noisy environments is still one of the major challenges for the hearing impaired in everyday life. Recently, machine-learning-based approaches to speech enhancement have shown great promise for improving speech intelligibility. Two key issues for these approaches are the acoustic features extracted from noisy signals and the classifiers used for supervised learning. This paper focuses on features. Multi-resolution power-normalized cepstral coefficients (MRPNCC) are proposed as a new feature to enhance speech intelligibility for the hearing impaired. The new feature is constructed by combining four cepstra at different time-frequency (T-F) resolutions in order to capture both local and contextual information. MRPNCC vectors and binary masking labels, calculated from signals passed through a gammatone filterbank, are used to train a support vector machine (SVM) classifier, which aims to identify the binary masking values of the T-F units in the enhancement stage. The enhanced speech is synthesized from the estimated masking values and Wiener-filtered T-F units. Objective experimental results demonstrate that the proposed feature is superior to the other features compared in terms of HIT-FA, STOI, HASPI and PESQ, and that the proposed algorithm not only improves speech intelligibility but also slightly improves speech quality. Subjective tests validate the effectiveness of the proposed algorithm for the hearing impaired.
Keywords: speech intelligibility enhancement; multi-resolution power-normalized cepstral coefficients; binary masking value; hearing impaired
Seismic Edge Detection by Application of Cepstral Decomposition to Data Driven Modeled Geologic Channel Feature in Niger Delta
5
Authors: Orji, O.M.; Ugwu, S.A.; Ofuyah, W.N. Source: 《Journal of Geological Research》, 2020, No. 2, pp. 1-10 (10 pages).
A seismic edge detection algorithm unmasks blurred discontinuities in an image, and its efficiency depends on the precision of the processing scheme adopted. Data-driven modeling is a fast machine learning scheme, a formal automatic version of the long-established empirical approach, and can be used in many different contexts. Here, an algorithm that can identify masked connections and correlations in a set of observations is built and used. Geologic models of hydrocarbon reservoirs facilitate enhanced visualization, volumetric calculation, well planning and prediction of fluid migration paths. To obtain new insights and test the mappability of a geologic feature, spectral decomposition techniques such as the Discrete Fourier Transform (DFT) and cepstral decomposition techniques such as the Complex Cepstral Transform (CCT) can be employed. Cepstral decomposition is a new approach that extends the widely used process of spectral decomposition, which is rigorous when analyzing very subtle stratigraphic plays and fractured reservoirs. This paper presents the results of applying the DFT and CCT to a two-dimensional, 50 Hz, low-impedance channel sand model representing a typical geologic environment around a prospective hydrocarbon zone largely trapped in various types of channel structures. The DFT represents the frequency and phase spectra of a signal, assumes stationarity and an analytic signal, and highlights the average properties of its dominant portion, whereas the CCT represents the quefrency and saphe cepstra of a signal in the quefrency domain. The transform filters the field data recorded in the time domain and recovers lost sub-seismic geologic information in the quefrency domain by separating source and transmission path effects. Our algorithm is based on fast Fourier transform (FFT) techniques, and the code was written in Matlab. It was developed from first principles, outside the oil industry's interpretation platforms, using standard processing routines. The results of the algorithm, when implemented on both commercial and general platforms, were comparable. The cepstral properties of the channel model indicate that cepstral attributes can be used as a powerful tool in exploration problems to enhance visualization of small-scale anomalies and obtain reliable estimates of wavelet and stratigraphic parameters. The practical relevance of this investigation is illustrated by sample results of spectral and cepstral attribute plots and pseudo-sections of phase and saphe constructed from the model data. The cepstral attributes reveal more detail in terms of quefrency, as required for clearer imaging and better interpretation of subtle edges and discontinuities, sand-shale interbedding and differences in lithology. These benefit production because they serve as a basis for interpreting similar geologic situations in field data.
Keywords: Complex Cepstral Transform; Fourier transform; Gamnitude; Quefrency; Saphe
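Speech Recognition Method Based on the Fused Feature ADRMFCC (Cited by: 1)
6
Authors: 朵琳, 马建, 韦贵香, 唐剑. Source: 《吉林大学学报(理学版)》 (indexed in: CAS, PKU Core), 2024, No. 4, pp. 943-950 (8 pages).
To address the low accuracy and poor robustness of speech recognition in complex noise environments, a speech recognition method based on increase-decrease residual Mel cepstral fusion features is proposed. The method first uses the increase-decrease component method to select key speech features, then maps them into a Mel-domain and residual-domain coordinate system to generate increase-decrease residual Mel cepstral coefficients, and finally uses these fused features to train an end-to-end model. Experimental results show that the method significantly improves recognition accuracy and performance across different noise types and signal-to-noise ratios (SNR): accuracy reaches 73.13% at a low SNR of -5 dB, and the average accuracy under the other noise conditions reaches 88.67%, demonstrating the method's effectiveness and robustness.
Keywords: speech recognition; residual Mel cepstral coefficients; feature selection; increase-decrease component method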
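Feature Extraction Method for Underwater Acoustic Signals in a Test Environment
7
Authors: 王红滨, 王永乐, 何鸣, 薛垚. Source: 《哈尔滨工程大学学报》 (indexed in: EI, CAS, CSCD, PKU Core), 2024, No. 3, pp. 489-495 (7 pages).
Inversion of underwater test-environment parameters is an important topic in underwater acoustics, and the key to current research is obtaining parameter information by extracting features from underwater acoustic signals. To address the difficulty of feature extraction and of model fitting, this paper proposes a feature extraction method for underwater acoustic signals in a test environment. The signal is processed with both Mel-frequency cepstral coefficients (MFCC) and linear prediction coefficients, and the two are combined by a weighted feature combination method into a new feature matrix; a mapping-interpolation algorithm then converts the feature matrix into a three-channel matrix suited to neural network input. The network model chosen is a residual neural network. Tests on the Duihekou Reservoir dataset recorded by the laboratory show that the proposed method generally outperforms feature processing that uses only MFCCs or only linear prediction coefficients. For 5-class depth classification of the environment, accuracy improves by an average of 2% with single-frequency rectangular pulse signals and by an average of 2.03% with linear frequency-modulated signals. On the depth classification task, the proposed method gives better results for linear frequency-modulated signals than for single-frequency rectangular pulse signals.
Keywords: environmental inversion; feature extraction; Mel-frequency cepstral coefficients; linear prediction coefficients; weighted feature combination method; residual neural network; neural network; underwater acoustic signal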
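Research on Centrifugal Pump Fault Diagnosis Based on Acoustic Signals
8
Authors: 陈剑, 姜涛, 陈品. Source: 《电子测量与仪器学报》 (indexed in: CSCD, PKU Core), 2024, No. 5, pp. 169-177 (9 pages).
When, for various reasons, acoustic signals are the preferred measurement for condition monitoring of equipment at industrial sites, a condition monitoring method based on acoustic signals becomes particularly necessary. Taking a certain type of centrifugal pump as the study object, Mel-frequency cepstral coefficients (MFCC) are extracted from acoustic signals collected on site as initial features; the dispersion entropy (DE) of these initial MFCC features is then computed, and principal component analysis (PCA) reduces the dimensionality of the matrix to construct the feature matrix. The bat algorithm (BA) optimizes the penalty coefficient and kernel function parameter of a support vector machine (SVM), which is used to diagnose multiple fault conditions of the centrifugal pump and is compared with several other diagnostic methods. Experimental results show that BA optimization raises the model's diagnostic accuracy by 21.7%; on top of this model, using DE to mine the MFCC-extracted signal features further raises diagnostic accuracy by 2.05%.
Keywords: centrifugal pump fault diagnosis; acoustic signal; Mel cepstral dispersion entropy; bat optimization algorithm; support vector machine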
Comprehensive Analysis of Gender Classification Accuracy across Varied Geographic Regions through the Application of Deep Learning Algorithms to Speech Signals
9
Authors: Abhishek Singhal, Devendra Kumar Sharma. Source: 《Computer Systems Science & Engineering》, 2024, No. 3, pp. 609-625 (17 pages).
This article presents an exhaustive comparative investigation into the accuracy of gender identification across diverse geographical regions, employing a deep learning classification algorithm for speech signal analysis. In this study, speech samples are categorized for both training and testing based on their geographical origin. Category 1 comprises speech samples from speakers outside India, whereas Category 2 comprises live-recorded speech samples from Indian speakers. Testing speech samples are likewise classified into four distinct sets, taking into consideration both geographical origin and the language spoken. The results indicate a noticeable difference in gender identification accuracy among speakers from different geographical areas. When the system is trained on speech samples from Indian speakers, Indian speakers, who use 52 Hindi and 26 English phonemes in their speech, reach a notably higher gender identification accuracy of 85.75% than speakers who predominantly use 26 English phonemes in their conversations. The gender identification accuracy of the proposed model reaches 83.20% when the system is trained on speech samples from speakers outside India. In the analysis of speech signals, Mel Frequency Cepstral Coefficients (MFCCs) serve as the features for the speech data. The deep learning classification algorithm used in this research is based on a Bidirectional Long Short-Term Memory (BiLSTM) architecture within a Recurrent Neural Network (RNN) model.
Keywords: deep learning; recurrent neural network; voice signal; Mel frequency cepstral coefficients; geographical area; gender
Research on blind source separation of operation sounds of metro power transformer through an Adaptive Threshold REPET algorithm
10
Authors: Liang Chen, Liyi Xiong, Fang Zhao, Yanfei Ju, An Jin. Source: 《Railway Sciences》, 2024, No. 5, pp. 609-621 (13 pages).
Purpose: The safe operation of the metro power transformer directly relates to the safety and efficiency of the entire metro system. Through voiceprint technology, the sounds emitted by the transformer can be monitored in real time, allowing real-time monitoring of the transformer's operational status. However, the environment surrounding power transformers is filled with various interfering sounds that intertwine with both the normal and the faulty operational voiceprints of the transformer, severely reducing the accuracy and reliability of voiceprint identification. Effective preprocessing is therefore required to identify and separate the sound signals of transformer operation, which is a prerequisite for subsequent analysis. Design/methodology/approach: This paper proposes an Adaptive Threshold Repeating Pattern Extraction Technique (REPET) algorithm to separate and denoise transformer operation sound signals. By analyzing the Short-Time Fourier Transform (STFT) amplitude spectrum, the algorithm identifies and uses the repeating periodic structures within the signal to adjust the threshold automatically, distinguishing and extracting stable background signals from transient foreground events. The REPET algorithm first calculates the autocorrelation matrix of the signal to determine the repeating period, then constructs a repeating segment model. By comparison with the amplitude spectrum of the original signal, repeating patterns are extracted and a soft time-frequency mask is generated. Findings: After adaptive thresholding, the target signal is separated. Experiments that separate background sounds from foreground sounds in mixed recordings, with results compared against the FastICA algorithm, demonstrate that the Adaptive Threshold REPET method achieves good separation. Originality/value: A REPET method with an adaptive threshold is proposed. It adopts a dynamic threshold adjustment mechanism, adaptively calculates the threshold for blind source separation, and improves the adaptability and robustness of the algorithm to the statistical characteristics of the signal. It also lays the foundation for transformer fault detection based on acoustic fingerprinting.
Keywords: transformer; voiceprint recognition; blind source separation; Mel frequency cepstral coefficients (MFCC); adaptive threshold
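The repeating segment model and soft time-frequency mask named in the abstract above can be sketched as follows. This follows the standard REPET recipe (element-wise median across period-length segments, then a ratio mask) and is only an illustration: the repeating period is passed in by hand instead of being estimated from autocorrelation, the paper's adaptive threshold is not reproduced, and the function name `repet_soft_mask` and the toy spectrogram are hypothetical.

```python
from statistics import median

def repet_soft_mask(spec, period):
    """Soft mask for the repeating (background) part of a magnitude spectrogram.

    spec: list of rows, one row per frequency bin, one column per time frame.
    period: repeating period in frames (assumed known here).
    """
    masks = []
    for row in spec:
        n_segments = len(row) // period
        # Repeating segment model: element-wise median across the segments.
        model = [median(row[s * period + i] for s in range(n_segments))
                 for i in range(period)]
        mask_row = []
        for t, value in enumerate(row[:n_segments * period]):
            # The repeating part can never exceed the observed magnitude.
            repeating = min(model[t % period], value)
            mask_row.append(repeating / value if value > 0 else 0.0)
        masks.append(mask_row)
    return masks

# Toy single-bin spectrogram: background pattern 1, 2, 3 repeats three times,
# with a transient foreground event raising frame 4 from 2 to 7.
spec = [[1.0, 2.0, 3.0, 1.0, 7.0, 3.0, 1.0, 2.0, 3.0]]
mask = repet_soft_mask(spec, period=3)
```

In this toy case the mask is 1.0 wherever only the background is present and drops to 2/7 at the frame containing the transient, so multiplying the mixture by the mask keeps the repeating part and attenuates the foreground event.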
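Classroom Speech Emotion Recognition Method Based on a Multi-Scale Temporal-Aware Network
11
Authors: 周菊香, 刘金生, 甘健侯, 吴迪, 李子杰. Source: 《计算机应用》 (indexed in: CSCD, PKU Core), 2024, No. 5, pp. 1636-1643 (8 pages).
Speech emotion recognition has been widely applied in multi-scenario intelligent systems in recent years, and it makes intelligent analysis of teaching behaviour in smart classrooms possible. Classroom speech emotion recognition can automatically identify the emotional states of teachers and students during teaching, helping teachers understand their own teaching style and keep track of students' learning state in class, and thereby teach more precisely. For the classroom speech emotion recognition task, first, recorded classroom teaching videos from primary and secondary schools were collected, and the audio was extracted, manually segmented and annotated to build a primary and secondary school teaching speech emotion corpus with six emotion classes. Second, dual temporal convolution channels were designed based on the temporal convolutional network (TCN) and a cross-gated mechanism to extract multi-scale cross-fused features. Finally, a dynamic weight fusion strategy adjusts the contributions of features at different scales, reducing interference from unimportant features and further strengthening the model's representation and learning ability. Experimental results show that the proposed method outperforms advanced models such as TIM-Net (Temporal-aware bI-direction Multi-scale Network), GM-TCNet (Gated Multi-scale Temporal Convolutional Network) and CTL-MTNet (CapsNet and Transfer Learning-based Mixed Task Net) on several public datasets, and that on the real classroom speech emotion recognition task it achieves an unweighted average recall (UAR) of 90.58% and a weighted average recall (WAR) of 90.45%.
Keywords: speech emotion recognition; classroom speech; temporal convolutional network; cross-gated convolution; Mel-frequency cepstral coefficients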
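Deep Learning Classification Method for Environmental Sounds Combining MGCC Features and Multi-Scale Channel Attention
12
Authors: 杨俊杰, 丁家辉, 杨柳, 冯丽, 杨超. Source: 《应用声学》 (indexed in: CSCD, PKU Core), 2024, No. 3, pp. 513-524 (12 pages).
Environmental sound classification plays a key role in home security monitoring, human-machine voice interaction and related fields. However, the diversity and mixing of sound sources pose major challenges for designing classification methods. To improve classification accuracy and save computing resources, this paper proposes a deep learning classification model based on a multi-scale channel attention mechanism. The model consists of four parts: a feature extraction module, a multi-scale convolution module, an efficient channel attention module and an output layer. First, weighted Mel-Gammatone frequency cepstral coefficients (MGCC) are introduced to capture the magnitude and phase structure of the environmental sound spectrum. Second, multi-scale convolution kernels are combined with an efficient channel attention mechanism to select key local details and channel features of the audio. Finally, the fully connected layer applies a softmax function to map the features and output the probability of each environmental sound type. The model was tested on the iFLYTEK dataset (6 environmental sound classes) and the Urbansound8k dataset (10 environmental sound classes), reaching classification accuracies of 94%, 76.52% and 79.24% (iFLYTEK+Urbansound8k), respectively. Ablation experiments further show that the multi-scale convolution module and the channel attention module contribute improvements in classification accuracy of close to 3.77% and 1.89%, respectively. The experiments also compare seven existing deep learning classification methods in detail: the proposed algorithm ranks second in classification accuracy, and compared with algorithms of the same class such as ResNet18 and GoogLeNet it further reduces the number of model parameters and the computational complexity.
Keywords: environmental sound classification; Mel-Gammatone frequency cepstrum; multi-scale kernel convolution; efficient channel attention; convolutional neural network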
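Idler Bearing Fault Diagnosis Based on an MFCC-IMFCC Mixed Cepstrum
13
Authors: 陶瀚宇, 陈换过, 彭程程, 高祥冲, 杨磊. Source: 《机电工程》 (indexed in: CAS, PKU Core), 2024, No. 7, pp. 1215-1222 (8 pages).
To address the insufficient ability of Mel-frequency cepstral coefficients (MFCC) to extract high-frequency features of idler bearings, an idler bearing fault diagnosis method is proposed based on a mixed cepstrum of MFCC and inverted Mel-frequency cepstral coefficients (MFCC-IMFCC) together with a long short-term memory (LSTM) network. First, idler sound signals in three states were analysed, establishing that idler bearing fault information lies mainly in the mid-to-high frequency region. Then, to retain high-frequency information effectively, MFCC and IMFCC were extracted and concatenated frame by frame to form the mixed cepstral feature. Finally, the mixed cepstral feature was fed into a two-layer LSTM model for training to build the idler bearing fault diagnosis model. The results show that, for the three states of normal idler, rolling element fault and eccentric rotation fault, LSTM with the mixed cepstral feature reaches an average recognition accuracy of 96.72%, which is 3.94% and 7.41% higher than with MFCC alone and IMFCC alone, respectively, highlighting the clear advantage of the mixed cepstral feature in representing idler bearing fault information.
Keywords: idler bearing; bearing fault sound signal; high-frequency information; Mel cepstral coefficients; inverted Mel cepstral coefficients; mixed cepstral coefficients; long short-term memory network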
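Why an inverted Mel scale helps with high-frequency content, and what frame-level concatenation looks like, can be sketched briefly. The sketch assumes the common construction in which the inverted filter bank is the Mel filter bank mirrored about the analysis band; the abstract does not give the paper's exact filter design, and the function names, the 16 kHz sample rate and the 10-filter bank are hypothetical.

```python
import math

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_centres(n_filters, sample_rate):
    """Centre frequencies (Hz) of filters spaced uniformly on the Mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    return [mel_to_hz(top * i / (n_filters + 1)) for i in range(1, n_filters + 1)]

def inverted_mel_centres(n_filters, sample_rate):
    """Mirror the Mel centres about the band so the dense end sits at high frequency."""
    nyquist = sample_rate / 2.0
    return [nyquist - f for f in reversed(mel_centres(n_filters, sample_rate))]

def concat_frames(mfcc_frames, imfcc_frames):
    """Frame-level concatenation of two feature sequences of equal length."""
    return [a + b for a, b in zip(mfcc_frames, imfcc_frames)]

mel = mel_centres(10, 16000)
inv = inverted_mel_centres(10, 16000)
# Placeholder per-frame coefficient vectors, only to show the concatenation.
mixed = concat_frames([[0.1, 0.2]], [[0.3, 0.4]])
```

The Mel centres are packed closely at low frequencies and spread out near the Nyquist frequency, while the mirrored centres do the opposite, which is why the two feature sets complement each other when fault information sits in the mid-to-high band.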
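Hypertension Risk Stratification Prediction Model Based on Frequency-Domain Mel-Frequency Cepstral Coefficient Features of Pulse Waves
14
Authors: 齐晨浩, 杨晶东, 邱泽浩, 尧明慧, 燕海霞. Source: 《海军军医大学学报》 (indexed in: CAS, CSCD, PKU Core), 2024, No. 10, pp. 1226-1240 (15 pages).
Objective: To address the low accuracy and poor generalization of artificial-intelligence-based time-domain pulse wave classification models for hypertension, a frequency-domain pulse wave prediction model with fused attention mechanisms is proposed. Methods: Time-domain pulse waves were first converted into frequency-domain Mel-frequency cepstral coefficient features to increase their discriminability. A temporal convolutional network and a Transformer structure were used to extract deep pulse wave features, and self-attention and selective kernel attention were fused at the decision level to extract associated pulse wave features. Flooding regularization was used to control the training loss indirectly and prevent overfitting. Five-fold cross-validation was performed on 527 clinical pulse diagnosis cases provided by Longhua Hospital, affiliated with Shanghai University of Traditional Chinese Medicine, and Shanghai Hospital of Integrated Traditional Chinese and Western Medicine. In addition, a gradient boosting decision tree algorithm was used to rank the contributions of the frequency-domain pulse wave features and analyse the key factors affecting classification accuracy, providing a reference for clinical auxiliary diagnosis in traditional Chinese medicine. Results: The proposed model achieved accuracy, F1 score, precision, recall and AUC of 0.9396, 0.9249, 0.9409, 0.9295 and 0.9934, respectively. The contributions of the static features and the first-order and second-order difference coefficients were relatively balanced, indicating that hypertension risk level is related not only to the static features of the pulse wave and that its dynamic features should also be considered. Conclusion: Compared with typical pulse wave classification models, the proposed model has higher classification accuracy and better generalization.
Keywords: hypertension; risk stratification; Mel-frequency cepstral coefficients; temporal convolutional network; Transformer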
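Application of Improved Variant Logic and Linear Prediction to Heart Sound Classification
15
Authors: 王彦麟, 孙静, 杨宏波, 郭涛, 潘家华, 王威廉. Source: 《云南大学学报(自然科学版)》 (indexed in: CAS, CSCD, PKU Core), 2024, No. 3, pp. 432-442 (11 pages).
Heart sounds play an important role in evaluating cardiac health. This article presents a new congenital heart disease (CHD) classification algorithm based on fused features of variant logic and linear prediction cepstral coefficients, which helps extract deep pathological features from heart sounds. The algorithm first denoises the heart sounds and extracts their envelopes; it then applies variant logic operations, labels the results and converts them into analysable measure data, and computes the linear prediction cepstral coefficients of the signal for feature fusion; finally, random forest, XGBoost and LightGBM machine learning classifiers perform binary CHD classification. The study used 4000 heart sound samples, and in testing the average accuracy in classifying normal versus abnormal heart sounds was 0.9138. The algorithm does not require segmenting heart sounds into cardiac cycles, which greatly simplifies the analysis workflow, and it is expected to be useful for CHD screening.
Keywords: heart sound; congenital heart disease; 3-bit coded variant logic; linear prediction cepstral coefficients; feature fusion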
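Research on a DBN-Based Method for Assessing Hydraulic Pump Degradation
16
Authors: 李振宝, 伊明, 李富强, 张磊, 姜万录. Source: 《机床与液压》 (indexed in: PKU Core), 2024, No. 14, pp. 219-226 (8 pages).
To address the difficulty of effectively assessing central spring failure in axial piston pumps, a hydraulic pump degradation assessment method based on Mel-frequency cepstral coefficients (MFCC) and a deep belief network (DBN) is proposed. Vibration signals collected on site from a hydraulic pump in the normal state and with three different degrees of central spring failure are preprocessed by pre-emphasis, framing and windowing. A fast Fourier transform (FFT) of the preprocessed signal gives its frequency spectrum and power spectrum, which is passed through a Mel filter bank to obtain the log energies of the signal. Finally, a discrete cosine transform of the log energies gives the cepstral coefficients, from which first-order difference coefficients are derived, and these form the feature vectors. A deep learning model built on the DBN learns the feature vectors; test samples are fed to the model to assess the degree of central spring failure, and the recognition results obtained with cepstral coefficients and with first-order difference coefficients are compared. The results show that choosing the cepstral coefficients as the feature vector gives high recognition accuracy and can effectively identify the degree of performance degradation of the central spring of the axial piston pump.
Keywords: Mel-frequency cepstral coefficients; deep belief network; axial piston pump; degradation assessment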
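The processing chain listed in the abstract above (pre-emphasis, framing, windowing, power spectrum, Mel filter bank, log energy, discrete cosine transform, first-order differences) can be sketched in pure Python. Everything numeric here is an assumption chosen to keep the example small: the 0.97 pre-emphasis factor, 64-sample frames with a hop of 32, 8 filters, 6 coefficients, a naive DFT in place of an FFT, and a synthetic 440 Hz tone standing in for a vibration signal. The DBN classifier is not shown.

```python
import cmath
import math

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def pre_emphasis(x, alpha=0.97):
    return [x[0]] + [x[n] - alpha * x[n - 1] for n in range(1, len(x))]

def split_frames(x, size, hop):
    return [x[i:i + size] for i in range(0, len(x) - size + 1, hop)]

def hamming(frame):
    n = len(frame)
    return [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
            for i, s in enumerate(frame)]

def power_spectrum(frame):
    """One-sided power spectrum via a naive DFT (an FFT would be used in practice)."""
    n = len(frame)
    spec = []
    for k in range(n // 2 + 1):
        acc = sum(s * cmath.exp(-2j * math.pi * k * i / n) for i, s in enumerate(frame))
        spec.append(abs(acc) ** 2 / n)
    return spec

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose edges are uniformly spaced on the Mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    edges = [int((n_fft + 1) * mel_to_hz(top * i / (n_filters + 1)) / sample_rate)
             for i in range(n_filters + 2)]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = edges[j - 1], edges[j], edges[j + 1]
        filt = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):
            filt[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):
            filt[k] = (hi - k) / (hi - mid)
        bank.append(filt)
    return bank

def dct2(values, n_out):
    """Unnormalized DCT-II, keeping the first n_out coefficients."""
    n = len(values)
    return [sum(v * math.cos(math.pi * m * (j + 0.5) / n) for j, v in enumerate(values))
            for m in range(n_out)]

def mfcc_with_deltas(signal, sample_rate, size=64, hop=32, n_filters=8, n_ceps=6):
    bank = mel_filterbank(n_filters, size, sample_rate)
    ceps = []
    for frame in split_frames(pre_emphasis(signal), size, hop):
        power = power_spectrum(hamming(frame))
        log_energy = [math.log(sum(w * p for w, p in zip(filt, power)) + 1e-10)
                      for filt in bank]
        ceps.append(dct2(log_energy, n_ceps))
    feats = []
    for t, c in enumerate(ceps):
        # First-order difference between neighbouring frames (edges are clamped).
        prev_c = ceps[max(t - 1, 0)]
        next_c = ceps[min(t + 1, len(ceps) - 1)]
        feats.append(c + [(b - a) / 2.0 for a, b in zip(prev_c, next_c)])
    return feats

signal = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(400)]
features = mfcc_with_deltas(signal, sample_rate=8000)
```

Each row of `features` holds the cepstral coefficients of one frame followed by their first-order differences, which matches the two feature types the abstract compares.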
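Detecting Replay-Spoofed Speech Using Global Self-Attention Teager Energy Cepstral Coefficients
17
Authors: 陈铭, 陈雪勤. Source: 《声学学报》 (indexed in: EI, CAS, CSCD, PKU Core), 2024, No. 5, pp. 1122-1130 (9 pages).
An energy-based front-end feature extraction method is proposed to counter the threat of replay attacks on automatic speaker verification systems. The method achieves variable resolution across the full frequency band so as to fully exploit the highly discriminative nonlinear information in the sub-band energies of replayed versus genuine speech. First, a variety of recording and playback devices were analysed statistically using the F-ratio method. Next, based on the statistics, a set of filters was designed over the full band to capture highly discriminative energy information. Finally, the Teager energy operator was used to compute the energy of the sub-band filtered signals, yielding the proposed global self-attention Teager energy cepstral coefficients (GSTECC). To verify the method, a series of experiments was run on the ASVspoof 2017 V2 and ASVspoof 2021 PA databases with a Gaussian mixture model as the classifier. The results show that the proposed GSTECC feature detects replay attacks better than other advanced feature extraction methods.
Keywords: speaker verification; replay attack detection; global self-attention features; Teager energy cepstral coefficients; nonlinear filter bank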
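The Teager energy operator mentioned in the abstract above has a compact discrete form, sketched below on a synthetic tone. Only the operator itself is shown; the paper's filter design and the GSTECC feature are not reproduced, and the amplitude and frequency values are arbitrary choices for illustration.

```python
import math

def teager_energy(x):
    """Discrete Teager energy operator: psi[n] = x[n]^2 - x[n-1] * x[n+1]."""
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

amplitude, omega = 2.0, 0.3  # omega is in radians per sample
tone = [amplitude * math.cos(omega * n) for n in range(50)]
energy = teager_energy(tone)

# For a pure sinusoid the operator is constant: A^2 * sin^2(omega).
expected = amplitude ** 2 * math.sin(omega) ** 2
```

Because the output depends on both amplitude and frequency, the operator responds to changes in either, which is the nonlinear information that energy-based features of this kind draw on.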
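Identification Method for Sandstone Fracture State Using Acoustic Emission Mel Cepstral Coefficients
18
Authors: 何学秋, 杨菲, 李振雷, 李娜, 宋大钊, 王洪磊, SOBOLEV Aleksei, RASSKAZOV Igor. Source: 《煤炭学报》 (indexed in: EI, CAS, CSCD, PKU Core), 2024, No. 2, pp. 753-766 (14 pages).
Fracture of rock mass structures is an important factor that seriously constrains the construction and safe operation of underground engineering such as mines, metros and tunnels, and identifying the fracture state of rock mass structures is a current research focus. To this end, sandstone loading-failure experiments were carried out under different conditions. The acoustic emission (AE) Mel cepstral coefficients and their fluctuation differences were extracted over the whole loading process; the variation of the coefficients and their fluctuation differences throughout loading and failure was studied; and the correlation of coefficient No. 1 (a set of AE Mel cepstral coefficients contains 12 coefficients, and coefficient No. 1 is the first of them) and its fluctuation difference with the sandstone fracture state was analysed. On this basis, an identification method for sandstone fracture state using AE Mel cepstral coefficients was proposed, and identification criteria were constructed and tested. The results show the following. As load increases, coefficient No. 1 increases overall; the coefficient and its dispersion increase markedly in the failure stage and show pronounced regular fluctuation. The fluctuation difference of coefficient No. 1 changes in stages: its magnitude and its rises and falls characterize sandstone fracturing, its overall increase and sudden jumps reflect macroscopic fracturing in the unstable deformation and post-peak failure stages, and the size of a sudden jump reflects the degree of fracturing. The AE Mel cepstral coefficients and their fluctuation differences respond well to sandstone fracturing, and this response is little affected by the loading conditions, indicating that AE Mel cepstral coefficients are applicable for reflecting sandstone fracturing. Coefficient No. 1 and its fluctuation difference correlate well with the fracture state, in three stages: in the micro-fracture stage their distributions are concentrated; in the stage approaching instability and failure the distribution range widens sharply, overall values rise and high outliers appear; and in the post-peak failure stage the range widens further, overall values are higher and high outliers are more numerous. Identification criteria for the sandstone fracture state were built from the 75th percentile values and outliers of coefficient No. 1 and of its fluctuation difference, and were tested with a three-class confusion matrix; the identification accuracy and precision were 90.43% and 94.45%, respectively. The results can inform fracture state identification for other types of coal and rock and provide a reference for monitoring and early warning of coal-rock instability.
Keywords: sandstone fracture state; acoustic emission; Mel cepstral coefficients; identification method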
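Application of MFCC Feature Training Technology in Voiceprint Recognition (Cited by: 1)
19
Author: 陶雨昂. Source: 《集成电路应用》, 2024, No. 2, pp. 386-387 (2 pages).
This paper describes the principle of MFCC voiceprint feature extraction, the MFCC feature extraction scheme, and the implementation of MFCC-based voiceprint recognition. The extraction scheme covers the MFCC extraction workflow, the short-time Fourier transform (STFT), construction of the Mel filter bank, and the discrete cosine transform (DCT) with extraction of the MFCC feature values. An improved scheme is proposed to address shortcomings in the separability and robustness of the fused feature extraction scheme.
Keywords: pattern recognition; frequency cepstrum; feature extraction; audio information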
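Bird Voiceprint Recognition Method Based on Feature Fusion and an Attention Mechanism
20
Authors: 潘齐炜, 程吉祥, 田甜, 吴丹, 曾蕊. Source: 《声学技术》 (indexed in: CSCD, PKU Core), 2024, No. 5, pp. 686-695 (10 pages).
Bird voiceprint recognition takes preprocessed sounds of multiple bird species as input and identifies the corresponding species with a network model. To address the limitations of single audio features and the poor feature-learning ability of models in bird voiceprint recognition in real environments, this article proposes a bird voiceprint recognition method based on feature fusion and an attention mechanism. First, Mel-frequency cepstral coefficients and power-normalized cepstral coefficients are extracted. Second, mean and variance normalization is used to fuse the two features into a new fused feature parameter, MPFC. Then, with ResNet-50 as the backbone, a lightweight coordinate attention mechanism is introduced into its residual modules to obtain an improved network model, the residual coordinate attention network (ResCA). Finally, the fused features are fed to ResCA, ResNet-50, ResNeSt-50, DenseNet-121 and EfficientNet-B0, and comparative experiments are run on two datasets, Birdsdata and BirdCLEF. The results show that the fused feature has better representational power than single features and improves the recognition rate to some extent, and that the improved network also recognizes well.
Keywords: bird voiceprint recognition; feature fusion; Mel-frequency cepstral coefficients; power-normalized cepstral coefficients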
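The fusion step in the abstract above (mean and variance normalization before combining two feature types) can be sketched as follows. The abstract does not say how the normalized features are combined, so the per-frame concatenation here is an assumption, and the function names and toy values are hypothetical stand-ins for real MFCC and PNCC frames.

```python
import math

def mean_variance_normalize(frames):
    """Scale each feature dimension to zero mean and unit variance over time."""
    n = len(frames)
    dims = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dims)]
    # A constant dimension has zero deviation; fall back to 1.0 to avoid dividing by zero.
    stds = [math.sqrt(sum((f[d] - means[d]) ** 2 for f in frames) / n) or 1.0
            for d in range(dims)]
    return [[(f[d] - means[d]) / stds[d] for d in range(dims)] for f in frames]

def fuse(mfcc_frames, pncc_frames):
    """Normalize both feature sets, then concatenate them frame by frame (assumed)."""
    a = mean_variance_normalize(mfcc_frames)
    b = mean_variance_normalize(pncc_frames)
    return [x + y for x, y in zip(a, b)]

# Toy values on very different scales, three frames each.
mfcc = [[1.0, 10.0], [3.0, 30.0], [5.0, 20.0]]
pncc = [[100.0], [300.0], [200.0]]
fused = fuse(mfcc, pncc)
```

Normalizing first puts both feature types on a comparable scale, so neither dominates the fused vector simply because its raw values are larger.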