226 articles found
VOICE ACTIVITY DETECTION UNDER RAYLEIGH DISTRIBUTION (Cited by: 1)
1
Authors: Li Yu, Chen Jianming, Tan Hongzhou. Journal of Electronics (China), 2009, No. 4, pp. 552-556 (5 pages)
This paper presents an improved Voice Activity Detection (VAD) algorithm based on a Signal-to-Noise Ratio (SNR) measure. The noise Power Spectral Density (PSD) in each spectral bin is assumed to follow a Rayleigh distribution, whose asymmetric tail describes the noise PSD distribution better than a Gaussian distribution does. Under this assumption, a new threshold-updating expression is derived. Because the integral for the false-alarm probability has an analytical solution, the threshold update can be written without the inverse complementary error function, which keeps the computational complexity of the system low. Experimental results show that, under several noise environments, the proposed VAD outperforms or is at least comparable with the VAD scheme presented by Davis, at lower computational complexity.
Keywords: Statistical voice activity detection (VAD); Threshold update; Rayleigh distribution; Computational complexity
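The closed-form property mentioned in this abstract can be illustrated with a small sketch. For a Rayleigh-distributed quantity with scale sigma, the tail probability is P(X > T) = exp(-T^2 / (2 sigma^2)), so a threshold for a target false-alarm probability follows directly. This is only the textbook relation, not the paper's threshold-updating expression; the function names and the choice to estimate sigma from the noise mean (E[X] = sigma * sqrt(pi / 2)) are assumptions made for illustration.

```python
import math


def rayleigh_scale_from_mean(noise_mean):
    """Estimate the Rayleigh scale sigma from a mean (assumes E[X] = sigma*sqrt(pi/2))."""
    if noise_mean <= 0:
        raise ValueError("noise_mean must be positive")
    return noise_mean * math.sqrt(2.0 / math.pi)


def rayleigh_threshold(noise_mean, p_fa):
    """Threshold T such that P(X > T) = p_fa for Rayleigh-distributed X.

    Hypothetical helper: inverts exp(-T^2 / (2 sigma^2)) = p_fa in closed form,
    so no inverse complementary error function is needed.
    """
    if not 0.0 < p_fa < 1.0:
        raise ValueError("p_fa must lie strictly between 0 and 1")
    sigma = rayleigh_scale_from_mean(noise_mean)
    return sigma * math.sqrt(-2.0 * math.log(p_fa))


def false_alarm_probability(threshold, noise_mean):
    """Tail probability P(X > threshold) under the same Rayleigh assumption."""
    sigma = rayleigh_scale_from_mean(noise_mean)
    return math.exp(-(threshold ** 2) / (2.0 * sigma ** 2))
```

A lower target false-alarm probability yields a higher threshold, and feeding the threshold back into `false_alarm_probability` recovers the target value.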
IMPROVING VOICE ACTIVITY DETECTION VIA WEIGHTING LIKELIHOOD AND DIMENSION REDUCTION
2
Authors: Wang Huanliang, Han Jiqing, Li Haifeng, Zheng Tieran. Journal of Electronics (China), 2008, No. 3, pp. 330-336 (7 pages)
The performance of traditional Voice Activity Detection (VAD) algorithms declines sharply in low Signal-to-Noise Ratio (SNR) environments. This paper proposes a feature-weighted likelihood method for noise-robust VAD. The method increases the contribution of dynamic features to the likelihood score, which in turn improves the noise robustness of VAD. A divergence-based dimension reduction method is also proposed to save computation: it removes the feature dimensions with the smallest divergence values at the cost of a slight loss in performance. Experimental results on the Aurora II database show that the proposed method markedly improves detection performance in noisy environments when a model trained on clean data is used to detect speech endpoints. Applying the weighted likelihood to the dimension-reduced features gives performance comparable to, or even better than, that of the original full-dimensional features.
Keywords: Voice activity detection (VAD); Weighting likelihood; Divergence; Dimension reduction; Noise robustness
Voice activity detection based on deep belief networks using likelihood ratio (Cited by: 3)
3
Authors: KIM Sang-Kyun, PARK Young-Jin, LEE Sangmin. Journal of Central South University (indexed in SCIE, EI, CAS, CSCD), 2016, No. 1, pp. 145-149 (5 pages)
A novel technique is proposed to improve the performance of voice activity detection (VAD) by using deep belief networks (DBN) with a likelihood ratio (LR). The likelihood ratio is derived from the speech and noise spectral components, which are assumed to follow a Gaussian probability density function (PDF). The proposed algorithm uses the input signal to calculate the likelihood ratio and employs DBN learning to classify voice activity. Experiments show that the proposed algorithm yields improved results in various noise environments compared with conventional VAD algorithms. Furthermore, the DBN-based algorithm decreases the probability of detection error by 0.7 to 2.6 compared with the support vector machine based algorithm.
Keywords: voice activity detection; likelihood ratio; deep belief networks
Speech enhancement through voice activity detection using speech absence probability based on Teager energy (Cited by: 2)
4
Authors: PARK Yun-sik, LEE Sang-min. Journal of Central South University (indexed in SCIE, EI, CAS), 2013, No. 2, pp. 424-432 (9 pages)
In this work, a novel voice activity detection (VAD) algorithm that uses speech absence probability (SAP) based on Teager energy (TE) is proposed for speech enhancement. Instead of the conventional local SAP (LSAP), the proposed method employs LSAP based on the TE of noisy speech as the feature parameter for VAD in each frequency subband. Results show that the TE operator enhances the ability to discriminate between speech and noise and further suppresses noise components. TE-based LSAP therefore provides a better representation of LSAP, resulting in improved VAD for estimating noise power in a speech enhancement algorithm. In addition, the presented method uses the TE-based global SAP (GSAP) derived in each frame as a weighting parameter for modifying the adopted TE operator and improving its performance. The proposed algorithm was evaluated with objective and subjective quality tests under various environments and was shown to produce better results than the conventional method.
Keywords: speech enhancement; Teager energy; speech absence probability; voice activity detection
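As background for the Teager energy feature named above, the following sketch applies the standard discrete Teager energy operator, psi[x(n)] = x(n)^2 - x(n-1) * x(n+1), to a time-domain sequence. It illustrates only the operator itself; the abstract's subband LSAP and GSAP weighting are not reproduced, and the function name is hypothetical.

```python
import math


def teager_energy(samples):
    """Discrete Teager energy: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Returns one value per interior sample, so the output is two samples
    shorter than the input. Hypothetical helper for illustration only.
    """
    if len(samples) < 3:
        return []
    return [
        samples[n] * samples[n] - samples[n - 1] * samples[n + 1]
        for n in range(1, len(samples) - 1)
    ]


# For a pure tone A*cos(omega*n), the operator returns the constant
# A^2 * sin(omega)^2, so it tracks both amplitude and frequency.
amplitude, omega = 2.0, 0.3
tone = [amplitude * math.cos(omega * n) for n in range(50)]
tone_energy = teager_energy(tone)
```

Because the output depends on frequency as well as amplitude, the operator responds differently to speech-like and noise-like components of equal power, which is the property the abstract relies on.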
Sequential Tests for the Detection of Voice Activity and the Recognition of Cyber Exploits
5
Authors: Ehab Etellisi, P. Papantoni-Kazakos. Communications and Network, 2011, No. 4, pp. 185-199 (15 pages)
We consider the problem of automated voice activity detection (VAD) in the presence of noise. To this end, we introduce a Sequential Detection of Change Test (SDCT) designed for an independent mixture of Laplacian and Gaussian distributions. We analyse and numerically evaluate the proposed test for various noisy environments. In addition, we address the problem of effectively recognizing the possible presence of cyber exploits in the voice transmission channel. We introduce a second sequential test, named the Cyber Attacks Sequential Detection of Change Test (CA-SDCT), designed to detect the presence of such exploits rapidly and accurately, and we analyse and numerically evaluate it. Experimental results and comparisons with other proposed methods are also presented.
Keywords: Voice activity detection; Sequential Detection of Change Test; Cyber exploits
Speech detection method based on a multi-window analysis (Cited by: 1)
6
Authors: Luo Xinwei, Liu Ting, Huang Ming, Xu Xiaogang, Cao Hongli, Bai Xianghua, Xu Dayong. Journal of Southeast University (English Edition) (indexed in EI, CAS), 2021, No. 4, pp. 343-349 (7 pages)
To address the poor performance of speech signal detection at low signal-to-noise ratio (SNR), a method is proposed to detect active speech frames based on multi-window time-frequency (T-F) diagrams. First, the T-F diagram of the signal is calculated using a multi-window T-F analysis, and a speech test statistic is constructed from the characteristic difference between the signal and the background noise. Second, dynamic double-threshold processing is used for preliminary detection, and the global double-threshold value is then obtained using K-means clustering. Finally, the detection results are obtained by sequential decision. The experimental results show that the overall performance of the method is better than that of traditional methods under various SNR conditions and background noises. The method also offers low complexity, strong robustness, and adaptability to multiple languages.
Keywords: voice activity detection; multi-window spectral analysis; K-means clustering; threshold adjustment; sequential decision
Audio-visual voice activity detection (Cited by: 1)
7
Authors: LIU Peng, WANG Zuo-ying. Frontiers of Electrical and Electronic Engineering in China (indexed in CSCD), 2006, No. 4, pp. 425-430 (6 pages)
In speech signal processing systems, frame-energy-based voice activity detection (VAD) can be disrupted by background noise and by the non-stationary frame energy within voiced segments. The purpose of this paper is to improve the performance and robustness of VAD by introducing visual information. A data-driven linear transformation is adopted for visual feature extraction, and a general statistical VAD model is designed. Using the general model and a two-stage fusion strategy presented in this paper, a concrete multimodal VAD system is built. Experiments show that, compared with frame-energy-based audio VAD, multimodal VAD gives a 55.0% relative reduction in frame error rate and a 98.5% relative reduction in sentence-breaking error rate. With the multimodal method, sentence-breaking errors are almost entirely avoided and frame-detection performance is clearly improved, which demonstrates the effectiveness of the visual modality in VAD.
Keywords: speech recognition; voice activity detection; multimodal
Novel DTD and VAD assisted voice detection algorithm for VoIP systems
8
Authors: Ming Meng, Wang Ke, Ji Hong. The Journal of China Universities of Posts and Telecommunications (indexed in EI, CSCD), 2016, No. 4, pp. 9-16, 76 (9 pages)
Echo cancellation plays an important role in current Internet protocol (IP) based voice interactive systems, and voice state detection is an essential part of echo cancellation. It comprises two main parts: double talk detection (DTD) and voice activity detection (VAD). DTD detects double talk and prevents filter divergence in the presence of near-end speech; VAD determines near-end voice activity and outputs a silence indicator when the near end is silent. However, DTD applied directly may mistakenly declare double talk when both ends are silent, updating the coefficients while the far end is silent may lead to filter divergence, and current VAD algorithms may misjudge the residual echo from the near end as far-end voice. A voice detection algorithm combining DTD and far-end VAD is therefore proposed. DTD runs when the VAD declares far-end speech, filtering and coefficient updates are halted when the VAD declares far-end silence, and the far-end VAD is a multi-feature VAD based on short-time energy and correlation. The new algorithm improves the accuracy of DTD, prevents filter divergence, and excludes the case in which the far-end signal contains only residual echo from the near end. Test results show that the voice state decision of the new algorithm is accurate and that echo cancellation performance is improved.
Keywords: echo cancellation; double talk detection (DTD); voice activity detection (VAD); adaptive filter
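The gating rule described in this abstract (run DTD only when the far-end VAD reports speech; halt filtering and coefficient updates when it reports silence) can be sketched as a small per-frame decision function. All names here are hypothetical, and freezing adaptation while double talk is detected is an assumption based on common echo-canceller practice rather than a detail stated in the abstract.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class FrameAction:
    """What the echo canceller should do for one frame (hypothetical structure)."""
    run_dtd: bool
    run_filter: bool
    update_coefficients: bool


def plan_frame(far_end_speech: bool,
               detect_double_talk: Callable[[], bool]) -> FrameAction:
    """Combine far-end VAD and DTD decisions for a single frame.

    far_end_speech: decision of the far-end VAD for this frame.
    detect_double_talk: callable running the DTD; only invoked when the
        far end is active, so DTD cannot fire during double silence.
    """
    if not far_end_speech:
        # Far-end silence: halt both filtering and coefficient updates.
        return FrameAction(run_dtd=False, run_filter=False,
                           update_coefficients=False)
    double_talk = detect_double_talk()
    # Assumption: keep filtering but freeze adaptation during double talk.
    return FrameAction(run_dtd=True, run_filter=True,
                       update_coefficients=not double_talk)
```

Passing the DTD as a callable makes the ordering explicit: the detector is never consulted for frames in which the far end is silent.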
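Robust VAD algorithm based on a complex Gaussian mixture model (Cited by: 2)
9
Authors: Lei Jianjun (雷建军), Yang Zhen (杨震), Liu Gang (刘刚), Guo Jun (郭军). Journal of Tianjin University (天津大学学报) (indexed in EI, CAS, CSCD, PKU Core), 2009, No. 4, pp. 353-356 (4 pages)
To address the robustness of voice activity detection (VAD), a robust VAD algorithm based on a complex Gaussian mixture model is proposed for non-stationary noise environments. The algorithm assumes that the clean speech spectrum follows a complex Gaussian mixture model, and the a priori SNR is computed from a pre-trained complex Gaussian mixture model. Introducing the complex Gaussian mixture model improves VAD performance and also avoids the a priori SNR estimation procedure used in minimum mean-square error speech enhancement. The NOISEX-92 noise database is used to evaluate the system in noisy conditions. The results show that the algorithm achieves good detection performance in non-stationary noise.
Keywords: complex Gaussian mixture model; voice activity detection; likelihood ratio test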
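Analysis of robust VAD algorithms in voice services (Cited by: 9)
10
Authors: Guo Li (郭莉), Yin Nan (殷南), Wang Bingxi (王炳锡). Audio Engineering (电声技术), 2005, No. 9, pp. 41-45 (5 pages)
The purpose of Voice Activity Detection (VAD) is to determine whether speech is present during voice communication. Detected silence is suppressed so that it occupies little or no channel bandwidth, and only detected speech is compressed, coded, and transmitted. Robust speech recognition systems, digital mobile communications, and real-time voice transmission over the Internet all require VAD under adverse acoustic conditions in order to save bandwidth and suppress noise, which makes VAD an important problem in speech processing. The recent VAD algorithms presented in this paper (EZCR-VAD, STAT-VAD, and E-VAD) are robust for speech detection in low-SNR environments.
Keywords: voice activity detection; signal-to-noise ratio; zero-crossing rate; information entropy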
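VAD using subband long-term signal variability features (Cited by: 2)
11
Authors: Cai Tie (蔡铁), Tang Fei (唐飞), Long Zhijun (龙志军). Video Engineering (电视技术) (indexed in PKU Core), 2014, No. 19, pp. 228-232 (5 pages)
To improve the accuracy of voice activity detection (VAD) at low SNR, a VAD algorithm based on subband long-term signal variability features is proposed. The speech signal is transformed to the frequency domain and split into several non-overlapping subbands, and a long-term signal variability feature is extracted from each subband. Gaussian mixture models (GMMs) are then used to build speech and non-speech models online, and the VAD decision is made from the likelihood ratio of the models. Experimental results show that the algorithm significantly improves VAD accuracy at low SNR and is robust across several noise environments and SNR conditions. Experiments in a speech recognition system show that the algorithm effectively improves the recognition rate in noisy environments.
Keywords: speech signal processing; voice activity detection; long-term signal variability; subband; speech recognition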
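Design and optimization of an intelligent outbound voice call system based on FreeSWITCH
12
Authors: Hao Ruipeng (郝锐朋), Zhou Jun (周军), Bai Xing (白兴), Xiao Sujie (肖素杰). Microelectronics & Computer (微电子学与计算机), 2024, No. 7, pp. 110-118 (9 pages)
FreeSWITCH, a mainstream telephony softswitch platform, is an important component of call centers and enables calls between network clients, analog telephones, and mobile phones. Based on the FreeSWITCH softswitch platform, an outbound-call session flow control scheme is designed. It mainly coordinates speech recognition, speech synthesis, and natural language processing, supports functions such as intelligent barge-in during prompts, key-press detection, call state detection, and transfer to a human agent, and covers event handling for the full human-machine dialogue flow. The voice endpoint detection method built into UniMRCP is improved, which raises the accuracy of detecting valid speech. Speech recognition and speech synthesis are integrated through the UniMRCP architecture to improve the user experience. In addition, the barge-in function of FreeSWITCH intelligent outbound calling is optimized, which effectively resolves interruptions that occur during outbound interactions under abnormal network conditions.
Keywords: FreeSWITCH; MRCP protocol; intelligent outbound calling; voice endpoint detection; speech recognition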
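On the possibility of using speech VAD and DTX to enhance the transmission capacity of the Abis interface (Cited by: 1)
13
Authors: Fu Yonggen (傅永根), Chen Huijian (陈慧剑). Journal of Nanjing Institute of Posts and Telecommunications (Natural Science Edition) (南京邮电学院学报(自然科学版)), 2003, No. 1, pp. 38-42 (5 pages)
A method is proposed for increasing the transmission capacity of Abis interface links in current GSM systems: using the VAD and DTX of voice communication to multiply and multiplex voice channels. Its principle, implementation, transmission performance, and impact are examined in some depth.
Keywords: mobile communication; Abis interface; voice activity detection; discontinuous transmission; VAD; DTX; GSM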
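A study of VAD features based on exponential-function-warped group delay (Cited by: 1)
14
Authors: Wang Jinfang (王金芳), Guo Ming (虢明). Journal of Jilin University (Engineering and Technology Edition) (吉林大学学报(工学版)) (indexed in EI, CAS, CSCD, PKU Core), 2013, Supplement 1 (S1), pp. 435-439 (5 pages)
Although the noise robustness of the group delay function has been demonstrated, the spike effect caused by resonances seriously hampers its practical use. To preserve the feature's ability to represent the original acoustic space while limiting information loss during feature extraction, a feature parameter based on group delay warped by an exponential function is proposed, with the aim of reducing the dynamic range of the group delay spectrum. Voice activity detection experiments show that its noise robustness and detection accuracy are clearly better than those of existing group-delay-function features.
Keywords: speech signal processing; voice activity detection; exponential function warping; group delay function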
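Multi-model fusion speaker diarization system for VoxSRC22
15
Authors: Du Yuxuan (杜雨轩), Zhou Ruohua (周若华). Computer Engineering and Applications (计算机工程与应用) (indexed in CSCD, PKU Core), 2024, No. 10, pp. 164-172 (9 pages)
To answer the question of "who spoke when", a speaker diarization method is proposed. The method consists of six modules: voice activity detection (VAD), speech enhancement, a speaker embedding extractor, speaker clustering, overlapping speech detection (OSD), and result fusion. Speech enhancement improves the performance of VAD. Effectively combining different speaker embedding extractors and clustering algorithms further lowers the system error rate. Handling overlapping speech after system fusion gives the best results. Experiments show that the best system improves on the baseline by 72% in relative terms and achieves a diarization error rate (DER) of 5.48% and a Jaccard error rate (JER) of 32.10% on the evaluation set of the VoxCeleb Speaker Recognition Challenge (VoxSRC) 2022, ranking fourth.
Keywords: speaker diarization; voice activity detection; voiceprint embedding; speaker clustering; result fusion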
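An adaptive-modeling VAD method (Cited by: 1)
16
Authors: Teng Xiaoqi (腾潇琦), Feng Xiang (冯祥), Zhang Yifei (张翼飞). Computer Technology and Development (计算机技术与发展), 2016, No. 9, pp. 26-29 (4 pages)
Voice Activity Detection (VAD) is an important step in front-end speech feature processing and directly affects the quality and efficiency of later processing. Mainstream model-based VAD depends too heavily on training data: a different model must be retrained for each scenario, which creates a very large data-labeling workload. The adaptive-modeling VAD method combines the advantages of energy-based VAD and model-based VAD and resolves this problem. For each utterance it trains speech and non-speech models online, labels each frame according to its likelihood score under the models, and after smoothing locates the start and end points of speech accurately. Experimental results show that the method performs well: the F1 score is 0.031 higher than that of conventional energy-based VAD, and the speaker diarization error rate falls by 0.45%.
Keywords: voice activity detection; energy-based VAD; model-based VAD; adaptive modeling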
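Multichannel voice endpoint detection algorithm for low-SNR environments (Cited by: 1)
17
Authors: Xiao Si (肖思), Gong Jie (龚杰), Li Baoqing (李宝清). Journal of University of Chinese Academy of Sciences (中国科学院大学学报(中英文)) (indexed in CAS, CSCD, PKU Core), 2023, No. 5, pp. 687-693 (7 pages)
Conventional endpoint detection algorithms use only the time-frequency information of the signal, so their accuracy drops in low-SNR environments, especially in non-stationary noise. Multichannel speech signals carry rich spatial information that can supplement the time-frequency information and thereby improve detection accuracy. Building on research into multichannel spatial features, and using the covariance matrix of the signals received by the array, a new multichannel voice endpoint detection algorithm based on the maximum eigenvalue of the multichannel covariance matrix is proposed. The maximum eigenvalue of the covariance matrix of each frame is first extracted as the feature parameter for endpoint detection, which allows the speech signal to be tracked; a double-threshold method then decides whether the current frame is a speech frame. Experiments on the VCTK corpus and a laboratory corpus show that, compared with the Mel energy ratio algorithm and the new energy-zero-entropy algorithm, the proposed algorithm achieves higher detection accuracy and is more robust at a low SNR of -5 dB and in non-stationary noise.
Keywords: voice endpoint detection; microphone array; covariance matrix; low SNR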
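The two steps named in this abstract, a per-frame maximum eigenvalue of the multichannel covariance matrix followed by a double-threshold decision, can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' implementation: signals are assumed zero-mean, the largest eigenvalue is found by power iteration, the double threshold is interpreted as hysteresis (enter speech above `high`, leave it below `low`), and all names and parameters are hypothetical.

```python
import math


def covariance_matrix(frame):
    """Sample covariance of one frame; frame is a list of M channels of N samples.

    Assumes zero-mean channels, so R[i][j] = (1/N) * sum_k x_i[k] * x_j[k].
    """
    m, n = len(frame), len(frame[0])
    return [[sum(frame[i][k] * frame[j][k] for k in range(n)) / n
             for j in range(m)] for i in range(m)]


def max_eigenvalue(matrix, iterations=200):
    """Largest eigenvalue of a symmetric positive semi-definite matrix.

    Uses power iteration with an uneven start vector to reduce the chance
    of starting orthogonal to the dominant eigenvector.
    """
    m = len(matrix)
    v = [1.0 + 0.37 * i for i in range(m)]
    norm = math.sqrt(sum(c * c for c in v))
    v = [c / norm for c in v]
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            return 0.0
        v = [c / norm for c in w]
        eigenvalue = norm  # ||R v|| for unit v converges to the top eigenvalue
    return eigenvalue


def double_threshold(features, low, high):
    """Hysteresis decision: start speech above `high`, end it below `low`."""
    labels, active = [], False
    for value in features:
        if not active and value > high:
            active = True
        elif active and value < low:
            active = False
        labels.append(active)
    return labels
```

In use, each frame would be mapped to `max_eigenvalue(covariance_matrix(frame))` and the resulting sequence passed to `double_threshold`; suitable threshold values depend on the array and noise level and are not given in the abstract.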
Enhancing Parkinson's disease severity assessment through voice-based wavelet scattering, optimized model selection, and weighted majority voting (Cited by: 1)
18
Authors: Farhad Abedinzadeh Torghabeh, Seyyed Abed Hosseini, Elham Ahmadi Moghadam. Medicine in Novel Technology and Devices, 2023, No. 4, pp. 51-63 (13 pages)
Parkinson's disease (PD) is a neurodegenerative disorder characterized by motor and non-motor symptoms that significantly impact an individual's quality of life. Voice changes have shown promise as early indicators of PD, making voice analysis a valuable tool for early detection and intervention. This study aims to assess and detect the severity of PD through voice analysis using the mobile device voice recordings dataset, which consists of recordings from PD patients at different stages of the disease and from healthy control subjects. A novel approach is employed that incorporates a voice activity detection algorithm for speech segmentation and the wavelet scattering transform for feature extraction. Bayesian optimization is used to fine-tune the hyperparameters of seven commonly used classifiers and optimize their performance for PD severity detection. Among the classifiers, AdaBoost and K-nearest neighbor consistently demonstrate superior performance across the evaluation metrics. Furthermore, a weighted majority voting (WMV) technique that combines the predictions of multiple models improves classification accuracy, reaching 98.62%. The results highlight the potential of voice analysis in PD diagnosis and monitoring. Integrating advanced signal processing techniques and machine learning models provides reliable and accessible tools for PD assessment, facilitating early intervention and improving patient outcomes. The study contributes to the field by demonstrating the effectiveness of the proposed methodology and the significant role of WMV in enhancing classification accuracy for PD severity detection.
Keywords: Parkinson's disease; Speech impairment; Voice activity detection; Model selection; Bayesian optimization; Weighted majority voting
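Automatic segmentation of continuous Chinese speech
19
Authors: Li Qi (李琦), Zhang Erhua (张二华). Computer & Digital Engineering (计算机与数字工程), 2023, No. 4, pp. 959-964 (6 pages)
Automatic segmentation of continuous Chinese speech is a foundation of speech recognition, and an accurate segmentation method can replace manual labeling of Chinese syllables. Conventional automatic segmentation techniques for continuous Chinese speech, such as double-threshold endpoint detection and cepstrum-based endpoint detection, do not perform well enough to meet the needs of speech recognition. This paper analyzes the continuous speech signal at several levels, in the time domain, the frequency domain, and the cepstral domain, and combines endpoint detection, spectral analysis, and cepstral analysis to detect syllable boundaries, resulting in a multi-level segmentation method for continuous speech. Compared with conventional double-threshold and cepstrum-based endpoint detection methods, the method achieves a single-character segmentation accuracy of 92.8%.
Keywords: speech segmentation; endpoint detection; spectrogram; double-threshold method; band energy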
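Application of RTP-stream audio playback in the 400 MHz digital train dispatching system
20
Author: Zhao Wenjie (赵文杰). Railway Signalling & Communication Engineering (铁路通信信号工程技术), 2023, No. 12, pp. 43-46 (4 pages)
This paper introduces the audio playback technique for the wireless train dispatching voice service in the DRTD system. Wired communication in the DRTD system transmits audio using the SIP protocol and RTP streams. Through a processing chain of audio mixing, windowed voice detection, buffering, format conversion, and signaling control, the audio stream is transmitted over the radio air interface and finally played back as a speech waveform on the mobile terminal. This bridges the wired and wireless communication in wireless train dispatching and supports the core services of the DRTD system.
Keywords: 400 MHz digital train dispatching system; Real-time Transport Protocol; audio playback; voice activity detection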