Abstract: Voice is an important topic in academic writing research. Despite increasing research interest, it has remained contentious. The article "On the Use of the Passive and Active Voice in Astrophysics Journal Papers: With Extensions to Other Languages and Other Fields" puts forward an interesting hypothesis: "Writers of methods sections in this field tend to use we + active when they have made a unique procedural choice or have introduced some technical innovation. On the other hand, the passive seems to be used when authors are simply following established or standard procedures." Building on this research, the present paper examines the frequency of we + active forms and passive verb forms in the methods sections of two linguistics journal articles. The aim is to test the hypothesis that we + active is used for new material, while the passive is preferred for established material.
Abstract: This thesis concerns practical writing skills for tour guides. It discusses the differences between English and Chinese. A tour guide needs to grasp the characteristics of both languages and be able to describe scenery vividly and accurately in writing.
Abstract: Parkinson's disease (PD) is a neurodegenerative disorder characterized by motor and non-motor symptoms that significantly impact an individual's quality of life. Voice changes have shown promise as early indicators of PD, making voice analysis a valuable tool for early detection and intervention. This study aims to assess and detect the severity of PD through voice analysis using the mobile device voice recordings dataset. The dataset consists of recordings from PD patients at different stages of the disease and from healthy control subjects. A novel approach is employed that combines a voice activity detection algorithm for speech segmentation with the wavelet scattering transform for feature extraction. Bayesian optimization is used to tune the hyperparameters of seven commonly used machine learning classifiers for PD severity detection. Among these classifiers, AdaBoost and K-nearest neighbor consistently demonstrated superior performance across the evaluation metrics. Furthermore, a weighted majority voting (WMV) technique that combines the predictions of multiple models achieves an accuracy of 98.62%. The results highlight the potential of voice analysis in PD diagnosis and monitoring. Integrating advanced signal processing techniques with machine learning models provides reliable and accessible tools for PD assessment, facilitating early intervention and improving patient outcomes. This study contributes to the field by demonstrating the effectiveness of the proposed methodology and the significant role of WMV in improving classification accuracy for PD severity detection.
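The weighted majority voting step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say how the weights are chosen, so the weights, labels, and the function name `weighted_majority_vote` below are illustrative assumptions, not the authors' implementation.

```python
from collections import defaultdict


def weighted_majority_vote(predictions, weights):
    """Return the label whose supporting classifiers carry the most total weight.

    predictions: one predicted label per classifier
    weights: one non-negative weight per classifier (hypothetical values,
             e.g. each classifier's validation accuracy)
    """
    if len(predictions) != len(weights):
        raise ValueError("predictions and weights must have the same length")
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    # On a tie, the label seen first wins (dicts preserve insertion order).
    return max(totals, key=totals.get)


# Hypothetical example: two weak classifiers outvoted by one strong classifier.
labels = ["mild", "severe", "severe"]
weights = [0.9, 0.3, 0.3]
print(weighted_majority_vote(labels, weights))  # mild
```

Unlike a simple majority vote, which would return "severe" here, the weighted vote lets a more reliable classifier outweigh several less reliable ones.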
Abstract: In speech signal processing systems, frame-energy-based voice activity detection (VAD) can be degraded by background noise and by the non-stationary behavior of frame energy within voiced segments. This paper aims to improve the performance and robustness of VAD by introducing visual information. A data-driven linear transformation is adopted for visual feature extraction, and a general statistical VAD model is designed. Using this general model and the two-stage fusion strategy presented in the paper, a concrete multimodal VAD system is built. Experiments show that, compared with frame-energy-based audio VAD, multimodal VAD yields a 55.0% relative reduction in frame error rate and a 98.5% relative reduction in sentence-breaking error rate. With the multimodal method, sentence-breaking errors are almost entirely avoided and frame-detection performance is clearly improved, which demonstrates the effectiveness of the visual modality in VAD.
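For orientation, the frame-energy audio baseline that this abstract compares against can be sketched as follows. This is a generic textbook-style energy detector under assumed parameters (non-overlapping frames, a fixed threshold), not the system evaluated in the paper; `frame_energies` and `energy_vad` are hypothetical names.

```python
def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(x * x for x in samples[start:start + frame_len]) / frame_len
        for start in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def energy_vad(samples, frame_len, threshold):
    """Mark a frame as speech when its energy exceeds a fixed threshold."""
    return [energy > threshold for energy in frame_energies(samples, frame_len)]


# Hypothetical signal: silence, a loud burst, then low-level noise.
signal = [0.0] * 4 + [1.0, -1.0, 1.0, -1.0] + [0.01] * 4
print(energy_vad(signal, frame_len=4, threshold=0.1))  # [False, True, False]
```

A fixed threshold like this is exactly what background noise tends to defeat, which is the weakness the abstract cites as motivation for adding visual information.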
Funding: Supported by the National Youth Science Fund Project (61501052) and the National Natural Science Foundation of China (61271182).
Abstract: Echo cancellation plays an important role in current Internet protocol (IP) based voice interactive systems, and voice state detection is an essential part of it. Voice state detection mainly comprises two parts: double-talk detection (DTD) and voice activity detection (VAD). DTD detects double talk and prevents filter divergence in the presence of near-end speech, while VAD determines near-end voice activity and outputs a silence indicator when the near end is silent. However, a straightforward DTD may mistakenly declare double talk when both ends are silent, updating the coefficients while the far end is silent may lead to filter divergence, and current VAD algorithms may misjudge residual echo from the near end as far-end voice. Therefore, a voice detection algorithm combining DTD and far-end VAD is proposed. DTD runs only when the VAD declares far-end speech; filtering and coefficient updates are halted when the VAD declares far-end silence; and the far-end VAD is a multi-feature VAD based on short-time energy and correlation. The new algorithm can improve the accuracy of DTD, prevent filter divergence, and exclude the case in which the far-end signal contains only residual echo from the near end. Test results show that the voice state decisions of the new algorithm are accurate and that echo cancellation performance is improved.
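The control flow described in this abstract (far-end VAD gating both the DTD and the adaptive filter) can be sketched as a small decision function. The abstract states only that filtering and coefficient updates halt on far-end silence and that DTD runs on far-end speech; freezing the coefficients while still filtering during double talk is a common convention assumed here, and the names `Action` and `decide` are hypothetical.

```python
from enum import Enum


class Action(Enum):
    HALT = "halt filtering and coefficient update"
    FILTER_ONLY = "filter, but freeze coefficients"
    FILTER_AND_ADAPT = "filter and update coefficients"


def decide(far_end_speech, detect_double_talk):
    """Choose the echo canceller's action for one frame.

    far_end_speech: decision of the far-end VAD for this frame
    detect_double_talk: zero-argument callable running the DTD; it is only
                        invoked when the far end is active
    """
    if not far_end_speech:
        # Far-end silence: no echo to cancel, so skip DTD entirely.
        return Action.HALT
    if detect_double_talk():
        # Assumed convention: near-end speech present, so do not adapt.
        return Action.FILTER_ONLY
    return Action.FILTER_AND_ADAPT


print(decide(True, lambda: False).value)  # filter and update coefficients
```

Because the DTD is never consulted during far-end silence, it cannot declare double talk when both ends are silent, which is one of the failure cases the abstract lists.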
Funding: Project supported by the National Basic Research Program (973) of China (No. 2012CB316400) and the National Natural Science Foundation of China (No. 61171151).
Abstract: Speech communication is often degraded by various types of interfering signals. To improve the quality of the desired signal, the generalized sidelobe canceller (GSC), which uses a reference signal to estimate the interfering signal, has attracted the attention of researchers. However, the interference suppression of the GSC is limited because some residual desired signal leaks into the reference signal. To overcome this problem, we use sparse coding to suppress the residual desired signal while preserving the reference signal. Sparse coding with a learned dictionary is usually used to reconstruct the desired signal. Because training samples of the desired signal are not observable in a real environment, a desired signal reconstructed in this way may contain a large amount of residual interference. In contrast, training samples of the interfering signal can be collected while the desired signal is absent, as determined by voice activity detection (VAD), and used to learn an interferer dictionary. Since the reference signal of the interfering signal is coherent with the interferer dictionary, it can be well reconstructed by sparse coding, while the residual desired signal is removed. GSC performance improves because the estimate of the interfering signal obtained with the proposed reference signal is more accurate. Simulations and experiments in a real acoustic environment show that the proposed method is effective in suppressing interfering signals.
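The idea of reconstructing the reference signal from an interferer dictionary, so that components outside the dictionary are left behind, can be illustrated with a toy sketch. The abstract does not name a sparse coding algorithm; plain matching pursuit is used here as a stand-in, and the two-atom dictionary and the signal are hypothetical.

```python
def matching_pursuit(signal, atoms, n_nonzero):
    """Greedy sparse coding of `signal` over unit-norm `atoms`.

    Returns (reconstruction, residual). The reconstruction keeps only what
    the dictionary can represent with at most `n_nonzero` greedy steps.
    """
    residual = list(signal)
    coeffs = [0.0] * len(atoms)
    for _ in range(n_nonzero):
        # Correlate the current residual with every atom.
        corr = [sum(r * a for r, a in zip(residual, atom)) for atom in atoms]
        best = max(range(len(atoms)), key=lambda k: abs(corr[k]))
        if corr[best] == 0.0:
            break  # nothing left that the dictionary can explain
        coeffs[best] += corr[best]
        residual = [r - corr[best] * a for r, a in zip(residual, atoms[best])]
    reconstruction = [
        sum(c * atom[i] for c, atom in zip(coeffs, atoms))
        for i in range(len(signal))
    ]
    return reconstruction, residual


# Hypothetical interferer dictionary and a reference signal with leakage
# (the 0.5 component lies outside the span of the dictionary).
interferer_atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
reference = [2.0, 0.0, 0.5]
cleaned, leaked = matching_pursuit(reference, interferer_atoms, n_nonzero=1)
print(cleaned)  # [2.0, 0.0, 0.0]
print(leaked)   # [0.0, 0.0, 0.5]
```

In this toy case the leaked component is exactly orthogonal to the dictionary; with real signals the separation is only approximate and depends on how well the learned dictionary matches the interference.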