22 articles found
AN ANALYSIS OF ACOUSTIC CHARACTERISTICS OF CLEFT PALATE SPEECH WITH COMPUTERIZED SPEECH SIGNAL PROCESSING SYSTEM (Cited by: 1)
1
Authors: 李锦峰, 刘建华. Journal of Pharmaceutical Analysis (CAS), 1996, No. 2, pp. 162-165 (4 pages)
The acoustic characteristics of the Chinese vowels of 24 children with cleft palate and 10 normal control children were analyzed with a computerized speech signal processing system (CSSPS), and speech articulation was judged with the Glossary of Cleft Palate Speech (GCPS). The listening judgement showed that speech articulation differed significantly between the two groups (P<0.01). The objective quantitative measurement suggested that the formant pattern (FP) of vowels in children with cleft palate differed from that of normal control children, except for the vowel [a] (P<0.05). The acoustic vowel graph of the Chinese vowels, which directly demonstrates the relationship between vocal space and speech perception, was plotted from the first formant frequency (F1) and the second formant frequency (F2). The authors conclude that the values of F1 and F2 point to an upward and backward tongue movement to close the cleft, which reflects the vocal transmission characteristics of cleft palate speech.
Keywords: cleft palate speech; the Chinese vowels; the formant pattern; the speech articulation; computerized speech signal processing system
Audio-visual keyword transformer for unconstrained sentence-level keyword spotting
2
Authors: Yidi Li, Jiale Ren, Yawei Wang, Guoquan Wang, Xia Li, Hong Liu. CAAI Transactions on Intelligence Technology (SCIE, EI), 2024, No. 1, pp. 142-152 (11 pages)
As one of the most effective methods to improve the accuracy and robustness of speech tasks, the audio-visual fusion approach has recently been introduced into the field of Keyword Spotting (KWS). However, existing audio-visual keyword spotting models are limited to detecting isolated words, while keyword spotting for unconstrained speech is still a challenging problem. To this end, an Audio-Visual Keyword Transformer (AVKT) network is proposed to spot keywords in unconstrained video clips. The authors present a transformer classifier with learnable CLS tokens to extract distinctive keyword features from the variable-length audio and visual inputs. The outputs of the audio and visual branches are combined in a decision fusion module. Just as humans can easily notice whether a keyword appears in a sentence, the AVKT network can detect whether a video clip with a spoken sentence contains a pre-specified keyword. Moreover, the position of the keyword is localised in the attention map without additional position labels. Experimental results on the LRS2-KWS dataset and the newly collected PKU-KWS dataset show that the accuracy of AVKT exceeded 99% in clean scenes and 85% in extremely noisy conditions. The code is available at https://github.com/jialeren/AVKT.
Keywords: artificial intelligence; multimodal approaches; natural language processing; neural network; speech processing
Enhanced Frequency-Domain Frost Algorithm Using Conjugate Gradient Techniques for Speech Enhancement (Cited by: 1)
3
Authors: Shengkui Zhao, Douglas L. Jones. Journal of Electronic Science and Technology (CAS), 2012, No. 2, pp. 158-162 (5 pages)
In this paper, the frequency-domain Frost algorithm is enhanced by using conjugate gradient techniques for speech enhancement. Unlike the non-adaptive approach of computing the optimum minimum variance distortionless response (MVDR) solution by inverting the correlation matrix, the Frost algorithm, which implements the stochastic constrained least mean square (LMS) algorithm, can adaptively converge to the MVDR solution in the mean-square sense, but with a very slow convergence rate. We propose a frequency-domain constrained conjugate gradient (FDCCG) algorithm to speed up the convergence. The devised FDCCG algorithm avoids the matrix inversion and exhibits fast convergence. Speech enhancement experiments, in which the target speech signal is corrupted by two and by five interfering speech signals, are carried out with a four-channel acoustic-vector-sensor (AVS) microphone array and show superior performance.
Keywords: Adaptive signal processing; convergence; correlation; speech enhancement; microphone arrays
Speech Encryption with Fractional Watermark
4
Authors: Yan Sun, Cun Zhu, Qi Cui. Computers, Materials & Continua (SCIE, EI), 2022, No. 10, pp. 1817-1825 (9 pages)
Research on the features of speech and image signals is carried out from two perspectives: the time domain and the frequency domain. Speech and image signals are non-stationary, so the Fourier Transform (FT) is not applied directly to the non-stationary signal. Once short-term stationary speech is obtained by windowing and framing, the subsequent processing of the signal is carried out with the Discrete Fourier Transform (DFT). The Fast Discrete Fourier Transform is a commonly used analysis method for speech and image signal processing in the frequency domain, but it has the problem of adjusting the window size for a desired resolution. The Fractional Fourier Transform, by contrast, offers both time-domain and frequency-domain processing capabilities. This paper performs global speech encryption by combining speech with an image processed by the Fractional Fourier Transform. A watermark image processed by the fractional transform is embedded in the speech signal, and the embedded watermark has the effect of rotation and superposition, which improves the security of the speech. The results show that the proposed speech encryption method achieves a higher security level through the Fractional Fourier Transform. The technique is easy to extend to practical applications.
Keywords: Fractional Fourier Transform; watermark; speech signal processing; image processing
Speech-Music-Noise Discrimination in Sound Indexing of Multimedia Documents
5
Authors: Lamia Bouafif, Noureddine Ellouze. Sound & Vibration, 2018, No. 6, pp. 2-10 (9 pages)
Sound indexing and segmentation of digital documents, especially on the internet and in digital libraries, are very useful for simplifying and accelerating multimedia document retrieval. One can imagine extracting multimedia files not only by keywords but also by the semantic content of speech. The main difficulty of this operation is the parameterization and modelling of the sound track and the discrimination of the speech, music and noise segments. In this paper, we present a Speech/Music/Noise indexing interface designed for audio discrimination in multimedia documents. The program uses a statistical method based on ANN and HMM classifiers. After pre-emphasis and segmentation, the audio segments are analysed by the cepstral acoustic analysis method. The developed system was evaluated on a database of music songs with Arabic speech segments under several noisy environments.
Keywords: speech processing; audio indexing; training and recognition
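The pre-emphasis and segmentation step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a first-order pre-emphasis filter, y[n] = x[n] - alpha * x[n-1], and fixed-length overlapping frames. The function names, the default coefficient 0.97, and the frame sizes are illustrative assumptions, not values taken from the paper.

```python
def pre_emphasis(signal, alpha=0.97):
    """First-order high-pass filter: y[0] = x[0], y[n] = x[n] - alpha * x[n-1].

    alpha=0.97 is a commonly used default and an assumption here.
    """
    if not signal:
        return []
    out = [signal[0]]
    for n in range(1, len(signal)):
        out.append(signal[n] - alpha * signal[n - 1])
    return out


def frame_signal(signal, frame_len, hop):
    """Split a signal into overlapping frames; an incomplete tail is dropped."""
    if frame_len <= 0 or hop <= 0:
        raise ValueError("frame_len and hop must be positive")
    last_start = len(signal) - frame_len
    return [signal[s:s + frame_len] for s in range(0, last_start + 1, hop)]
```

Each frame returned by `frame_signal(pre_emphasis(x), frame_len, hop)` would then go to whatever cepstral analysis follows; that stage is not sketched here.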
Enhancing Parkinson’s Disease Diagnosis Accuracy Through Speech Signal Algorithm Modeling
6
Authors: Omar M. El-Habbak, Abdelrahman M. Abdelalim, Nour H. Mohamed, Habiba M. Abd-Elaty, Mostafa A. Hammouda, Yasmeen Y. Mohamed, Mohanad A. Taifor, Ali W. Mohamed. Computers, Materials & Continua (SCIE, EI), 2022, No. 2, pp. 2953-2969 (17 pages)
Parkinson’s disease (PD), one of whose symptoms is dysphonia, is a prevalent neurodegenerative disease. The use of outdated diagnosis techniques, which yield inaccurate and unreliable results, continues to represent an obstacle in early-stage detection and diagnosis for clinical professionals in the medical field. To solve this issue, the study proposes using machine learning and deep learning models to analyze processed speech signals of patients’ voice recordings. Datasets of these processed speech signals were obtained and experimented on with random forest and logistic regression classifiers. Results were highly successful, with 90% accuracy produced by the random forest classifier and 81.5% by the logistic regression classifier. Furthermore, a deep neural network was implemented to investigate whether such a variation in method could add to the findings. It proved effective, as the neural network yielded an accuracy of nearly 92%. Such results suggest that it is possible to accurately diagnose early-stage PD merely by testing patients’ voices. This research calls for a revolutionary diagnostic approach in decision support systems, and is the first step in a market-wide implementation of healthcare software dedicated to aiding clinicians in the early diagnosis of PD.
Keywords: Early diagnosis; logistic regression; neural network; Parkinson’s disease; random forest; speech signal processing algorithms
BLIND SPEECH SEPARATION FOR ROBOTS WITH INTELLIGENT HUMAN-MACHINE INTERACTION
7
Authors: Huang Yulei, Ding Zhizhong, Dai Lirong, Chen Xiaoping. Journal of Electronics (China), 2012, No. 3, pp. 286-293 (8 pages)
The speech recognition rate deteriorates greatly in human-machine interaction when the speaker's speech mixes with a bystander's voice. This paper proposes a time-frequency approach to Blind Source Separation (BSS) for intelligent Human-Machine Interaction (HMI). The main idea of the algorithm is to simultaneously diagonalize the correlation matrices of the pre-whitened signals at different time delays for every frequency bin in the time-frequency domain. The proposed method has two merits: (1) fast convergence speed; (2) high signal-to-interference ratio of the separated signals. Numerical evaluations are used to compare the performance of the proposed algorithm with two other deconvolution algorithms. An efficient algorithm to resolve permutation ambiguity is also proposed in this paper. The proposed algorithm saves more than 10% of computational time with properly selected parameters and achieves good performance for both simulated convolutive mixtures and speech recorded in real rooms.
Keywords: Blind Source Separation (BSS); Blind deconvolution; speech signal processing; Human-machine interaction; Simultaneous diagonalization
Analysis of Deaf Speakers’ Speech Signal for Understanding the Acoustic Characteristics by Territory Specific Utterances
8
Authors: Nirmaladevi Jaganathan, Bommannaraja Kanagaraj. Circuits and Systems, 2016, No. 8, pp. 1709-1721 (13 pages)
An important concern for the deaf community is the partial or total inability to hear. This may affect the development of language during childhood, which limits their habitual existence. Consequently, to assist such deaf speakers through an assistive mechanism, an effort has been made to understand the acoustic characteristics of deaf speakers by evaluating territory-specific utterances. Speech signals are acquired from 32 normal and 32 deaf speakers uttering ten words of Tamil, a native Indian language. Speech parameters such as pitch, formants, signal-to-noise ratio, energy, intensity, jitter and shimmer are analyzed. The results show that the acoustic characteristics of deaf speakers differ significantly and that their quantitative measures exceed those of the normal speakers for the words considered. The study also reveals that the informative part of speech in normal and deaf speakers may be identified using the acoustic features. In addition, these attributes may be used for differential correction of deaf speakers’ speech signals, helping listeners to understand the conveyed information.
Keywords: Deaf Speaker; Hard of Hearing; Deaf Speech Processing; Assistive Mechanism for Deaf Speaker; Speech Correction; Speech Signal Processing
On‐device audio‐visual multi‐person wake word spotting
9
Authors: Yidi Li, Guoquan Wang, Zhan Chen, Hao Tang, Hong Liu. CAAI Transactions on Intelligence Technology (SCIE, EI), 2023, No. 4, pp. 1578-1589 (12 pages)
Audio‐visual wake word spotting is a challenging multi‐modal task that exploits visual information from lip motion patterns to supplement acoustic speech and improve overall detection performance. However, most audio‐visual wake word spotting models are only suitable for simple single‐speaker scenarios and have high computational complexity. Further development is hindered by complex multi‐person scenarios and computational limitations in mobile environments. In this paper, a novel audio‐visual model is proposed for on‐device multi‐person wake word spotting. Firstly, an attention‐based audio‐visual voice activity detection module is presented, which generates an attention score matrix of audio and visual representations to derive the active speaker representation. Secondly, knowledge distillation is introduced to transfer knowledge from the large model to the on‐device model and so control the model size. Moreover, a new audio‐visual dataset, PKU‐KWS, is collected for sentence‐level multi‐person wake word spotting. Experimental results on the PKU‐KWS dataset show that this approach outperforms the previous state‐of‐the‐art methods.
Keywords: audio‐visual fusion; human‐computer interfacing; speech processing
Multisource localization based on angle distribution of time-frequency points using an FOA microphone
10
Authors: Liang Tao, Maoshen Jia, Lu Li, Jing Wang, Yang Xiang. CAAI Transactions on Intelligence Technology (SCIE, EI), 2023, No. 3, pp. 807-823 (17 pages)
Multisource localization occupies an important position in the field of acoustic signal processing and is widely applied in scenarios such as human‐machine interaction and spatial acoustic parameter acquisition. The direction‐of‐arrival (DOA) of a sound source is useful for rendering spatial sound in the audio metaverse. A multisource localization method for reverberant environments is proposed, based on the angle distribution of time-frequency (TF) points and using a first‐order ambisonics (FOA) microphone. The method is implemented in three steps. 1) By exploring the angle distribution of TF points, a single‐source zone (SSZ) detection method is proposed that uses a standard deviation‐based measure, which reveals the degree of convergence of TF point angles in a zone. 2) To reduce the effect of outliers on localization, an outlier removal method is designed to remove the TF points whose angles are far from the real DOAs, where the median angle of each detected zone is adopted to construct the outlier set. 3) DOA estimates of multiple sources are obtained by postprocessing the angle histogram. Experimental results in both simulated and real scenarios verify the effectiveness of the proposed method in a reverberant environment and show that it outperforms reference methods.
Keywords: signal processing; speech processing
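Step 1 of the abstract above, a standard deviation-based test of whether the TF-point angles in a zone converge on one direction, can be sketched as follows. This is an illustration under stated assumptions, not the authors' measure: it uses the circular standard deviation so that angles near 0 and 360 degrees are treated as close, and the function names and the 10-degree threshold are hypothetical.

```python
import math


def circular_std_deg(angles_deg):
    """Circular standard deviation, in degrees, of angles given in degrees."""
    if not angles_deg:
        raise ValueError("need at least one angle")
    n = len(angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg) / n
    s = sum(math.sin(math.radians(a)) for a in angles_deg) / n
    # Mean resultant length in [0, 1]; clamp to guard against rounding above 1.
    r = min(math.hypot(c, s), 1.0)
    if r < 1e-12:
        return math.inf  # angles cancel out: no dominant direction
    return math.degrees(math.sqrt(-2.0 * math.log(r)))


def is_single_source_zone(angles_deg, threshold_deg=10.0):
    """Flag a zone as single-source when its angles are tightly clustered.

    threshold_deg is a hypothetical tuning value, not taken from the paper.
    """
    return circular_std_deg(angles_deg) < threshold_deg
```

A zone whose TF-point angles cluster around one direction passes the test, while a zone with angles spread around the circle does not.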
Recent Progresses in Deep Learning Based Acoustic Models (Cited by: 9)
11
Authors: Dong Yu, Jinyu Li. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2017, No. 3, pp. 396-409 (14 pages)
In this paper, we summarize recent progress in deep learning based acoustic models and the motivation and insights behind the surveyed techniques. We first discuss models such as recurrent neural networks (RNNs) and convolutional neural networks (CNNs) that can effectively exploit variable-length contextual information, and their various combinations with other models. We then describe models that are optimized end-to-end, with emphasis on feature representations learned jointly with the rest of the system, the connectionist temporal classification (CTC) criterion, and the attention-based sequence-to-sequence translation model. We further illustrate robustness issues in speech recognition systems, and discuss acoustic model adaptation, speech enhancement and separation, and robust training strategies. We also cover modeling techniques that lead to more efficient decoding and discuss possible future directions in acoustic model research.
Keywords: Attention model; convolutional neural network (CNN); connectionist temporal classification (CTC); deep learning (DL); long short-term memory (LSTM); permutation invariant training; speech adaptation; speech processing; speech recognition; speech separation
An enhanced relative spectral processing of speech (Cited by: 2)
12
Authors: ZHEN Bin, WU Xihong, LIU Zhimin, CHI Huisheng (Center for Information Science, Peking University, Beijing 100871). Chinese Journal of Acoustics, 2002, No. 1, pp. 86-96 (11 pages)
An enhanced relative spectral (E_RASTA) technique for speech and speaker recognition is proposed. The new method consists of classical RASTA filtering in the logarithmic spectral domain followed by another additive RASTA filtering in the same domain. In this manner, both channel distortion and additive noise are removed effectively. In speaker identification and speech recognition experiments on the TI46 database, E_RASTA performs as well as or better than the J_RASTA method in both tasks. E_RASTA needs neither the speech SNR estimate that J_RASTA requires to determine the optimal value of J, nor information on how the speech is degraded. The choice of the E_RASTA filter also indicates that the low temporal modulation components in speech can degrade the performance of both recognition tasks. In addition, speaker recognition needs a narrower temporal modulation frequency band than speech recognition.
Keywords: An enhanced relative spectral processing of speech; MFCC
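The classical RASTA filtering step named in the abstract above can be sketched as a band-pass filter applied along time to one log-spectral band. The sketch assumes the commonly cited transfer function H(z) = 0.1 * (2 + z^-1 - z^-3 - 2 z^-4) / (1 - 0.98 z^-1) in causal form. The pole value (some implementations use 0.94) and the function name are assumptions, and this is the classical filter only, not the E_RASTA method of the paper.

```python
def rasta_filter(log_trajectory, pole=0.98):
    """Apply the classical RASTA band-pass filter to one log-spectral band.

    log_trajectory: values of a single band over successive frames.
    The numerator sums to zero, so a constant offset in the log domain
    (a fixed channel gain) decays to zero in the output.
    """
    numer = [0.2, 0.1, 0.0, -0.1, -0.2]  # 0.1 * [2, 1, 0, -1, -2]
    out = []
    prev_y = 0.0
    for n in range(len(log_trajectory)):
        # FIR part: weighted sum of the current and four previous inputs.
        fir = sum(b * log_trajectory[n - k]
                  for k, b in enumerate(numer) if n - k >= 0)
        # IIR part: single real pole acting as a leaky integrator.
        prev_y = fir + pole * prev_y
        out.append(prev_y)
    return out
```

Feeding a constant trajectory through the filter shows the intended behaviour: after the start-up transient, the output decays toward zero, which is how a stationary channel term is suppressed.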
The laboratory of acoustics, speech and signal processing at the institute of acoustics (Cited by: 1)
13
Chinese Journal of Acoustics, 1990, No. 4, pp. 372-374 (3 pages)
The Laboratory of Acoustics, Speech and Signal Processing (LASSP), the unique and superior national key laboratory of ASSP in China, has been founded at the Inst. of Acoustics, Academia Sinica, Beijing, PRC. After three years of effort, the construction of the LASSP has been completed successfully, and a capability for performing frontier research projects in the fundamental theory and applied technology of sound fields and acoustic signal processing has been formed. A flexible and complete experimental acoustic signal processing system has been set up in the LASSP. With the remarkable advantage of real-time signal processing and resource sharing, a wide range of research projects in the field of ASSP can be conducted in the laboratory. The Signal Processing Center of the LASSP is well equipped with many computer research facilities including the
Keywords: ASSP; In; WELL; The laboratory of acoustics, speech and signal processing at the institute of acoustics
ON USING NON-LINEAR CANONICAL CORRELATION ANALYSIS FOR VOICE CONVERSION BASED ON GAUSSIAN MIXTURE MODEL
14
Authors: Jian Zhihua, Yang Zhen. Journal of Electronics (China), 2010, No. 1, pp. 1-7 (7 pages)
A voice conversion algorithm aims to provide a high level of similarity to the target voice with an acceptable level of quality. The main objective of this paper was to build a nonlinear relationship between the acoustic feature parameters of the source and target speakers using Non-Linear Canonical Correlation Analysis (NLCCA) based on a joint Gaussian mixture model. Speaker individuality transformation was achieved mainly by altering vocal tract characteristics represented by Line Spectral Frequencies (LSF). To obtain transformed speech that sounded more like the target voice, prosody modification was incorporated through residual prediction. Both objective and subjective evaluations were conducted. The experimental results demonstrated that the proposed algorithm was effective and outperformed the conventional conversion method based on Minimum Mean Square Error (MMSE) estimation.
Keywords: speech processing; Voice conversion; Non-Linear Canonical Correlation Analysis (NLCCA); Gaussian Mixture Model (GMM)
Data Intelligent Low Power High Performance TCAM for IP-Address Lookup Table
15
Authors: K. Mathan, T. Ravichandran. Circuits and Systems, 2016, No. 11, pp. 3734-3745 (12 pages)
This paper presents current research in the low-power Very Large Scale Integration (VLSI) domain. Low power has become an increasingly sought-after research topic in the electronics industry, and power dissipation is the most important consideration when designing a VLSI chip. Today almost all high-speed switching devices include Ternary Content Addressable Memory (TCAM) as one of their most important features. A device that consumes less power becomes more reliable and works more efficiently. Complementary Metal Oxide Semiconductor (CMOS) technology is best known for low-power devices. This paper aims at designing a router application device that consumes less power and works more efficiently. Various strategies, methodologies and power management techniques for low-power circuits and systems are discussed. The research also identifies the challenges that may be met when designing low-power, high-performance circuits. This work aims at developing a Data Aware AND-type match line architecture for TCAM. A TCAM macro of 256 × 128 was designed using the Cadence Advanced Development Environment (ADE) with a 90 nm technology file from Taiwan Semiconductor Manufacturing Company (TSMC). The results show that the proposed Data Aware architecture provides around 35% speed and 45% power improvement over the existing architecture.
Keywords: Low Power; TCAM; Switching Power; Match Line; Searchline; Data Aware and speech processing
4th National Conference on Speech, Image, Communication, and Signal Processing, held in Beijing, 25-27 October 1989
16
Author: ZHANG Jialu. Chinese Journal of Acoustics, 1990, No. 2, p. 183 (1 page)
The 4th National Conference on Speech, Image, Communication and Signal Processing, which was sponsored by the Institute of Speech, Hearing, and Music Acoustics, Acoustical Society of China and the Institute of Signal Processing, Electronic Society of China, was held on 25-27 October 1989 at Beijing Institute of Post and Telecommunication. The conference drew a registration of 150 from different places in the country, which made it the largest conference in the last eight years. The president of the Institute of Speech, Hearing, and Music Acoustics, ASC, professor ZHANG Jialu, made an opening speech at the opening session, and the honorary president of the Acoustical Society of China, professor MAA Dah-You, and the president of
Keywords: October 1989; National Conference on Speech, Image, Communication and Signal Processing; held in Beijing; 25
CELP-Based Implementation of the GSM Half-Rate Speech Codes
17
Authors: Zhang Haiyan, Zhou Yuechen. The Journal of China Universities of Posts and Telecommunications (EI, CSCD), 1998, No. 2, pp. 72-75 (4 pages)
This paper presents the real-time implementation of a 6.75 kb/s speech codec for the GSM half-rate digital cellular system based on CELP [1]. Logarithmic Area Ratio (LAR) [2] quantization for the short-term parameters and an efficient adaptive codebook search are used. An overlapping center-clipping codebook and a formula for fast searching are proposed. The MOS of the synthesized speech is over 3.5.
Keywords: mobile communication; speech processing; predictive technology
LSB steganalysis of speech data based on distance measure and ML decision
18
Authors: DENG Zong-yuan, SHAO Xi, YANG Zhen. The Journal of China Universities of Posts and Telecommunications (EI, CSCD), 2007, No. 3, pp. 103-107 (5 pages)
Steganalysis can be used to classify whether or not an object contains hidden information. This article presents a novel approach to detecting the presence of least significant bit (LSB) steganographic messages in a voice secure communication system. A distance measure, which has been shown by analysis of variance (ANOVA) to be sensitive to LSB steganography, is used to estimate the difference between the host signal and the stego signal. A maximum likelihood (ML) decision is then combined with it to form the classifier. Statistical experiments show that the proposed approach achieves high accuracy with low computational complexity.
Keywords: speech signal processing; LSB steganography; steganalysis; ML decision
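For readers unfamiliar with the embedding that this kind of steganalysis targets, the following minimal sketch shows LSB steganography on integer speech samples. It is not the detector from the paper: the distance measure and the ML decision are not specified in the abstract, so `mean_abs_difference` is only a placeholder for a host-versus-stego comparison, and all function names are hypothetical.

```python
def embed_lsb(samples, bits):
    """Hide one bit in the least significant bit of each leading sample."""
    if len(bits) > len(samples):
        raise ValueError("not enough samples to carry the message")
    stego = list(samples)
    for i, bit in enumerate(bits):
        if bit not in (0, 1):
            raise ValueError("bits must be 0 or 1")
        # Clear the LSB, then set it to the message bit.
        stego[i] = (stego[i] & ~1) | bit
    return stego


def extract_lsb(samples, n_bits):
    """Read back the first n_bits hidden bits."""
    return [s & 1 for s in samples[:n_bits]]


def mean_abs_difference(host, stego):
    """Placeholder distance between host and stego signals (an assumption)."""
    return sum(abs(h - s) for h, s in zip(host, stego)) / len(host)
```

Because each sample changes by at most one quantization step, the stego signal is nearly inaudible as a modification, which is why a statistical measure rather than listening is needed to detect it.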
Aberrant auditory system and its developmental implications for autism (Cited by: 4)
19
Authors: Luodi Yu, Suiping Wang. Science China (Life Sciences) (SCIE, CAS, CSCD), 2021, No. 6, pp. 861-878 (18 pages)
Most infants who are later diagnosed with autism show delayed speech and language and/or an atypical language profile. There is a large body of research on abnormal speech and language in children with autism. However, auditory development has been relatively under-investigated in autism research, despite its inextricable relationship with language development and despite researchers' ability to detect abnormalities in brain development and behavior in early infancy. In this review, we synthesize research on auditory processing from the prenatal period through infancy and childhood in typically developing children, children at high risk for autism, and children diagnosed with autism. We conclude that there are clear neurobiological and behavioral links between abnormal auditory development and the deficits in social communication seen in autism. We then offer perspectives on the need for a systematic characterization of early auditory development in autism, and identify questions to be addressed in future research on the development of autism.
Keywords: autism; auditory development; auditory system; early brain development; language development; infant learning; speech processing
One-Step Interpolation Predictive Vector Quantization of LSP Parameters (Cited by: 1)
20
Authors: Bao Changchun (Department of Telecommunication Engineering, Changchun Posts and Telecommunications Institute, Changchun 130012, P.R. China), Dai Yisong (Department of Electronic Engineering, Jilin University of Technology, Changchun 130022, P.R. China). The Journal of China Universities of Posts and Telecommunications (EI, CSCD), 1996, No. 1, pp. 21-26 (6 pages)
A new spectral coding approach, one-step interpolation predictive vector quantization of Line Spectrum Pair (LSP) parameters, is proposed. Exploiting the strong correlation between successive LSP vectors and the relatively slow variation of the short-term spectrum of the speech waveform, predictive vector quantization and DPCM techniques are used to reduce the variance of the parameters to be quantized. An 18 bit/frame split vector quantization scheme is designed to quantize the prediction residual. An average spectral distortion of 1.178 dB can be achieved when the frame period is 30 ms.
Keywords: speech processing; line spectrum pair; vector quantization