A new two-step framework is proposed for image segmentation. In the first step, the gray-value distribution of the given image is reshaped to have larger inter-class variance and smaller intra-class variance. In the second step, discriminant-based or clustering-based methods are applied to the reshaped distribution. The study focuses on a typical clustering method, the Gaussian mixture model (GMM), and its variant to demonstrate the feasibility of the framework. Because the first step is independent of the second, it can be integrated into pixel-based and histogram-based methods to improve their segmentation quality. Experiments on artificial and real images show that the framework achieves effective and robust segmentation results.
Since the joint probabilistic data association (JPDA) algorithm suffers from a computational explosion as the number of targets increases, a multi-target tracking algorithm based on Gaussian mixture model (GMM) clustering is proposed. The algorithm clusters the measurements, and the association matrix between measurements and tracks is constructed from the posterior probabilities. Compared with the traditional data association algorithm, this algorithm has better tracking performance and lower computational complexity. Simulation results demonstrate the effectiveness of the proposed algorithm.
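To make the idea of a posterior-based association matrix concrete, here is a minimal Python sketch. It is not the authors' algorithm: it assumes 1-D measurements, treats each track as one Gaussian component with a known predicted mean and variance, and the names `gaussian_pdf` and `association_matrix` are hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def association_matrix(measurements, track_means, track_vars, weights):
    """Entry [i][j] is the posterior probability that measurement i came from track j."""
    matrix = []
    for z in measurements:
        # Joint weight of each component (track) for this measurement.
        joint = [w * gaussian_pdf(z, m, v)
                 for w, m, v in zip(weights, track_means, track_vars)]
        total = sum(joint)
        # Normalizing gives the GMM posterior (responsibility) per track.
        matrix.append([p / total for p in joint])
    return matrix


# Hypothetical example: two tracks predicted at 0.0 and 10.0, three measurements.
A = association_matrix([0.5, 9.0, 5.0], [0.0, 10.0], [1.0, 1.0], [0.5, 0.5])
for row in A:
    print([round(p, 3) for p in row])
```

Each row sums to one, so a measurement lying halfway between two equally weighted tracks is shared between them instead of being forced into a hard assignment.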
A Gaussian mixture model (GMM) based adaptive image restoration method is proposed in this paper. The feature vectors of pixels are selected and extracted. Pixels are clustered into smooth, edge, or detail-texture regions according to a variance-sum criterion function of the feature vectors. The parameters of the GMM are then calculated from the statistical information of these feature vectors. The GMM predicts the regularization parameter for each pixel adaptively. A Hopfield neural network (Hopfield-NN) is used to optimize the objective function of image restoration, and the network weight matrix is updated by the output of the GMM. Because a GMM is used, the regularization parameters share properties of the different kinds of regions. In addition, the regularization parameters differ from pixel to pixel. The GMM-based regularization method is consistent with the human visual system and has strong generalization capability. Compared with non-adaptive and some adaptive image restoration algorithms, experimental results show that the proposed algorithm obtains better restored images.
This paper presents a new online incremental training algorithm for the Gaussian mixture model (GMM), which performs expectation-maximization (EM) training incrementally to update the GMM parameters online, sample by sample, instead of waiting for a sufficiently large block of data before training starts, as in the traditional EM procedure. The proposed method extends the split-and-merge EM procedure, so it is inherently capable of escaping from local maxima and reducing the chance of singularities. The algorithm is optimized for speech processing applications. Experiments on synthetic data show the advantage and efficiency of the new method, and the results of a speech processing task confirm the improvement in system performance.
A voice conversion algorithm aims to provide a high level of similarity to the target voice with an acceptable level of quality. The main objective of this paper was to build a nonlinear relationship between the parameters of the acoustic features of the source and target speakers using Non-Linear Canonical Correlation Analysis (NLCCA) based on a joint Gaussian mixture model. Speaker individuality transformation was achieved mainly by altering vocal tract characteristics represented by Line Spectral Frequencies (LSF). To obtain transformed speech that sounded more like the target voice, prosody modification was applied through residual prediction. Both objective and subjective evaluations were conducted. The experimental results demonstrated that the proposed algorithm was effective and outperformed the conventional conversion method based on Minimum Mean Square Error (MMSE) estimation.
Intrusion detection is the process of examining information about system activities or data to detect malicious behavior or unauthorized activity. Most intrusion detection systems (IDS) implement the K-means clustering technique because of its linear complexity and fast computation. Nonetheless, its naive use of the mean data value as the cluster center is a major drawback. Two circular clusters with different radii may be centered at the same mean, and K-means cannot handle this case because the cluster means are nearly identical. It also fails when the clusters are not spherical. To overcome this issue, a new hybrid model that integrates expectation-maximization (EM) clustering using a Gaussian mixture model (GMM) with a naive Bayes classifier is proposed. In this model, the GMM gives more flexibility than K-means in terms of cluster covariance. It also uses probability functions and soft clustering, so a single data point can belong to multiple clusters. In a GMM, the cluster form is defined by two parameters: the mean and the standard deviation. With these two parameters, a cluster can take any kind of elliptical shape. EM-GMM is used to cluster data into the corresponding category based on data activity.
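The point about clusters that share a mean but differ in spread can be illustrated with a short Python sketch. It is a plain two-component 1-D EM fit on synthetic data, not the hybrid EM-GMM and naive Bayes model described above. The helper names (`normal_pdf`, `responsibility`, `fit_gmm_1d`), the initialization, and the data are assumptions made for illustration.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def responsibility(x, weights, means, variances):
    """Posterior probability of each component for the point x (soft assignment)."""
    joint = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
    total = sum(joint)
    return [p / total for p in joint]


def fit_gmm_1d(data, n_iter=100):
    """Plain EM for a two-component 1-D GMM, started from a shared mean."""
    n = len(data)
    mean_all = sum(data) / n
    var_all = sum((x - mean_all) ** 2 for x in data) / n
    weights = [0.5, 0.5]
    means = [mean_all, mean_all]
    variances = [0.25 * var_all, 4.0 * var_all]  # one narrow, one wide guess
    for _ in range(n_iter):
        # E-step: soft assignment of every point to both components.
        resp = [responsibility(x, weights, means, variances) for x in data]
        # M-step: re-estimate weight, mean and variance of each component.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var_k = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var_k, 1e-6)  # floor guards against collapse
    return weights, means, variances


# Synthetic data: two clusters centered at 0 with standard deviations 0.5 and 5.
rng = random.Random(0)
data = [rng.gauss(0.0, 0.5) for _ in range(200)] + [rng.gauss(0.0, 5.0) for _ in range(200)]

weights, means, variances = fit_gmm_1d(data)
narrow = min(range(2), key=lambda k: variances[k])
print("estimated standard deviations:", [round(math.sqrt(v), 2) for v in variances])
print("P(narrow | x=0.1):", round(responsibility(0.1, weights, means, variances)[narrow], 3))
print("P(narrow | x=8.0):", round(responsibility(8.0, weights, means, variances)[narrow], 3))
```

A nearest-mean rule has nothing to separate here because both centers coincide, whereas the fitted variances let the mixture attribute points near the center mostly to the narrow cluster and distant points to the wide one.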
Wireless sensor network (WSN) positioning performs well indoors, so it has received extensive attention in the positioning field. Non-line-of-sight (NLOS) propagation is a primary challenge in complex indoor environments. In this paper, a robust localization algorithm based on a Gaussian mixture model and polynomial fitting is proposed to solve the problem of NLOS error. First, fitted polynomials are used to predict the measured values. The residuals between predicted and measured values are clustered by a Gaussian mixture model (GMM). The LOS and NLOS probabilities are calculated from the cluster centers. The measured values are filtered in turn by a Kalman filter (KF), a variable-parameter unscented Kalman filter (VPUKF), and a variable-parameter particle filter (VPPF). The distance value processed by KF and VPUKF and the distance value processed by KF, VPUKF, and VPPF are combined according to these probabilities. Finally, the maximum likelihood method is used to estimate the position coordinates. Simulation comparisons show that the proposed algorithm has better positioning accuracy than the comparison algorithms considered in the paper and is robust in strongly NLOS environments.
Automatic image annotation has been an active topic of research in computer vision and pattern recognition for decades. A two-stage automatic image annotation method based on a Gaussian mixture model (GMM) and a random walk model (abbreviated as GMM-RW) is presented. First, a GMM fitted by the rival penalized expectation maximization (RPEM) algorithm is employed to estimate the posterior probabilities of each annotation keyword. Subsequently, a random walk process over the constructed label similarity graph is used to further mine the potential correlations among the candidate annotations and obtain refined results, which plays a crucial role in semantic-based image retrieval. The contributions of this work are threefold. First, the GMM is used to capture the initial semantic annotations; in particular, the RPEM algorithm is used to train the model and can determine the number of components in the GMM automatically. Second, a label similarity graph is constructed by a weighted linear combination of label similarity and the visual similarity of images associated with the corresponding labels, which helps avoid polysemy and synonymy during the image annotation process. Third, the random walk is run over the constructed label graph to further refine the candidate set of annotations generated by the GMM. Experiments on the standard Corel5k dataset demonstrate that GMM-RW is significantly better than several state-of-the-art methods in both effectiveness and efficiency for automatic image annotation.
In this paper, an efficient model of palmprint identification is presented based on subspace density estimation using a Gaussian Mixture Model (GMM). Because only a few training samples are available for each person, we use intrapersonal palmprint deformations to train a global GMM instead of modeling a GMM for every class. To reduce the dimension of such variations while preserving the density function of the sample space, Principal Component Analysis (PCA) is used to find the principal differences and form the Intrapersonal Deformation Subspace (IDS). After training the GMM with the Expectation Maximization (EM) algorithm in the IDS, a maximum likelihood strategy is used to identify a person. Experimental results demonstrate the advantage of our method over the traditional PCA method and a single-Gaussian strategy.
Timely and accurate detection of abnormal aircraft trajectories is critical to improving flight safety. However, existing anomaly detection methods based on machine learning cannot characterize the features of aircraft trajectories well. Anomaly detection accuracy remains low because of the high dimensionality, heterogeneity, and temporality of flight trajectory data. To this end, this paper proposes an abnormal trajectory detection method based on the deep mixture density network (DMDN) to detect flights with unusual data patterns and evaluate flight trajectory safety. The technique consists of two components: a deep long short-term memory (LSTM) network that encodes the features of flight trajectories effectively, and a Gaussian mixture model (GMM) that parameterizes the statistical properties of the flight trajectory. Experimental results on the terminal airspace of Guangzhou Baiyun International Airport show that the proposed method can effectively capture the statistical patterns of aircraft trajectories. The model can detect abnormal flights with elevated risks, and its performance is superior to two mainstream methods. The proposed model can be used as a decision-support tool for air traffic controllers.
Emotion recognition from speech is an important field of research in human-computer interaction. In this letter, the framework of Support Vector Machines (SVM) with a Gaussian Mixture Model (GMM) supervector is introduced for emotional speech recognition. Because variance is important in reflecting the distribution of speech, normalized mean vectors, which have the potential to exploit the information from the variance, are adopted to form the GMM supervector. Comparative experiments covering five aspects are conducted to study their effect on system performance. The experimental results, which indicate that the influence of the number of mixtures is strong while the influence of duration is weak, provide a basis for selecting the training set of the Universal Background Model (UBM).
A dynamic learning rate Gaussian mixture model (GMM) algorithm is proposed to deal with the slow adaptation of the GMM in moving object detection for outdoor surveillance, especially in the presence of sudden illumination changes. The GMM is widely used for detecting objects in complex scenes in intelligent monitoring systems. To solve the slow adaptation problem, a Gaussian mixture model is built for each pixel in the video frame, and the learning rate of the GMM is adjusted dynamically according to the scene change measured from the frame difference. Experiments show that the proposed method with an adaptive learning rate gives good results compared with a GMM with a fixed learning rate. The method was tested on a certain dataset, and tests under sudden natural light changes show that it has better accuracy and a lower false alarm rate.
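The effect of tying the learning rate to the frame difference can be sketched in a few lines of Python. This is only an illustration under strong assumptions: one Gaussian per pixel instead of a full mixture, a single pixel, and a hypothetical linear rule (`adaptive_alpha` with made-up constants) that maps the frame difference to the learning rate. The abstract does not specify the actual mapping.

```python
def adaptive_alpha(frame_diff, alpha_min=0.01, alpha_max=0.5, diff_scale=50.0):
    """Map the absolute frame difference to a learning rate (hypothetical linear rule)."""
    ratio = min(max(frame_diff / diff_scale, 0.0), 1.0)
    return alpha_min + (alpha_max - alpha_min) * ratio


def update_background(mean, var, pixel, alpha):
    """Running update of one Gaussian background component for a single pixel."""
    diff = pixel - mean
    new_mean = mean + alpha * diff
    new_var = (1.0 - alpha) * var + alpha * diff * diff
    return new_mean, new_var


def run(frames, dynamic):
    """Return the background mean after processing all frames of one pixel."""
    mean, var = float(frames[0]), 25.0
    prev = frames[0]
    for pixel in frames[1:]:
        # A fixed rate of 0.01 serves as the baseline.
        alpha = adaptive_alpha(abs(pixel - prev)) if dynamic else 0.01
        mean, var = update_background(mean, var, pixel, alpha)
        prev = pixel
    return mean


# Hypothetical pixel: stable at 100, then a sudden illumination jump to 160.
frames = [100] * 5 + [160] * 10
fixed_mean = run(frames, dynamic=False)
dynamic_mean = run(frames, dynamic=True)
print(round(fixed_mean, 2), round(dynamic_mean, 2))
```

With these assumed constants, the dynamic rate moves the background mean much closer to the new illumination level within the same number of frames, which is the behavior the abstract attributes to the adaptive scheme.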
This paper presents an improved voice conversion system based on Gaussian mixture models (GMMs) that changes the time scale of speech. The Speech Transformation and Representation using Adaptive Interpolation of weiGHTed spectrum (STRAIGHT) model is adopted to extract the spectrum features, and the GMMs are trained to generate the conversion function. The spectrum features of the source speech are converted by the conversion function. The time scale of speech is changed by extracting the converted features and adding them to the spectrum. The converted voice was evaluated by subjective and objective measurements. The results confirm that the transformed speech not only approximates the characteristics of the target speaker but is also more natural and more intelligible.
Three Bayesian-related approaches, namely variational Bayesian (VB), minimum message length (MML), and Bayesian Ying-Yang (BYY) harmony learning, have been applied to automatically determining an appropriate number of components while learning a Gaussian mixture model (GMM). This paper provides a comparative investigation of these approaches with not only a Jeffreys prior but also a conjugate Dirichlet-Normal-Wishart (DNW) prior on the GMM. In addition to adopting the existing algorithms either directly or with some modifications, the algorithm for VB with the Jeffreys prior and the algorithm for BYY with the DNW prior are developed in this paper to fill the gap. The performance of automatic model selection is evaluated through extensive experiments, with several empirical findings: 1) When priors are considered only on the mixing weights, each of the three approaches makes biased mistakes, while considering priors on all the parameters of the GMM reduces the bias of each approach and improves its performance. 2) When the Jeffreys prior is replaced by the DNW prior, all three approaches improve. Moreover, the Jeffreys prior makes MML slightly better than VB, while the DNW prior makes VB better than MML. 3) When the hyperparameters of the DNW prior are further optimized by each approach's own learning principle, BYY improves, while VB and MML deteriorate when there are too many free hyperparameters. In fact, VB and MML lack a good guide for optimizing the hyperparameters of the DNW prior. 4) BYY considerably outperforms both VB and MML for any type of prior, whether or not the hyperparameters are optimized. Unlike VB and MML, which rely on appropriate priors to perform model selection, BYY does not depend heavily on the type of prior. It has model selection ability even without priors, already performs very well with the Jeffreys prior, and improves incrementally when the Jeffreys prior is replaced by the DNW prior. Finally, all algorithms are applied to the Berkeley segmentation database of real-world images. Again, BYY considerably outperforms both VB and MML, especially in detecting objects of interest against a confusing background.
An improved speech absence probability estimation method using environmental noise classification was proposed for speech enhancement. A relevant noise estimation approach, known as the speech presence uncertainty tracking method, requires the a priori probability of speech absence, which is derived from the microphone input signal and the noise signal based on the estimated value of the a posteriori signal-to-noise ratio (SNR). To overcome this problem, first, the optimal values in terms of perceived speech quality are derived for a variety of noise types. Second, the estimated optimal values are assigned according to the noise type determined by a real-time noise classification algorithm based on the Gaussian mixture model (GMM). The proposed algorithm estimates the speech absence probability using the GMM-based noise classification algorithm to apply the optimal parameters for each noise type, unlike the conventional approach, which uses a fixed threshold and smoothing parameter. The performance of the proposed method was evaluated by objective tests, such as the perceptual evaluation of speech quality (PESQ) and a composite measure, and then by a subjective test, namely mean opinion scores (MOS), under various noise environments. The proposed method shows better results than existing methods.
In this paper, an evolutionary recursive Bayesian estimation algorithm is presented, which incorporates the latest observation through a new proposal distribution. The posterior state density is represented by a Gaussian mixture model that is recovered from the weighted particle set of the measurement update step by means of a weighted expectation-maximization algorithm. This step replaces the resampling stage needed by most particle filters and relieves the effect of sample impoverishment. A nonlinear tracking problem shows that this new approach outperforms other related particle filters.
This letter proposes an effective and robust speech feature extraction method based on statistical analysis of Pitch Frequency Distributions (PFD) for speaker identification. Compared with the conventional cepstrum, PFD is relatively insensitive to Additive White Gaussian Noise (AWGN), but it does not perform well for speaker identification, even in clean environments. To compensate for this shortcoming, PFD and the conventional cepstrum are combined to make the final decision, instead of taking only one kind of feature into account. Experimental results indicate that the hybrid approach gives a marked improvement for text-independent speaker identification in noisy environments corrupted by AWGN.
In this paper, we present a comparison of Khasi speech representations with four different spectral features and a novel extension towards the development of Khasi speech corpora. The four features are linear predictive coding (LPC), linear prediction cepstrum coefficient (LPCC), perceptual linear prediction (PLP), and Mel frequency cepstral coefficient (MFCC). Ten hours of speech data were used for training and three hours for testing. For each spectral feature, different hidden Markov model (HMM) based recognizers with variations in the number of HMM states and different Gaussian mixture models (GMMs) were built. The performance was evaluated using the word error rate (WER). The experimental results show that MFCC provides a better representation of Khasi speech than the other three spectral features.
Rain and snow seriously degrade outdoor video quality. In this work, a primary-secondary background model for the removal of rain and snow is built. First, we analyze the video noise and use a sliding-window sequence principal component analysis denoising algorithm to reduce white noise in the video. Next, we apply the Gaussian mixture model (GMM) to model the video and obtain a primary segmentation of all foreground objects. After that, we calculate the von Mises distribution of the velocity vectors and the ratio of the overlapped region with reference to the result of the primary segmentation, and extract the objects of interest. Finally, rain and snow streaks are inpainted using the background to improve the quality of the video data. Experiments show that the proposed method can effectively suppress noise and extract the targets of interest.
This paper studies a high-speed text-independent Automatic Speaker Recognition (ASR) algorithm based on a Gaussian Mixture Model (GMM) running on a multicore system. The high speed is achieved through a parallel implementation of the feature extraction and aggregation methods during the training and testing procedures. Shared-memory parallel programming techniques using both the OpenMP and PThreads libraries are developed to accelerate the code and improve the performance of the ASR algorithm. The experimental results show a speed-up of around 3.2 on a personal laptop with an Intel i5-6300HQ (2.3 GHz, four cores without hyper-threading, and 8 GB of RAM). In addition, 100% speaker recognition accuracy is achieved.
Funding (two-step image segmentation framework): Supported by the National Natural Science Foundation of China (60505004, 60773061).
Funding (GMM-clustering multi-target tracking): Supported by the National Natural Science Foundation of China (61771367) and the Science and Technology on Communication Networks Laboratory (HHS19641X003).
Funding (NLCCA-based voice conversion): Supported by the National High Technology Research and Development Program of China (863 Program, No. 2006AA010102).
Funding (WSN robust localization): Supported by the National Natural Science Foundation of China under Grants No. 62273083 and No. 61973069, and the Natural Science Foundation of Hebei Province under Grant No. F2020501012.
Funding (GMM-RW image annotation): Supported by the National Basic Research Program of China (No. 2013CB329502), the National Natural Science Foundation of China (No. 61202212), the Special Research Project of the Educational Department of Shaanxi Province of China (No. 15JK1038), and the Key Research Project of Baoji University of Arts and Sciences (No. ZK16047).
Abstract: In this paper, an efficient model of palmprint identification is presented based on subspace density estimation using a Gaussian Mixture Model (GMM). Because only a few training samples are available for each person, we use intrapersonal palmprint deformations to train a global GMM instead of modeling a GMM for every class. To reduce the dimension of such variations while preserving the density function of the sample space, Principal Component Analysis (PCA) is used to find the principal differences and form the Intrapersonal Deformation Subspace (IDS). After training the GMM with the Expectation Maximization (EM) algorithm in the IDS, a maximum likelihood strategy is used to identify a person. Experimental results demonstrate the advantage of our method over the traditional PCA method and the single-Gaussian strategy.
Funding: Supported in part by the National Natural Science Foundation of China (Nos. 62076126, 52075031) and the Postgraduate Research & Practice Innovation Program of Jiangsu Province (No. SJCX19_0013).
Abstract: Timely and accurate detection of abnormal aircraft trajectories is critical to improving flight safety. However, existing anomaly detection methods based on machine learning cannot characterize the features of aircraft trajectories well. Detection accuracy remains low because flight trajectory data are high-dimensional, heterogeneous, and temporal. To this end, this paper proposes an abnormal trajectory detection method based on the deep mixture density network (DMDN) to detect flights with unusual data patterns and evaluate flight trajectory safety. The technique consists of two components: a deep long short-term memory (LSTM) network that encodes features of flight trajectories effectively, and a Gaussian mixture model (GMM) that parameterizes the statistical properties of the flight trajectory. Experimental results on the terminal airspace of Guangzhou Baiyun International Airport show that the proposed method can effectively capture the statistical patterns of aircraft trajectories. The model can detect abnormal flights with elevated risks, and its performance is superior to that of two mainstream methods. The proposed model can be used as a decision-support tool for air traffic controllers.
Funding: Supported by the National Natural Science Foundation of China (No. 61105076), the Natural Science Foundation of Anhui Province of China (No. 11040606M127), and the Key Scientific and Technological Project of Anhui Province (No. 11010202192).
Abstract: Emotion recognition from speech is an important field of research in human-computer interaction. In this letter, the framework of Support Vector Machines (SVM) with a Gaussian Mixture Model (GMM) supervector is introduced for emotional speech recognition. Because variance is important in reflecting the distribution of speech, normalized mean vectors, which can exploit the information from the variance, are adopted to form the GMM supervector. Comparative experiments on five aspects are conducted to study their effects on system performance. The experimental results, which indicate that the number of mixtures has a strong influence while duration has a weak one, provide a basis for selecting the training set of the Universal Background Model (UBM).
Abstract: A dynamic learning rate Gaussian mixture model (GMM) algorithm is proposed to deal with the slow adaptation of the GMM in moving object detection for outdoor surveillance, especially in the presence of sudden illumination changes. The GMM is widely used for detecting objects in complex scenes in intelligent monitoring systems. To solve the adaptation problem, a Gaussian mixture model is built for each pixel in the video frame, and the learning rate of the GMM is dynamically adjusted according to the scene change measured from the frame difference. Experiments show that the proposed method with an adaptive learning rate gives better results than the GMM method with a fixed learning rate. In tests with sudden natural light changes, the method achieves better accuracy and a lower false alarm rate.
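The idea of tying the learning rate to the frame difference can be sketched as follows. The abstract does not give the actual update rule, so everything here is an assumption made for illustration: the linear interpolation between `base_rate` and `max_rate`, the threshold `diff_threshold`, the function names, and the use of a single running mean per pixel in place of a full per-pixel mixture.

```python
def changed_fraction(prev_frame, cur_frame, diff_threshold=25):
    """Fraction of pixels whose gray value changed by more than the threshold."""
    total = 0
    changed = 0
    for prev_row, cur_row in zip(prev_frame, cur_frame):
        for p, c in zip(prev_row, cur_row):
            total += 1
            if abs(c - p) > diff_threshold:
                changed += 1
    return changed / total if total else 0.0


def dynamic_learning_rate(prev_frame, cur_frame, base_rate=0.005, max_rate=0.5,
                          diff_threshold=25):
    """Hypothetical rule: interpolate linearly between a slow and a fast rate."""
    frac = changed_fraction(prev_frame, cur_frame, diff_threshold)
    return base_rate + (max_rate - base_rate) * frac


def update_background_mean(mean, pixel, rate):
    """Running-average update of one background Gaussian's mean."""
    return (1.0 - rate) * mean + rate * pixel


# Tiny 2x2 gray-level frames: a stable scene versus a sudden brightening.
stable_prev = [[100, 100], [100, 100]]
stable_cur = [[101, 99], [100, 102]]
sudden_cur = [[180, 185], [178, 182]]

stable_rate = dynamic_learning_rate(stable_prev, stable_cur)
sudden_rate = dynamic_learning_rate(stable_prev, sudden_cur)
new_mean = update_background_mean(100.0, 180.0, sudden_rate)
```

With a rule of this kind the background model barely moves while the scene is stable and catches up quickly when most pixels change at once, which is the behavior the abstract attributes to the adaptive learning rate.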
Funding: Supported by the National Natural Science Foundation of China (No. 60872105), the Program for Science & Technology Innovative Research Team of Qing Lan Project in Higher Educational Institutions of Jiangsu, and the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD).
Abstract: This paper presents an improved voice conversion system based on Gaussian Mixture Models (GMMs) that also changes the time-scale of speech. The Speech Transformation and Representation using Adaptive Interpolation of weiGHTed spectrum (STRAIGHT) model is adopted to extract the spectrum features, and the GMMs are trained to generate the conversion function. The spectrum features of the source speech are converted by the conversion function. The time-scale of speech is changed by extracting the converted features and adding them to the spectrum. The converted voice was evaluated by subjective and objective measurements. The results confirm that the transformed speech not only approximates the characteristics of the target speaker but is also more natural and more intelligible.
Funding: The work described in this paper was supported by a grant of the General Research Fund (GRF) from the Research Grant Council of Hong Kong SAR (Project No. CUHK418011E).
Abstract: Three Bayesian-related approaches, namely variational Bayesian (VB), minimum message length (MML), and Bayesian Ying-Yang (BYY) harmony learning, have been applied to automatically determine an appropriate number of components when learning a Gaussian mixture model (GMM). This paper provides a comparative investigation of these approaches with not only a Jeffreys prior but also a conjugate Dirichlet-Normal-Wishart (DNW) prior on the GMM. In addition to adopting the existing algorithms either directly or with some modifications, the algorithm for VB with the Jeffreys prior and the algorithm for BYY with the DNW prior are developed in this paper to fill the gap. The performance of automatic model selection is evaluated through extensive experiments, with several empirical findings: 1) With priors only on the mixing weights, each of the three approaches makes biased mistakes, while placing priors on all the parameters of the GMM reduces each approach's bias and improves its performance. 2) When the Jeffreys prior is replaced by the DNW prior, all three approaches improve. Moreover, the Jeffreys prior makes MML slightly better than VB, while the DNW prior makes VB better than MML. 3) When the hyperparameters of the DNW prior are further optimized by each approach's own learning principle, BYY improves while VB and MML deteriorate when there are too many free hyperparameters. In fact, VB and MML lack a good guide for optimizing the hyperparameters of the DNW prior. 4) BYY considerably outperforms both VB and MML for any type of prior, whether or not the hyperparameters are optimized. Unlike VB and MML, which rely on appropriate priors to perform model selection, BYY does not depend heavily on the type of prior. It has model selection ability even without priors, already performs very well with the Jeffreys prior, and improves incrementally when the Jeffreys prior is replaced by the DNW prior. Finally, all algorithms are applied to the Berkeley segmentation database of real-world images. Again, BYY considerably outperforms both VB and MML, especially in detecting objects of interest against a confusing background.
Funding: Project supported by an Inha University Research Grant; Project (10031764) supported by the Strategic Technology Development Program of the Ministry of Knowledge Economy, Korea.
Abstract: An improved speech absence probability estimation is proposed that uses environmental noise classification for speech enhancement. A relevant noise estimation approach, known as the speech presence uncertainty tracking method, requires the "a priori" probability of speech absence, which is derived from the microphone input signal and the noise signal based on the estimated value of the "a posteriori" signal-to-noise ratio (SNR). The conventional approach uses a fixed threshold and smoothing parameter for this estimate. To overcome this limitation, first, the optimal parameter values in terms of perceived speech quality are derived for a variety of noise types. Second, the estimated optimal values are assigned according to the noise type determined by a real-time noise classification algorithm based on the Gaussian mixture model (GMM). The proposed algorithm thus estimates the speech absence probability by applying the optimal parameters of each classified noise type. The performance of the proposed method was evaluated by objective tests, such as the perceptual evaluation of speech quality (PESQ) and a composite measure, and by a subjective test, namely mean opinion scores (MOS), under various noise environments. The proposed method shows better results than existing methods.
Funding: Sponsored by the National Security Major Basic Research Project of China (Grant No. 973-61334).
Abstract: In this paper, an evolutionary recursive Bayesian estimation algorithm is presented that incorporates the latest observation through a new proposal distribution. The posterior state density is represented by a Gaussian mixture model that is recovered from the weighted particle set of the measurement update step by means of a weighted expectation-maximization algorithm. This step replaces the resampling stage needed by most particle filters and relieves the effect of sample impoverishment. A nonlinear tracking problem shows that this new approach outperforms other related particle filters.
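Recovering a Gaussian mixture from a weighted particle set can be sketched with a weighted EM loop, in which each particle's importance weight scales its contribution to the sufficient statistics. This is a one-dimensional illustration under stated assumptions: the function name `weighted_em_gmm_1d`, the quantile initialisation, the variance floor, and the synthetic particles are not taken from the paper.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def weighted_em_gmm_1d(particles, weights, n_components=2, n_iter=50):
    """Fit a 1-D GMM to particles carrying importance weights (hypothetical helper)."""
    total_w = sum(weights)
    w = [wi / total_w for wi in weights]            # normalise the particle weights
    ordered = sorted(particles)
    means = [ordered[int((k + 0.5) * len(ordered) / n_components)]
             for k in range(n_components)]          # initialise at spread-out quantiles
    mean_all = sum(wi * x for wi, x in zip(w, particles))
    var_all = sum(wi * (x - mean_all) ** 2 for wi, x in zip(w, particles))
    variances = [var_all] * n_components
    mix = [1.0 / n_components] * n_components
    for _ in range(n_iter):
        nk = [0.0] * n_components
        sx = [0.0] * n_components
        sxx = [0.0] * n_components
        for x, wi in zip(particles, w):
            dens = [mix[k] * normal_pdf(x, means[k], variances[k])
                    for k in range(n_components)]
            total = sum(dens) or 1e-300
            for k in range(n_components):
                r = wi * dens[k] / total            # weight-scaled responsibility
                nk[k] += r
                sx[k] += r * x
                sxx[k] += r * x * x
        for k in range(n_components):
            safe = max(nk[k], 1e-12)
            mix[k] = nk[k]                          # responsibilities already sum to one
            means[k] = sx[k] / safe
            variances[k] = max(sxx[k] / safe - means[k] ** 2, 1e-6)
    return mix, means, variances


# Particles from a uniform proposal, weighted by a bimodal target density.
random.seed(1)
particles = [random.uniform(-5.0, 5.0) for _ in range(2000)]
weights = [0.5 * normal_pdf(x, -2.0, 0.25) + 0.5 * normal_pdf(x, 2.0, 0.25)
           for x in particles]
mix, means, variances = weighted_em_gmm_1d(particles, weights)
```

New particles can then be drawn from the fitted mixture rather than by resampling the old set, which is how a step of this kind could avoid duplicating a few high-weight particles.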
Abstract: This letter proposes an effective and robust speech feature extraction method based on statistical analysis of Pitch Frequency Distributions (PFD) for speaker identification. Compared with the conventional cepstrum, PFD is relatively insensitive to Additive White Gaussian Noise (AWGN), but it does not perform well for speaker identification, even in clean environments. To compensate for this shortcoming, PFD and the conventional cepstrum are combined to make the final decision, instead of taking only one kind of feature into account. Experimental results indicate that the hybrid approach gives a marked improvement for text-independent speaker identification in noisy environments corrupted by AWGN.
Funding: Supported by the Visvesvaraya Ph.D. Scheme for Electronics and IT students launched by the Ministry of Electronics and Information Technology (MeiTY), Government of India, under Grant No. PhD-MLA/4(95)/2015-2016.
Abstract: In this paper, we present a comparison of Khasi speech representations using four different spectral features, together with a novel extension towards the development of Khasi speech corpora. The four features are linear predictive coding (LPC), linear prediction cepstrum coefficient (LPCC), perceptual linear prediction (PLP), and Mel frequency cepstral coefficient (MFCC). Ten hours of speech data were used for training and three hours for testing. For each spectral feature, different hidden Markov model (HMM) based recognizers were built, with variations in the number of HMM states and in the Gaussian mixture models (GMMs). Performance was evaluated using the word error rate (WER). The experimental results show that MFCC provides a better representation of Khasi speech than the other three spectral features.
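The word error rate used as the evaluation metric above is conventionally the word-level edit distance (substitutions, deletions, and insertions) divided by the number of reference words. A minimal sketch of that standard definition follows; the function name `word_error_rate` and the English example sentences are illustrative and are not drawn from the Khasi corpus.

```python
def word_error_rate(reference, hypothesis):
    """WER = word-level Levenshtein distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")
    # prev[j] holds the edit distance between the first i-1 reference words
    # and the first j hypothesis words; the table is filled row by row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = cur[j - 1] + 1   # extra word in hypothesis
            cur.append(min(substitution, deletion, insertion))
        prev = cur
    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the") over six words.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

Because insertions count as errors, the WER can exceed 1.0 when the hypothesis is much longer than the reference.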
Funding: Supported by the National Natural Science Foundation of China (Grant No. 60702032), the Natural Science Foundation of Heilongjiang Province (No. F201021), and the Natural Scientific Research Innovation Foundation in Harbin Institute of Technology (No. HIT.NSRIF.2008.63).
Abstract: Rain and snow seriously degrade outdoor video quality. In this work, a primary-secondary background model for the removal of rain and snow is built. First, we analyze video noise and use a sliding-window sequence principal component analysis denoising algorithm to reduce white noise in the video. Next, we apply the Gaussian mixture model (GMM) to model the video and perform a primary segmentation of all foreground objects. After that, with reference to the result of the primary segmentation, we calculate the von Mises distribution of the velocity vectors and the ratio of the overlapped region, and extract the objects of interest. Finally, rain and snow streaks are inpainted using the background to improve the quality of the video data. Experiments show that the proposed method can effectively suppress noise and extract the targets of interest.
Abstract: This paper studies a high-speed text-independent Automatic Speaker Recognition (ASR) algorithm based on the Gaussian Mixture Model (GMM) on a multicore system. The high speed is achieved through parallel implementation of the feature extraction and aggregation methods during the training and testing procedures. Shared-memory parallel programming techniques using both the OpenMP and PThreads libraries are developed to accelerate the code and improve the performance of the ASR algorithm. The experimental results show a speed-up of around 3.2 on a personal laptop with an Intel i5-6300HQ (2.3 GHz, four cores without hyper-threading, and 8 GB of RAM). In addition, a speaker recognition accuracy of 100% is achieved.