Based on Gaussian mixture models (GMM), speed, flow and occupancy are used together in the cluster analysis of traffic flow data. Compared with other clustering and sorting techniques, the GMM, as a structural model, is suitable for various kinds of traffic flow parameters. Gap statistics and domain knowledge of traffic flow are used to determine a proper number of clusters. The expectation-maximization (EM) algorithm is used to estimate the parameters of the GMM. The clustered traffic flow patterns are then analyzed statistically and used to design maximum likelihood classifiers for grouping real-time traffic flow data when new observations become available. Clustering analysis and pattern recognition can also be used to cluster and classify dynamic traffic flow patterns for freeway on-ramp and off-ramp weaving sections, as well as for other facilities involving the concept of level of service, such as airports, parking lots, intersections, and interrupted-flow pedestrian facilities.
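As a rough illustration of the maximum likelihood classification step, the sketch below assigns a new (speed, flow, occupancy) observation to the cluster whose Gaussian density is largest. The cluster names, means, variances, units, and the diagonal-covariance simplification are all assumptions made for this example; they are not taken from the paper.

```python
import math

# Hypothetical fitted clusters (illustrative values only, not from the paper).
# Each cluster stores a per-feature mean and variance for
# (speed in km/h, flow in veh/h/lane, occupancy in %).
# A diagonal covariance matrix is assumed to keep the example short.
CLUSTERS = {
    "free_flow": {"mean": (100.0, 800.0, 8.0), "var": (64.0, 40000.0, 9.0)},
    "dense": {"mean": (70.0, 1800.0, 20.0), "var": (100.0, 90000.0, 16.0)},
    "congested": {"mean": (25.0, 1200.0, 45.0), "var": (81.0, 90000.0, 64.0)},
}


def log_density(x, mean, var):
    """Log of a diagonal-covariance Gaussian density evaluated at x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def classify(x, clusters=CLUSTERS):
    """Return the name of the cluster with the largest log-likelihood for x."""
    scores = {
        name: log_density(x, c["mean"], c["var"]) for name, c in clusters.items()
    }
    return max(scores, key=scores.get)
```

Calling `classify((98.0, 850.0, 9.0))` selects the hypothetical free-flow cluster. Adding the log of each mixing weight to the score would turn this into a maximum a posteriori rule instead.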
To make the quantitative results of nuclear magnetic resonance (NMR) transverse relaxation (T2) spectra reflect the type and pore structure of a reservoir more directly, an unsupervised clustering method based on the Gaussian mixture model (GMM) was developed to obtain quantitative pore structure information from NMR T2 spectra. First, principal component analysis was conducted on the T2 spectra to reduce the dimensionality of the data and the dependence among the original variables. Second, the dimension-reduced data were fitted using the GMM probability density function, and the model parameters and the optimal number of clusters were obtained from the expectation-maximization algorithm and the change in the Akaike information criterion. Finally, the T2 spectrum features and pore structure types of the different cluster groups were analyzed and compared with the T2 geometric mean and the T2 arithmetic mean. The effectiveness of the algorithm has been verified by numerical simulation and field NMR logging data. The research shows that the clustering results of the GMM method correlate well with the shape and distribution of the T2 spectrum, pore structure, and petroleum productivity, providing a new means for quantitative identification of pore structure, reservoir grading, and oil and gas productivity evaluation.
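The fitting and model-selection steps can be sketched in a few lines. The example below is a minimal sketch, not the authors' implementation: it assumes the data have already been reduced to one dimension (the PCA step is omitted), fits a GMM by EM for each candidate number of components, and compares the fits with AIC = 2p - 2 ln L, where p = 3k - 1 free parameters for a k-component 1-D mixture. The function names and the synthetic data are hypothetical.

```python
import math
import random


def normal_pdf(x, mu, var):
    """Density of a 1-D Gaussian with mean mu and variance var."""
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(data, k, n_iter=100, var_floor=1e-6):
    """Fit a k-component 1-D GMM by EM.

    Returns (weights, means, variances, log_likelihood).
    """
    n = len(data)
    srt = sorted(data)
    # Deterministic initialisation: means at evenly spaced quantiles.
    means = [srt[int((j + 0.5) * n / k)] for j in range(k)]
    mean_all = sum(data) / n
    var_all = sum((x - mean_all) ** 2 for x in data) / n
    variances = [max(var_all, var_floor)] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each point.
        resp = []
        for x in data:
            p = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            s = sum(p) or 1e-300
            resp.append([pj / s for pj in p])
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-300
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, var_floor)
    log_lik = sum(
        math.log(sum(w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)) or 1e-300)
        for x in data
    )
    return weights, means, variances, log_lik


def aic(log_lik, k):
    """AIC for a k-component 1-D GMM: k means, k variances, k - 1 free weights."""
    n_params = 3 * k - 1
    return 2 * n_params - 2 * log_lik


def select_k(data, k_max=3):
    """Return (k with the lowest AIC, dict of AIC values per k)."""
    scores = {k: aic(fit_gmm_1d(data, k)[3], k) for k in range(1, k_max + 1)}
    return min(scores, key=scores.get), scores


def make_demo_data(seed=0):
    """Synthetic two-cluster data, used only to exercise the sketch."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(150)] + [rng.gauss(10.0, 1.0) for _ in range(150)]
```

On the synthetic data from `make_demo_data()`, the two-component fit has a much lower AIC than the single-component fit. In practice the candidate with the lowest AIC, or the point where the AIC stops decreasing noticeably, would be taken as the number of clusters.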
A new two-step framework is proposed for image segmentation. In the first step, the gray-value distribution of the given image is reshaped to have larger inter-class variance and smaller intra-class variance. In the second step, discriminant-based or clustering-based methods are applied to the reshaped distribution. The work focuses on a typical clustering method, the Gaussian mixture model (GMM), and its variant to demonstrate the feasibility of the framework. Because the first step is independent of the second, it can be integrated into both pixel-based and histogram-based methods to improve their segmentation quality. Experiments on artificial and real images show that the framework achieves effective and robust segmentation results.
Wireless sensor network (WSN) positioning performs well indoors and has therefore received extensive attention in the positioning field. Non-line-of-sight (NLOS) propagation is a primary challenge in complex indoor environments. In this paper, a robust localization algorithm based on a Gaussian mixture model and polynomial fitting is proposed to address the NLOS error. First, fitted polynomials are used to predict the measured values. The residuals between the predicted and measured values are clustered by a Gaussian mixture model (GMM), and the LOS and NLOS probabilities are calculated from the cluster centers. The measured values are then filtered in turn by a Kalman filter (KF), a variable parameter unscented Kalman filter (VPUKF), and a variable parameter particle filter (VPPF). The distance value processed by KF and VPUKF and the distance value processed by KF, VPUKF, and VPPF are combined according to these probabilities. Finally, the maximum likelihood method is used to estimate the position coordinates. In simulations, the proposed algorithm achieves better positioning accuracy than the comparison algorithms considered in the paper and shows strong robustness in strong NLOS environments.
Cluster-based channel models are the mainstream for fifth-generation mobile communications, so the accuracy of the clustering algorithm is important. The traditional Gaussian mixture model (GMM) does not consider power information, which is important for channel multipath clustering. In this paper, a normalized power-weighted GMM (PGMM) is introduced to model the channel multipath components (MPCs). With MPC power as a weighting factor, the PGMM can fit the MPCs in accordance with cluster-based channel models. First, the expectation-maximization (EM) algorithm is employed to optimize the PGMM parameters. Then, to further increase the search ability of EM and to choose the optimal number of components without resorting to cross-validation, variational Bayesian (VB) inference is employed. Finally, 28 GHz indoor channel measurement data are used to demonstrate the effectiveness of the PGMM clustering algorithm.
Because the computational cost of the joint probabilistic data association (JPDA) algorithm explodes as the number of targets increases, a multi-target tracking algorithm based on Gaussian mixture model (GMM) clustering is proposed. The algorithm clusters the measurements, and the association matrix between measurements and tracks is constructed from the posterior probabilities. Compared with the traditional data association algorithm, this algorithm has better tracking performance and lower computational complexity. Simulation results demonstrate the effectiveness of the proposed algorithm.
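One way to picture an association matrix built from posterior probabilities is the sketch below. It treats each track's predicted position as the mean of one Gaussian component with a shared isotropic variance and computes, for every measurement, the posterior probability of each track. The function name, the shared variance, and the equal default weights are assumptions for illustration; the abstract does not specify the paper's exact construction.

```python
import math


def association_matrix(measurements, track_means, variance=1.0, weights=None):
    """Rows are measurements, columns are tracks.

    Entry [i][j] is the posterior probability that measurement i belongs
    to track j under an isotropic Gaussian mixture (hypothetical setup).
    """
    k = len(track_means)
    weights = weights or [1.0 / k] * k
    matrix = []
    for z in measurements:
        # Log of weight times Gaussian density; the normalising constant is
        # shared by all components and cancels in the posterior.
        logs = [
            math.log(w) - 0.5 * sum((zi - mi) ** 2 for zi, mi in zip(z, m)) / variance
            for w, m in zip(weights, track_means)
        ]
        top = max(logs)  # subtract the maximum for numerical stability
        exps = [math.exp(value - top) for value in logs]
        total = sum(exps)
        matrix.append([e / total for e in exps])
    return matrix
```

For example, with tracks predicted at (0, 0) and (5, 5), a measurement at (0.1, 0.0) is associated almost entirely with the first track, while a measurement at (2.5, 2.5) is split evenly between the two.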
To learn effectively from evolutionary experimental data points, an evolutionary Gaussian mixture model based on constraint consistency (EGMM) is proposed, and the corresponding parameter optimization method is presented. The Gaussian mixture model (GMM) is adopted to describe the data points, and the differences between the posterior probabilities of pairwise points under the current parameters are introduced to measure temporal smoothness. Parameter optimization of the EGMM can then be realized by evolutionary clustering. Compared with most existing data analysis methods based on evolutionary clustering, the EGMM clustering framework considers both the overall features and the individual differences of the data points. It decreases the algorithm's sensitivity to noise and increases the robustness of the estimated parameters. Experimental results show that the clustering sequence reflects the shift of the data distribution and that the proposed algorithm provides better clustering quality and temporal smoothness.
Voice conversion algorithms aim to provide a high level of similarity to the target voice with an acceptable level of quality. The main objective of this paper was to build a nonlinear relationship between the acoustic feature parameters of the source and target speakers using Non-Linear Canonical Correlation Analysis (NLCCA) based on a joint Gaussian mixture model. Speaker individuality transformation was achieved mainly by altering vocal tract characteristics represented by Line Spectral Frequencies (LSF). To obtain transformed speech that sounds more like the target voice, prosody modification was performed through residual prediction. Both objective and subjective evaluations were conducted. The experimental results demonstrated that the proposed algorithm was effective and outperformed the conventional conversion method based on Minimum Mean Square Error (MMSE) estimation.
A GMM (Gaussian Mixture Model) based adaptive image restoration method is proposed in this paper. The feature vectors of pixels are selected and extracted. Pixels are clustered into smooth, edge, or detail texture regions according to a variance-sum criterion function of the feature vectors. The parameters of the GMM are then calculated from the statistical information of these feature vectors. The GMM predicts the regularization parameter for each pixel adaptively. A Hopfield Neural Network (Hopfield-NN) is used to optimize the objective function of image restoration, and the network weight matrix is updated by the output of the GMM. Because the GMM is used, the regularization parameters share the properties of different kinds of regions. In addition, the regularization parameters differ from pixel to pixel. The GMM-based regularization method is consistent with the human visual system and has strong generalization capability. Experimental results show that the proposed algorithm yields better restored images than non-adaptive and some adaptive image restoration algorithms.
This paper presents a new online incremental training algorithm for the Gaussian mixture model (GMM). It performs expectation-maximization (EM) training incrementally to update the GMM parameters online, sample by sample, instead of waiting for a sufficiently large block of data before starting training as in the traditional EM procedure. The proposed method is extended from the split-and-merge EM procedure, so it is also inherently capable of escaping from local maxima and reducing the chances of singularities. In the application domain, the algorithm is optimized in the context of speech processing applications. Experiments on synthetic data show the advantage and efficiency of the new method, and the results in a speech processing task also confirm the improvement in system performance.
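To make the idea of sample-by-sample updating concrete, here is a minimal sketch of a recursive update for a 1-D GMM. It is a generic incremental scheme (responsibility-weighted running means and variances), not the split-and-merge-based algorithm of the paper; the class name, the pseudo-count initialisation, and the synthetic stream are assumptions made for illustration.

```python
import math
import random


class OnlineGMM1D:
    """1-D GMM updated one sample at a time (illustrative sketch only)."""

    def __init__(self, init_means, init_var=4.0, var_floor=1e-3):
        self.means = list(init_means)
        self.vars = [init_var] * len(self.means)
        # Each component starts with a pseudo-count of one observation.
        self.counts = [1.0] * len(self.means)
        self.var_floor = var_floor

    def weights(self):
        total = sum(self.counts)
        return [c / total for c in self.counts]

    def responsibilities(self, x):
        """Posterior probability of each component for sample x."""
        p = [
            w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2.0 * math.pi * v)
            for w, m, v in zip(self.weights(), self.means, self.vars)
        ]
        s = sum(p)
        if s == 0.0:
            # x is far from every component: fall back to the nearest mean.
            nearest = min(range(len(p)), key=lambda j: abs(x - self.means[j]))
            return [1.0 if j == nearest else 0.0 for j in range(len(p))]
        return [pj / s for pj in p]

    def update(self, x):
        """Fold one new sample into the running statistics."""
        for j, r in enumerate(self.responsibilities(x)):
            if r == 0.0:
                continue
            self.counts[j] += r
            step = r / self.counts[j]
            old_mean = self.means[j]
            self.means[j] = old_mean + step * (x - old_mean)
            # Weighted Welford-style variance update.
            self.vars[j] += step * ((x - old_mean) * (x - self.means[j]) - self.vars[j])
            self.vars[j] = max(self.vars[j], self.var_floor)


def demo_stream(n=400, seed=0):
    """Synthetic stream drawn from two well-separated Gaussians."""
    rng = random.Random(seed)
    return [
        rng.gauss(0.0, 1.0) if rng.random() < 0.5 else rng.gauss(10.0, 1.0)
        for _ in range(n)
    ]
```

Feeding `demo_stream()` into `OnlineGMM1D([2.0, 8.0])` one sample at a time moves the two means toward the true cluster centers without ever storing the data. Unlike a split-and-merge procedure, this simple scheme has no mechanism for escaping poor local solutions.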
Intrusion detection is the process of examining information about system activities or data to detect malicious behavior or unauthorized activity. Most intrusion detection systems (IDS) use the K-means clustering technique because of its linear complexity and fast computation. Nonetheless, its naive use of the mean data value as the cluster center is a major drawback. Two circular clusters with different radii can be centered at the same mean, and K-means cannot handle this case because the mean values of the clusters are nearly identical. K-means also fails when the clusters are not spherical. To overcome these issues, a new hybrid model that integrates expectation-maximization (EM) clustering using a Gaussian mixture model (GMM) with a naive Bayes classifier has been proposed. In this model, the GMM gives more flexibility than K-means in terms of cluster covariance. GMMs also use probability functions and soft clustering, so a single data point can belong to multiple clusters. In a GMM, the form of a cluster is defined by two parameters: the mean and the standard deviation. With these two parameters, a cluster can take any elliptical shape. EM-GMM is used to cluster data by activity into the corresponding categories.
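The case of two circular clusters that share a mean but differ in radius can be illustrated with a short calculation. The sketch below assumes two isotropic 2-D Gaussian components centered at the origin with standard deviations 1 and 5 and equal mixing weights (all values hypothetical) and computes the soft assignment of a point to each. A nearest-mean rule cannot separate the two clusters here, because both means coincide.

```python
import math


def isotropic_pdf(point, mean, sigma):
    """Density of a 2-D isotropic Gaussian with standard deviation sigma."""
    r2 = sum((p - m) ** 2 for p, m in zip(point, mean))
    return math.exp(-0.5 * r2 / sigma ** 2) / (2.0 * math.pi * sigma ** 2)


def soft_assign(point, components):
    """Return the posterior probability of each component for the point.

    components is a list of (weight, mean, sigma) tuples.
    """
    p = [w * isotropic_pdf(point, mean, sigma) for w, mean, sigma in components]
    total = sum(p)
    return [pj / total for pj in p]


# Two hypothetical clusters with the same center and different spread.
COMPONENTS = [
    (0.5, (0.0, 0.0), 1.0),  # tight cluster
    (0.5, (0.0, 0.0), 5.0),  # wide cluster
]
```

With these values, `soft_assign((0.5, 0.0), COMPONENTS)` gives most of the probability to the tight cluster, while `soft_assign((8.0, 0.0), COMPONENTS)` assigns the point almost entirely to the wide cluster, even though both clusters have the same mean.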
Automatic image annotation has been an active research topic in computer vision and pattern recognition for decades. A two-stage automatic image annotation method based on a Gaussian mixture model (GMM) and a random walk model (abbreviated as GMM-RW) is presented. First, a GMM fitted by the rival penalized expectation maximization (RPEM) algorithm is employed to estimate the posterior probability of each annotation keyword. Subsequently, a random walk process over the constructed label similarity graph is used to further mine the potential correlations among the candidate annotations and thereby refine the results, which plays a crucial role in semantic-based image retrieval. The contributions of this work are threefold. First, the GMM is exploited to capture the initial semantic annotations; in particular, the RPEM algorithm is used to train the model and can determine the number of GMM components automatically. Second, a label similarity graph is constructed from a weighted linear combination of label similarity and the visual similarity of the images associated with the corresponding labels, which helps avoid polysemy and synonymy during the image annotation process. Third, the random walk is run over the constructed label graph to further refine the candidate set of annotations generated by the GMM. Experiments on the standard Corel5k dataset demonstrate that GMM-RW is significantly more effective and efficient than several state-of-the-art methods in the task of automatic image annotation.
Outliers in the data seriously disturb the estimation of probability density parameters and therefore degrade clustering accuracy. To address this problem, a robust cluster analysis method based on the diversity self-paced t-mixture model is proposed. The model first adopts the t-distribution, whose tail is easily controllable, as the sub-model. On this basis, it uses the entropy-penalized expectation conditional maximization (ECM) algorithm as a pre-clustering step to estimate the initial parameters. The model then introduces the l2,1-norm as a self-paced regularization term and develops a new ECM optimization algorithm in order to select high-confidence samples from each component during training. Finally, experimental results on several real-world datasets in different noise environments show that the diversity self-paced t-mixture model outperforms state-of-the-art clustering methods. It provides significant guidance for the construction of robust mixture distribution models.
In this paper, an efficient model of palmprint identification is presented based on subspace density estimation using a Gaussian Mixture Model (GMM). Because only a few training samples are available for each person, intrapersonal palmprint deformations are used to train a global GMM instead of modeling a GMM for every class. To reduce the dimension of such variations while preserving the density function of the sample space, Principal Component Analysis (PCA) is used to find the principal differences and form the Intrapersonal Deformation Subspace (IDS). After the GMM is trained in the IDS using the Expectation Maximization (EM) algorithm, a maximum likelihood strategy is used to identify a person. Experimental results demonstrate the advantage of the method compared with the traditional PCA method and a single-Gaussian strategy.
This paper studies model-based methods in cluster analysis for classifying data elements into clusters, and then handles time series in view of this classification to choose the appropriate mixture model. The mixture-model cluster analysis technique under different covariance structures of the component densities is presented. This model is used to capture the compactness, orientation, shape, and volume of the component clusters in one expert system for handling Gaussian high-dimensional heterogeneous data sets, in order to achieve flexibility in currently practiced cluster analysis techniques. The Expectation-Maximization (EM) algorithm is used to estimate the parameters of the covariance matrix. Several criteria are used to judge the goodness of the models; these criteria are applied to the covariance matrices produced by the simulation. These models have not been tackled in previous studies. The results showed the superiority of the ICOMP PEU criterion over the other criteria, in addition to the success of the model based on Gaussian clusters in prediction using the covariance matrices considered in this study. The study also found that the optimal number of clusters can be determined by choosing the number of clusters corresponding to the lowest values of the different criteria used in the study.
Funding (traffic flow clustering): The US National Science Foundation (Nos. CMMI-0408390, CMMI-0644552); the American Chemical Society Petroleum Research Foundation (No. PRF-44468-G9); the Research Fellowship for International Young Scientists (No. 51050110143); the Fok Ying-Tong Education Foundation (No. 114024); the Natural Science Foundation of Jiangsu Province (No. BK2009015); the Postdoctoral Science Foundation of Jiangsu Province (No. 0901005C).
Funding (NMR T2 spectrum clustering): Supported by the National Natural Science Foundation of China (42174142); the National Science and Technology Major Project (2017ZX05039-002); the Operation Fund of China National Petroleum Corporation Logging Key Laboratory (2021DQ20210107-11); the Fundamental Research Funds for Central Universities (19CX02006A); the Major Science and Technology Project of China National Petroleum Corporation (ZD2019-183-006).
Funding (two-step image segmentation): Supported by the National Natural Science Foundation of China (60505004, 60773061).
Funding (WSN indoor localization): Supported by the National Natural Science Foundation of China under Grants No. 62273083 and No. 61973069; the Natural Science Foundation of Hebei Province under Grant No. F2020501012.
Funding (power-weighted GMM channel clustering): Supported by the National Science and Technology Major Program of the Ministry of Science and Technology (No. 2018ZX03001031); the Key Program of Beijing Municipal Natural Science Foundation (No. L172030); the Beijing Municipal Science & Technology Commission Project (No. Z171100005217001); the Key Project of State Key Lab of Networking and Switching Technology (NST20170205); the National Key Technology Research and Development Program of the Ministry of Science and Technology of China (No. 2012BAF14B01).
Funding (GMM-based multi-target tracking): The National Natural Science Foundation of China (61771367); the Science and Technology on Communication Networks Laboratory (HHS19641X003).
Funding (evolutionary GMM): Supported by the National Natural Science Foundation of China (61202137); the Open Project Foundation of Information Technology Research Base of Civil Aviation Administration of China (CAAC-ITRB-201302); the University Natural Science Basic Research Project of Jiangsu Province (13KJB520004); the Fundamental Research Funds for the Central Universities (NS2012134).
Funding (voice conversion): Supported by the National High Technology Research and Development Program of China (863 Program, No. 2006AA010102).
Funding: Supported by the National Basic Research Program of China (No. 2013CB329502), the National Natural Science Foundation of China (No. 61202212), the Special Research Project of the Educational Department of Shaanxi Province of China (No. 15JK1038), and the Key Research Project of Baoji University of Arts and Sciences (No. ZK16047).
Abstract: Automatic image annotation has been an active research topic in computer vision and pattern recognition for decades. A two-stage automatic image annotation method based on a Gaussian mixture model (GMM) and a random walk model (abbreviated as GMM-RW) is presented. First, a GMM fitted by the rival penalized expectation-maximization (RPEM) algorithm is employed to estimate the posterior probabilities of each annotation keyword. Subsequently, a random walk process over the constructed label similarity graph is used to further mine the potential correlations among the candidate annotations and so obtain refined results, which play a crucial role in semantic-based image retrieval. The contributions of this work are threefold. First, the GMM is used to capture the initial semantic annotations; in particular, the RPEM algorithm is used to train the model and can determine the number of GMM components automatically. Second, a label similarity graph is constructed from a weighted linear combination of label similarity and the visual similarity of the images associated with the corresponding labels, which effectively avoids the problems of polysemy and synonymy during image annotation. Third, the random walk is run over the constructed label graph to further refine the candidate set of annotations generated by the GMM. Experiments on the standard Corel5k dataset demonstrate that GMM-RW is significantly more effective and efficient than several state-of-the-art methods in the task of automatic image annotation.
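The abstract does not give the update rule of the random walk, so the following is only a hedged sketch of one common variant, a random walk with restart, applied to a toy label graph. The function name `refine_scores`, the restart probability, the labels, the similarity values, and the initial scores are all hypothetical.

```python
def refine_scores(similarity, initial, restart=0.3, iterations=50):
    """Refine label scores with a random walk with restart (illustrative variant)."""
    n = len(initial)
    # Row-normalise the similarity matrix into transition probabilities.
    transition = []
    for row in similarity:
        total = sum(row)
        transition.append([v / total for v in row] if total > 0 else [1.0 / n] * n)
    # Normalise the initial scores so they form a probability distribution.
    total0 = sum(initial)
    p0 = [v / total0 for v in initial]
    p = list(p0)
    for _ in range(iterations):
        # One walk step: probability mass flows along the graph edges.
        walked = [sum(p[i] * transition[i][j] for i in range(n)) for j in range(n)]
        # Restart: mix the walked distribution with the initial scores.
        p = [(1.0 - restart) * w + restart * b for w, b in zip(walked, p0)]
    return p


# Toy label graph: "sky" and "cloud" are strongly related, "car" is not.
labels = ["sky", "cloud", "car"]
similarity = [
    [0.0, 0.9, 0.1],
    [0.9, 0.0, 0.1],
    [0.1, 0.1, 0.0],
]
initial = [0.5, 0.1, 0.4]  # stand-in for first-stage posterior scores
refined = refine_scores(similarity, initial)
```

In this toy graph, "cloud" gains score from its strong link to the confidently predicted "sky", while the weakly connected "car" loses score, which is the kind of correlation-driven refinement the second stage is meant to provide.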
Funding: Supported by the 13th 5-Year National Science and Technology Supporting Project (2018YFC2000302).
Abstract: Outliers in the data seriously disturb the estimation of probability density parameters and therefore reduce clustering accuracy. To achieve robust cluster analysis, a method based on the diversity self-paced t-mixture model is proposed. The model first adopts the t-distribution, whose tail is easily controllable, as the submodel. On this basis, it uses the entropy-penalized expectation conditional maximization (ECM) algorithm as a pre-clustering step to estimate the initial parameters. The model then introduces the l2,1-norm as a self-paced regularization term and develops a new ECM optimization algorithm in order to select high-confidence samples from each component during training. Finally, experimental results on several real-world datasets in different noise environments show that the diversity self-paced t-mixture model outperforms state-of-the-art clustering methods. It provides significant guidance for the construction of robust mixture distribution models.
Abstract: In this paper, an efficient model of palmprint identification is presented based on subspace density estimation using a Gaussian Mixture Model (GMM). Because only a few training samples are available for each person, we use intrapersonal palmprint deformations to train a global GMM instead of modeling a GMM for every class. To reduce the dimension of such variations while preserving the density function of the sample space, Principal Component Analysis (PCA) is used to find the principal differences and form the Intrapersonal Deformation Subspace (IDS). After training the GMM with the Expectation Maximization (EM) algorithm in the IDS, a maximum likelihood strategy is carried out to identify a person. Experimental results demonstrate the advantage of our method compared with the traditional PCA method and a single-Gaussian strategy.
Abstract: This paper studies model-based methods in cluster analysis that classify data elements into clusters, and uses this classification to choose the appropriate mixed model for time series. The mixture-model cluster analysis technique under different covariance structures of the component densities is presented. To add flexibility to currently practiced cluster analysis techniques, this model captures the compactness, orientation, shape, and volume of the component clusters in one expert system for handling Gaussian high-dimensional heterogeneous data sets. The Expectation-Maximization (EM) algorithm is used to estimate the parameters of the covariance matrix. Several criteria are applied to the covariance matrices produced by the simulation to judge the goodness of the models. These models have not been tackled in previous studies. The results showed the superiority of the ICOMP PEU criterion over the other criteria. In addition, the model based on Gaussian clusters succeeded in prediction using the covariance matrices considered in this study. The study also found that the optimal number of clusters can be determined by choosing the number of clusters corresponding to the lower values for the different criteria used in the study<span><span><span>.