Funding: Cultivation Fund of the Key Scientific and Technical Innovation Project of Ministry of Education of China (No. 3104001014)
Abstract: A nonparametric Bayesian method is presented to classify M-ary phase shift keying (MPSK) signals. MPSK signals with unknown signal-to-noise ratios (SNRs) are modeled as a Gaussian mixture with unknown means and covariances in the constellation plane, and a clustering method is proposed to estimate their probability density. The method is based on nonparametric Bayesian inference: a Dirichlet process serves as the prior on the mixture coefficients, and a normal inverse Wishart (NIW) distribution serves as the prior on the unknown means and covariances. Given the received signals, the parameters are then updated by Markov chain Monte Carlo (MCMC) sampling, and the density estimate of the MPSK signals is obtained by iteration. Simulation results show that the correct recognition ratio for 2/4/8PSK exceeds 95% when SNR > 5 dB and 1600 symbols are used.
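The Gaussian-mixture view of MPSK described above can be illustrated by simulating noisy constellation points. The following is a minimal sketch, not the authors' implementation. It assumes unit-energy symbols, circular complex Gaussian noise, and SNR defined as symbol energy over noise power; the function name `mpsk_samples` and its parameters are hypothetical.

```python
import cmath
import math
import random


def mpsk_samples(m, snr_db, n, seed=0):
    """Draw n noisy MPSK symbols; returns (ideal points, noisy samples).

    Assumptions (not from the source): unit symbol energy, SNR = Es/N0,
    and independent Gaussian noise on the I and Q components.
    """
    rng = random.Random(seed)
    noise_power = 10 ** (-snr_db / 10)      # N0 for unit-energy symbols
    sigma = math.sqrt(noise_power / 2)      # per-dimension standard deviation
    # The M mixture means lie evenly spaced on the unit circle.
    points = [cmath.exp(2j * math.pi * k / m) for k in range(m)]
    samples = []
    for _ in range(n):
        symbol = rng.choice(points)         # equiprobable mixture component
        noise = complex(rng.gauss(0, sigma), rng.gauss(0, sigma))
        samples.append(symbol + noise)
    return points, samples
```

Each sample is a draw from one of M Gaussian components in the constellation plane. A classifier in the spirit of the abstract would cluster such samples without being told M or the SNR.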
Funding: Supported by the National Natural Science Foundation of China (No. 30900328, 61172179); the Fundamental Research Funds for the Central Universities (No. 201112-1051); the Natural Science Foundation of Fujian Province of China (No. 2012J05160)
Abstract: Although the Relevance Vector Machine (RVM) is one of the most popular algorithms in machine learning and computer vision, outliers in the training data make its estimates unreliable. This paper proposes a robust RVM model under a nonparametric Bayesian framework. The noise term in the RVM model is decomposed into two components, a Gaussian noise term and a spiky noise term. The observed data are therefore represented as the sum of a relevance vector component (the product of the kernel function matrix and the weight matrix), a spiky term, and a Gaussian noise term. A spike-slab sparse prior is imposed on the weight vector, which gives a more intuitive constraint on sparsity than the Student's t-distribution used in the traditional RVM. A spike-slab sparse prior is also placed on the spiky component so that outliers in the training data are recognized effectively. Several experiments demonstrate better performance than RVM regression.
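The decomposition described in this abstract can be written compactly. The symbols did not survive in this listing, so the notation below is an assumption chosen for illustration: t for the observed data, Φ for the kernel function matrix, w for the weights, s for the spiky term, and ε for the Gaussian noise term with an assumed variance σ².

```latex
\mathbf{t} = \boldsymbol{\Phi}\mathbf{w} + \mathbf{s} + \boldsymbol{\epsilon},
\qquad
\boldsymbol{\epsilon} \sim \mathcal{N}(\mathbf{0}, \sigma^{2}\mathbf{I})
```

Under a spike-slab prior, most entries of w and s are exactly zero. The nonzero entries of w select relevance vectors, and the nonzero entries of s flag outlying observations.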
Funding: The author acknowledges financial support from the National Natural Science Foundation of China (NSFC, No. 71773069).
Abstract: This study proposes a fully Bayesian nonparametric procedure to investigate the predictive power of exchange rates in relation to commodity prices for three commodity-exporting countries: Canada, Australia, and New Zealand. We propose a new time-dependent infinite mixture of normal linear regression models for the conditional distribution of the commodity price index. The mixing weights follow a set of time-varying probit stick-breaking priors. We find that exchange rates have a positive predictive effect in general, but accounting for time variation does not improve forecasting performance. By contrast, the intercept in the regression and the lagged dependent variable show signs of parameter change over time in most cases, which is important in forecasting both the mean and the density of commodity prices one period ahead. The results also suggest that the variance is a large source of the time variation in the conditional distribution of commodity prices.
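The probit stick-breaking construction mentioned above can be sketched as follows. This is an illustrative sketch rather than the paper's model: the latent scores `alphas` are hypothetical inputs, and in a time-varying version one such list would be supplied per time period.

```python
import math


def normal_cdf(x):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def probit_stick_breaking(alphas):
    """Turn latent real-valued scores into mixture weights.

    Weight k is Phi(alpha_k) times the stick length left over by the
    earlier components. Returns (weights, leftover stick length).
    """
    weights = []
    remaining = 1.0
    for alpha in alphas:
        fraction = normal_cdf(alpha)   # share of the remaining stick taken
        weights.append(remaining * fraction)
        remaining *= 1.0 - fraction
    return weights, remaining
```

For example, `probit_stick_breaking([0.0, 0.0, 0.0])` breaks off half of the remaining stick each time. The leftover mass is what a truncated approximation assigns to all later components.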
Abstract: Microarray gene expression data are analyzed by means of a Bayesian nonparametric model, with emphasis on prediction of future observables, yielding a method for selection of differentially expressed genes and the corresponding classifier.
Funding: Supported by the National Basic Research Program of China (973) (2012CB316402); the National Natural Science Foundation of China (Grant Nos. 61332005, 61725205); the Research Project of the North Minzu University (2019XYZJK02, 2019xYZJK05, 2017KJ24, 2017KJ25, 2019MS002); Ningxia first-class discipline and scientific research projects (electronic science and technology, NXYLXK2017A07); NingXia Provincial Key Discipline Project-Computer Application; the Provincial Natural Science Foundation of NingXia (NZ17111, 2020AAC03219).
Abstract: Overlapping community detection has become a very active research topic in recent decades, and a plethora of methods have been proposed. However, a common limitation of many existing approaches is that the number of communities K must be predefined manually. We propose a flexible nonparametric Bayesian generative model for count-value networks, which allows K to increase as more data are encountered rather than being fixed in advance. The Indian buffet process is used to model the community assignment matrix Z, and an uncollapsed Gibbs sampler is derived. However, because the community assignment matrix Z is a structured multi-variable parameter, how to summarize the posterior inference results and estimate the inference quality about Z remains a considerable challenge in the literature. In this paper, a graph classifier based on a graph convolutional neural network is used to help summarize the results and to estimate the inference quality about Z. We conduct extensive experiments on synthetic and real data, and find empirically that the traditional posterior summarization strategy is reliable.
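How an Indian buffet process lets the number of communities K grow with the data can be sketched by sampling a binary assignment matrix from the prior. This is a minimal sketch of the standard one-parameter IBP only, not the paper's full model or its Gibbs sampler. The names `sample_ibp` and `sample_poisson` and the concentration parameter `alpha` are hypothetical.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw a Poisson(lam) variate with Knuth's multiplication method."""
    limit = math.exp(-lam)
    count = 0
    product = 1.0
    while True:
        product *= rng.random()
        if product <= limit:
            return count
        count += 1


def sample_ibp(n_nodes, alpha, seed=0):
    """Sample a binary node-by-community matrix Z from an IBP prior."""
    rng = random.Random(seed)
    rows = []
    counts = []  # counts[k] = number of nodes already in community k
    for i in range(1, n_nodes + 1):
        # Join each existing community with probability counts[k] / i.
        row = [1 if rng.random() < c / i else 0 for c in counts]
        for k, z in enumerate(row):
            counts[k] += z
        # Open Poisson(alpha / i) brand-new communities for this node.
        new = sample_poisson(alpha / i, rng)
        row.extend([1] * new)
        counts.extend([1] * new)
        rows.append(row)
    total = len(counts)
    # Pad earlier rows with zeros so Z is rectangular.
    return [r + [0] * (total - len(r)) for r in rows]
```

Each row may contain several ones, which is what permits overlapping membership. The number of columns is not fixed in advance and tends to grow slowly as nodes are added.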
Funding: Project (No. 60773180) supported by the National Natural Science Foundation of China
Abstract: This paper deals with the statistical modeling of latent topic hierarchies in text corpora. The height of the topic tree is assumed fixed, while the number of topics at each level is unknown a priori and is inferred from the data. Taking a nonparametric Bayesian approach to this problem, we propose a new probabilistic generative model based on the nested hierarchical Dirichlet process (nHDP) and present a Markov chain Monte Carlo sampling algorithm for inferring the topic tree structure as well as the word distribution of each topic and the topic distribution of each document. Our theoretical analysis and experimental results show that this model produces a more compact hierarchical topic structure and captures more fine-grained topic relationships than the hierarchical latent Dirichlet allocation model.
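A fixed-height tree whose branching is not set in advance can be illustrated with the nested Chinese restaurant process, the tree prior used by the hierarchical latent Dirichlet allocation baseline named above. This sketch is not the nHDP model proposed in the paper. The function names and the concentration parameter `gamma` are hypothetical.

```python
import random


def crp_choice(counts, gamma, rng):
    """Pick an existing child in proportion to its count, or a new one
    (index len(counts)) in proportion to gamma."""
    threshold = rng.random() * (sum(counts) + gamma)
    cumulative = 0.0
    for index, count in enumerate(counts):
        cumulative += count
        if threshold < cumulative:
            return index
    return len(counts)


def sample_ncrp_paths(n_docs, depth, gamma, seed=0):
    """Assign each document a root-to-leaf path in a tree of fixed depth."""
    rng = random.Random(seed)
    children = {}  # node (tuple of child indices) -> visit counts of children
    paths = []
    for _ in range(n_docs):
        node = ()
        for _ in range(depth):
            counts = children.setdefault(node, [])
            index = crp_choice(counts, gamma, rng)
            if index == len(counts):
                counts.append(0)       # create a new child topic
            counts[index] += 1
            node = node + (index,)
        paths.append(node)
    return paths
```

The depth is fixed by the caller, while the number of distinct nodes at each level is determined by the sampled paths. That mirrors the setting in the abstract, where only the tree height is given.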
Funding: Supported by Gatsby Charitable Foundation and PASCAL2
Abstract: In the Bayesian mixture modeling framework, it is possible to infer the number of components needed to model the data, so it is unnecessary to restrict the number of components explicitly. Nonparametric mixture models sidestep the problem of finding the "correct" number of mixture components by assuming infinitely many components. In this paper, Dirichlet process mixture (DPM) models are cast as infinite mixture models, and inference using Markov chain Monte Carlo is described. The specification of the priors on the model parameters is often guided by mathematical and practical convenience. The primary goal of this paper is to compare conjugate and non-conjugate base distributions for a particular class of DPM models that is widely used in applications, the Dirichlet process Gaussian mixture model (DPGMM). We compare the computational efficiency and modeling performance of DPGMMs defined using a conjugate and a conditionally conjugate base distribution. We show that better density models can result from using a wider class of priors, with no or only a modest increase in computational effort.
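The idea of a mixture with infinitely many components can be made concrete with the stick-breaking representation of Dirichlet process weights. The sketch below truncates the infinite sequence for illustration. The truncation level, the function name `dp_stick_breaking`, and the concentration parameter `alpha` are assumptions, and the code does not reproduce the paper's MCMC inference.

```python
import random


def dp_stick_breaking(alpha, truncation, seed=0):
    """Draw the first `truncation` mixture weights of a Dirichlet process.

    Each break fraction is Beta(1, alpha); the weight is that fraction of
    the stick left by earlier components. Returns (weights, leftover mass).
    """
    rng = random.Random(seed)
    weights = []
    remaining = 1.0
    for _ in range(truncation):
        fraction = rng.betavariate(1.0, alpha)
        weights.append(remaining * fraction)
        remaining *= 1.0 - fraction
    return weights, remaining
```

With a small `alpha`, a few components carry almost all the mass. With a larger `alpha`, the mass is spread over many components, which is how the prior expresses uncertainty about the number of clusters without fixing it.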