Funding: Supported by the National Basic Research Program of China (973) (2012CB316402); the National Natural Science Foundation of China (Grant Nos. 61332005, 61725205); the Research Project of the North Minzu University (2019XYZJK02, 2019xYZJK05, 2017KJ24, 2017KJ25, 2019MS002); Ningxia first-class discipline and scientific research projects (electronic science and technology, NXYLXK2017A07); the NingXia Provincial Key Discipline Project-Computer Application; and the Provincial Natural Science Foundation of NingXia (NZ17111, 2020AAC03219).
Abstract: Overlapping community detection has become an active research topic in recent decades, and a plethora of methods have been proposed. However, a common challenge in many existing approaches is that the number of communities K must be specified manually in advance. We propose a flexible nonparametric Bayesian generative model for count-valued networks, which allows K to grow as more data are encountered rather than being fixed in advance. The Indian buffet process is used to model the community assignment matrix Z, and an uncollapsed Gibbs sampler is derived. However, because Z is a structured multivariate parameter, how to summarize the posterior inference results and estimate the inference quality for Z remains a considerable challenge in the literature. In this paper, a graph classifier based on a graph convolutional neural network is used to help summarize the results and estimate the inference quality for Z. We conduct extensive experiments on synthetic and real data and find that, empirically, the traditional posterior summarization strategy is reliable.
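As a rough illustration of why an Indian buffet process prior lets the number of communities K grow with the data, the standard "restaurant" construction of the IBP can be sketched as follows. This is a minimal sketch of the textbook prior only, not the paper's model or its Gibbs sampler; the function names `sample_poisson` and `sample_ibp` and the concentration parameter `alpha` are assumptions introduced here for illustration.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw a Poisson(rate) variate with Knuth's multiplication method."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_ibp(num_nodes, alpha, seed=0):
    """Sample a binary assignment matrix Z from the IBP prior (hypothetical helper).

    Row i is a node, column k is a community. Node i joins an existing
    community k with probability m_k / i, where m_k is the number of earlier
    nodes already in k, and then opens Poisson(alpha / i) new communities.
    """
    rng = random.Random(seed)
    rows = []
    column_counts = []  # m_k for every community created so far
    for i in range(1, num_nodes + 1):
        # Decide membership in each existing community.
        row = [1 if rng.random() < m / i else 0 for m in column_counts]
        # Open new communities; K is never fixed in advance.
        row.extend([1] * sample_poisson(alpha / i, rng))
        for k, z in enumerate(row):
            if k < len(column_counts):
                column_counts[k] += z
            else:
                column_counts.append(z)
        rows.append(row)
    num_communities = len(column_counts)
    # Pad earlier rows with zeros so Z is rectangular.
    return [r + [0] * (num_communities - len(r)) for r in rows]


if __name__ == "__main__":
    Z = sample_ibp(num_nodes=8, alpha=2.0, seed=1)
    for row in Z:
        print(row)
```

Because each node may belong to several columns at once, a matrix drawn this way naturally encodes overlapping memberships, and the expected number of columns grows slowly with the number of nodes.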
Abstract: The Indian buffet process (IBP) and the phylogenetic Indian buffet process (pIBP) can be used as prior models to infer latent features in a data set. The theoretical properties of these models are under-explored, however, especially in high-dimensional settings. In this paper, we show that, under a mild sparsity condition, the posterior distribution of the latent feature matrix generated via IBP or pIBP priors converges asymptotically to the true latent feature matrix. We derive the posterior convergence rate, referred to as the contraction rate. We show that the convergence results remain valid even when the dimensionality of the latent feature matrix increases with the sample size, which makes the posterior inference valid in high-dimensional settings. We demonstrate the theoretical results using computer simulation, in which the parallel-tempering Markov chain Monte Carlo method is applied to overcome computational hurdles. The practical utility of the derived properties is demonstrated by inferring the latent features in a reverse phase protein arrays (RPPA) dataset under the IBP prior model.
Funding: Supported partially by the Post Doctoral Natural Science Foundation of China (2013M532118, 2015T81082); the National Natural Science Foundation of China (61573364, 61273177, 61503066); the State Key Laboratory of Synthetical Automation for Process Industries; the National High Technology Research and Development Program of China (2015AA043802); and the Scientific Research Fund of Liaoning Provincial Education Department (L2013272).
Abstract: Strong mechanical vibration and acoustic signals of the grinding process contain useful information related to load parameters in ball mills. It is a challenge to extract latent features and construct a soft sensor model from the high-dimensional frequency spectra of these signals. This paper aims to develop a selective ensemble modeling approach based on nonlinear latent frequency spectral feature extraction for accurate measurement of the material to ball volume ratio. Latent features are first extracted from different vibration and acoustic spectral segments by kernel partial least squares. Bootstrap and least squares support vector machine algorithms are employed to produce candidate sub-models using these latent features as inputs. Ensemble sub-models are selected using a genetic algorithm optimization toolbox. Partial least squares regression is used to combine these sub-models to eliminate collinearity among their prediction outputs. Results indicate that the proposed modeling approach has better prediction performance than previous ones.