Funding: Supported by the National Natural Science Foundation of China under Grant No. 71971118 and by the Major Natural Science Projects of Universities in Jiangsu Province under Grant No. 20KJA520002.
Abstract: In this paper, to obtain a consistent estimator of the number of communities, the authors present a new sequential testing procedure based on the locally smoothed adjacency matrix and extreme value theory. Under the null hypothesis, the test statistic converges to the type I extreme value distribution; otherwise, it diverges quickly, and the divergence rate can even reach n in the strong signal case, where n is the size of the network, guaranteeing high detection power. The method is simple to use and serves as an alternative to the approach of Lei (2016), which uses random matrix theory. To detect changes in the community structure, the authors also propose a two-sample test for the stochastic block model with two observed adjacency matrices. Simulation studies support the theory. The authors apply the proposed method to the political blog data set and find reasonable group structures.
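As a rough illustration of the sequential testing idea described in this abstract, the following Python sketch tests H0: K = k for k = 1, 2, ... and stops at the first k that is not rejected, using a critical value from the standard type I extreme value (Gumbel) distribution. It is a minimal sketch under stated assumptions, not the authors' procedure: the abstract does not give the test statistic, so `statistic` is a hypothetical callable assumed to be already centered and scaled so that its null limit is standard Gumbel, and `sample_sbm`, `gumbel_critical_value`, `estimate_num_communities`, `alpha`, and `k_max` are names introduced here for illustration.

```python
import math
import random
from typing import Callable, List

Adjacency = List[List[int]]


def sample_sbm(labels: List[int], probs: List[List[float]],
               rng: random.Random) -> Adjacency:
    """Sample a symmetric adjacency matrix without self-loops from a
    stochastic block model with the given community labels and
    block-to-block link probabilities."""
    n = len(labels)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < probs[labels[i]][labels[j]]:
                adj[i][j] = adj[j][i] = 1
    return adj


def gumbel_critical_value(alpha: float) -> float:
    """Upper-alpha quantile of the standard Gumbel distribution,
    whose CDF is exp(-exp(-x))."""
    return -math.log(-math.log(1.0 - alpha))


def estimate_num_communities(adj: Adjacency,
                             statistic: Callable[[Adjacency, int], float],
                             alpha: float = 0.05,
                             k_max: int = 10) -> int:
    """Generic sequential loop: return the first k whose null hypothesis
    H0: K = k is not rejected. `statistic` is a hypothetical stand-in for
    a normalized test statistic (assumption). If every k up to k_max is
    rejected, k_max is returned as a cap."""
    threshold = gumbel_critical_value(alpha)
    for k in range(1, k_max + 1):
        if statistic(adj, k) <= threshold:
            return k
    return k_max
```

In use, one would pass an observed adjacency matrix (or one drawn with `sample_sbm` for simulation) together with an actual normalized statistic; the loop itself only encodes the stopping rule, so its behavior depends entirely on the statistic supplied.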
Funding: Project supported by the National Natural Science Foundation of China (No. 61731004).
Abstract: Online social networks have attracted great attention recently because they make it easy for people all over the world to build social connections. However, the observed structure of an online social network is always the aggregation of multiple social relationships. Thus, it is of great importance for real-world networks to reconstruct the full network structure from limited observations. The multiplex stochastic block model is introduced to describe multiple social ties, where different layers correspond to different attributes (e.g., age and gender of users in a social network). In this letter, we aim to improve the model precision using maximum likelihood estimation, where the precision is defined by the cross entropy of parameters between the data and the model. Within this framework, the layers and the partitions of nodes in a multiplex network are determined by natural node annotations, and the aggregate of the multiplex network is available. Because the original multiplex network has many degrees of freedom, we add an independent functional layer to cover it, and theoretically provide the optimal block number of the added layer. Empirical results verify the effectiveness of the proposed method using four measures: error of link probability, cross entropy, area under the receiver operating characteristic curve, and Bayes factor.
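Two quantities mentioned in this abstract, the aggregate of a multiplex network and a cross-entropy measure of fit, can be made concrete with a small Python sketch. This is one plausible reading, not the letter's definition: it assumes that layers are independent, that a node pair is linked in the aggregate when it is linked in at least one layer, and that the cross entropy is the usual Bernoulli cross entropy between observed links and model link probabilities. The function names are hypothetical.

```python
import math
from typing import List


def aggregate_link_probability(layer_probs: List[float]) -> float:
    """Probability that a node pair is linked in the aggregate network,
    assuming independent layers and OR-aggregation (both assumptions)."""
    p_absent = 1.0
    for p in layer_probs:
        p_absent *= (1.0 - p)
    return 1.0 - p_absent


def bernoulli_cross_entropy(observed: List[int], predicted: List[float],
                            eps: float = 1e-12) -> float:
    """Average Bernoulli cross entropy between observed links (0 or 1)
    and predicted link probabilities; lower values mean a better fit."""
    total = 0.0
    for a, p in zip(observed, predicted):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= a * math.log(p) + (1 - a) * math.log(1.0 - p)
    return total / len(observed)
```

For example, two layers that each link a pair with probability 0.5 give an aggregate link probability of 0.75 under these assumptions, and a model whose predicted probabilities are closer to the observed links yields a smaller cross entropy.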
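Abstract: Without any prior knowledge of the network, the PPSB-DC model (popularity and productivity stochastic block model and discriminative content model) models the network generation process using both the content and the links of the network, and can effectively discover generalized communities and the link patterns between communities. However, the parameter estimation algorithm of this probabilistic model is time-consuming and sensitive to the initial setting of the link-pattern parameters, which limits the model's application. The parameter estimation algorithm is improved, and an efficient generalized community discovery algorithm for content networks, EPPSBDC (efficient PPSB-DC), is designed. By adopting sampling and parallel techniques, the algorithm runs faster; by introducing a prior on link probabilities, it eliminates the sensitivity to initial parameters. Comparisons with similar algorithms on content networks verify the effectiveness of the EPPSBDC algorithm.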