Funding: Supported by the 211 Project of the Ministry of Education of China.
Abstract: Finding these communities is an important research problem. Current community discovery methods fall mainly into three categories: the HITS algorithm, the bipartite cores algorithm, and the maximum flow/minimum cut framework. In this paper, we propose a new method for extracting communities. It uses the Markov Cluster (MCL) algorithm, a fast and scalable unsupervised clustering algorithm. By running the mirror deletion procedure after graph clustering, we considerably reduce the comparison cost. After MCL and mirror deletion, a community member selection algorithm produces the sets of community candidates. Experimental results show that the new method works effectively.
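The MCL step mentioned in this abstract can be illustrated with a minimal pure-Python sketch. It shows only the textbook expansion/inflation loop, not the paper's pipeline: mirror deletion and community member selection are not covered, and the function names, the inflation value of 2.0, the added self-loops, and the toy graph are all assumptions made for illustration.

```python
def _normalise(m):
    """Scale every column of m so that it sums to 1 (column-stochastic)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] for j in range(n)] for i in range(n)]


def _matmul(a, b):
    """Plain O(n^3) product of two square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mcl(adj, inflation=2.0, max_iter=100, tol=1e-9):
    """Cluster an undirected graph given as a 0/1 adjacency matrix."""
    n = len(adj)
    # Self-loops are a common MCL convention; they keep every column non-zero.
    m = _normalise([[float(adj[i][j]) + (1.0 if i == j else 0.0)
                     for j in range(n)] for i in range(n)])
    for _ in range(max_iter):
        expanded = _matmul(m, m)                        # expansion: two-step flow
        inflated = _normalise([[x ** inflation for x in row]
                               for row in expanded])    # inflation: favour strong flow
        delta = max(abs(inflated[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = inflated
        if delta < tol:
            break
    # Each non-empty row (an attractor) lists the members of one cluster.
    clusters = {frozenset(j for j in range(n) if m[i][j] > 1e-6) for i in range(n)}
    clusters.discard(frozenset())
    return sorted(sorted(c) for c in clusters)


# Hypothetical toy graph: two triangles joined by the single edge 2-3.
graph = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
print(mcl(graph))
```

A higher inflation value generally yields finer clusters; a real implementation would use sparse matrices and pruning rather than dense lists.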
Abstract: Clustering a social network is the process of grouping social actors into clusters such that intra-cluster similarities are higher than inter-cluster similarities. Clustering approaches such as k-medoids or hierarchical clustering use a distance function to measure the dissimilarities among actors. These distance functions need to fulfill various properties, including the triangle inequality (TI). However, in some cases the triangle inequality is violated, which affects the quality of the resulting clusters. Through experiments, this paper explains how TI is violated when traditional clustering techniques (k-medoids, hierarchical, DENGRAPH, and spectral clustering) are applied to social networks, and how the violation affects the quality of the resulting clusters.
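To make the triangle inequality concrete, the following sketch checks a dissimilarity matrix for violations, that is, triples (i, j, k) with d(i, k) > d(i, j) + d(j, k). The function name and the three-actor matrix are hypothetical and are not taken from the paper's experiments.

```python
from itertools import permutations


def triangle_violations(dist, tol=1e-12):
    """Return all ordered triples (i, j, k) where the direct distance
    dist[i][k] exceeds the detour dist[i][j] + dist[j][k]."""
    n = len(dist)
    return [
        (i, j, k)
        for i, j, k in permutations(range(n), 3)
        if dist[i][k] > dist[i][j] + dist[j][k] + tol
    ]


# Hypothetical dissimilarities among three actors: 0 and 2 are each close
# to 1, yet far from each other, so the inequality fails.
d = [
    [0.0, 1.0, 5.0],
    [1.0, 0.0, 1.0],
    [5.0, 1.0, 0.0],
]
print(triangle_violations(d))  # [(0, 1, 2), (2, 1, 0)]
```

The check is cubic in the number of actors, so it suits small networks or sampled triples.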
Funding: Supported by the Fundamental Research Funds for the Central Universities (No. 2020JS005).
Abstract: Motif-based graph local clustering (MGLC) algorithms are generally designed with a two-phase framework, which computes the motif weight of each edge beforehand and then runs a local clustering algorithm on the weighted graph to output the result. Although correct, this framework has both practical and theoretical limitations and is less applicable in real interactive situations. This research develops a purely local and index-adaptive method, Index-adaptive Triangle-based Graph Local Clustering (TGLC+), to solve the MGLC problem w.r.t. the triangle motif. TGLC+ adaptively combines the approximate Monte Carlo method Triangle-based Random Walk (TRW) and the deterministic brute-force method Triangle-based Forward Push (TFP) to estimate the Personalized PageRank (PPR) vector without calculating the exact triangle-weighted transition probability, and then outputs the clustering result by conducting the standard sweep procedure. This paper establishes the efficiency of TGLC+ through theoretical analysis and demonstrates its effectiveness through extensive experiments. To our knowledge, TGLC+ is the first method to solve the MGLC problem without computing the motif weight beforehand, thus achieving better efficiency with comparable effectiveness. TGLC+ is suitable for large-scale and interactive graph analysis tasks, including visualization, system optimization, and decision-making.
Funding: Supported by the National Pre-research Project (513150601).
Abstract: Identifying composite crosscutting concerns (CCs) is a research task and challenge in aspect mining. In this paper, we propose a scatter-based graph clustering approach to identify composite CCs. Inspired by state-of-the-art link analysis techniques, we propose a two-state model to approximate how CCs tangle with core modules. From this model, we obtain scatter and centralization scores for each program element. In particular, the scatter scores are used to select CC seeds. Furthermore, to identify composite CCs, we adopt a novel similarity measure and develop an undirected graph clustering method to group these seeds. Finally, we compare our approach with previous work and illustrate its effectiveness in identifying composite CCs.
Funding: Supported by the Fundamental Research Funds for the Central Universities (No. 2020JS005).
Abstract: Motif-based graph local clustering (MGLC) is a popular method for graph mining tasks due to its various applications. However, the traditional two-phase approach of precomputing motif weights before performing local clustering loses locality and is impractical for large graphs. While some attempts have been made to address the efficiency bottleneck, there is still no applicable algorithm for large-scale graphs with billions of edges. In this paper, we propose a purely local and index-free method called Index-free Triangle-based Graph Local Clustering (TGLC*) to solve the MGLC problem w.r.t. a triangle. TGLC* directly estimates the Personalized PageRank (PPR) vector using random walks with the desired triangle-weighted distribution and produces the clustering result using a standard sweep procedure. We demonstrate TGLC*'s scalability through theoretical analysis and its practical benefits through a novel visualization layout. TGLC* is the first algorithm to solve the MGLC problem without precomputing the motif weight. Extensive experiments on seven real-world large-scale datasets show that TGLC* is applicable and scalable for large graphs.
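The two building blocks named in this abstract, random-walk estimation of the PPR vector and the standard sweep procedure, can be sketched as follows. This is not TGLC*: the walks below use ordinary uniform edge transitions instead of the triangle-weighted distribution, and the function names, the restart probability `alpha=0.15`, the number of walks, and the toy graph are assumptions. The graph is assumed to be simple and undirected, with no self-loops and no isolated nodes.

```python
import random


def monte_carlo_ppr(adj, seed, alpha=0.15, num_walks=2000, rng=None):
    """Estimate the PPR vector of `seed` from the endpoints of random walks
    that stop with probability alpha at every step."""
    rng = rng or random.Random(0)
    counts = {v: 0 for v in adj}
    for _ in range(num_walks):
        node = seed
        while rng.random() >= alpha:          # continue with probability 1 - alpha
            node = rng.choice(adj[node])      # uniform step (not triangle-weighted)
        counts[node] += 1
    return {v: c / num_walks for v, c in counts.items()}


def sweep_cut(adj, ppr):
    """Scan prefixes of nodes ordered by ppr/degree and return the prefix
    with the lowest conductance, together with that conductance."""
    total_volume = sum(len(nbrs) for nbrs in adj.values())
    order = sorted((v for v in adj if ppr[v] > 0),
                   key=lambda v: ppr[v] / len(adj[v]), reverse=True)
    best, best_phi = set(), float("inf")
    prefix, volume, cut = set(), 0, 0
    for v in order:
        prefix.add(v)
        volume += len(adj[v])
        inside = sum(1 for u in adj[v] if u in prefix)
        cut += len(adj[v]) - 2 * inside       # new cut edges minus edges now internal
        denom = min(volume, total_volume - volume)
        if denom == 0:                        # prefix covers the whole graph
            continue
        phi = cut / denom
        if phi < best_phi:
            best, best_phi = set(prefix), phi
    return best, best_phi


# Hypothetical toy graph: two triangles joined by the edge 2-3.
toy = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
estimate = monte_carlo_ppr(toy, seed=0)
cluster, phi = sweep_cut(toy, estimate)
print(sorted(cluster), phi)
```

Because the estimate depends only on walks started at the seed, the work is local to the seed's neighbourhood, which is the property the abstract emphasises.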
Funding: This paper was supported in part by the National Natural Science Foundation of China under Grant 62172235, in part by the Natural Science Foundation of Jiangsu Province of China under Grant BK20191381, and in part by the Primary Research & Development Plan of Jiangsu Province under Grant BE2019742.
Abstract: As a new mode and means of smart manufacturing, smart cloud manufacturing (SCM) faces great challenges in massive supply and demand, dynamic resource collaboration, and intelligent adaptation. To address these problems, this paper proposes an SCM-oriented dynamic supply-demand (SD) intelligent adaptation model for massive manufacturing services. In this model, a collaborative network model is established based on the properties of supply and demand and their relationships. In addition, an algorithm based on deep graph clustering (DGC) and aligned sampling (AS) is used to divide and conquer the large adaptation domain, which addresses the slow computation caused by the high complexity of spatiotemporal search in the collaborative network model. At the same time, an intelligent supply-demand adaptation method driven by quality of service (QoS) is established, in which adaptation experience is shared among adaptation subdomains through deep reinforcement learning (DRL) powered by a transfer mechanism, to improve the poor adaptation results caused by dynamic uncertainty. The results show that the proposed model and solution can perform collaborative and intelligent supply-demand adaptation for the massive and dynamic resources in SCM through autonomous learning, and can effectively perform global supply-demand matching and optimal resource allocation.
Funding: Supported by NSFC grants (Nos. 12001026, 12071019), the National Science Fund for Distinguished Young Scholars (No. 12025108), the Beijing Natural Science Foundation (No. Z180002), and NSFC grants (Nos. 12021001, 11688101).
Abstract: Graph sparsification approximates an arbitrary graph by a sparse graph and is useful in many applications, such as simplification of social networks, least squares problems, and the numerical solution of symmetric positive definite linear systems. In this paper, inspired by the well-known sparse signal recovery algorithm orthogonal matching pursuit (OMP), we introduce a deterministic, greedy edge selection algorithm, called the universal greedy approach (UGA), for the graph sparsification problem. For a general spectral sparsification problem, e.g., the positive subset selection problem from a set of $m$ vectors in $\mathbb{R}^n$, we propose a nonnegative UGA algorithm that needs $O(mn^2 + n^3/\epsilon^2)$ time to find a $\frac{1+\epsilon/\beta}{1-\epsilon/\beta}$-spectral sparsifier with positive coefficients and sparsity at most $\lceil n/\epsilon^2 \rceil$, where $\beta$ is the ratio between the smallest and largest lengths of the vectors. The convergence of the nonnegative UGA algorithm is established. For the graph sparsification problem, another UGA algorithm is proposed that outputs a $\frac{1+O(\epsilon)}{1-O(\epsilon)}$-spectral sparsifier with $\lceil n/\epsilon^2 \rceil$ edges in $O(m + n^2/\epsilon^2)$ time from a graph with $m$ edges and $n$ vertices under some mild assumptions. This is the algorithm, linear-time in the number of edges, that the graph sparsification community has been looking for. To the authors' knowledge, the best result in the literature is the existence of a deterministic algorithm that is almost linear, i.e., $O(m^{1+o(1)})$ for some $o(1) = O((\log\log m)^{2/3}/\log^{1/3} m)$. Finally, extensive experimental results, including applications to graph clustering and least squares regression, show the effectiveness of the proposed approaches.
Funding: This work was supported by the National Key Research and Development Program of China under Grant No. 2018AAA0102301 and the National Natural Science Foundation of China under Grant No. 61925203.
Abstract: Scene-based recommendation has proven its usefulness in E-commerce by recommending commodities based on a given scene. However, scenes are typically unknown in advance, which necessitates scene discovery for E-commerce. In this article, we study scene discovery for E-commerce systems. We first formalize a scene as a set of commodity categories that occur simultaneously and frequently in real-world situations, and model an E-commerce platform as a heterogeneous information network (HIN), whose nodes and links represent different types of objects and different types of relationships between objects, respectively. We then formulate the scene mining problem for E-commerce as an unsupervised learning problem that finds the overlapping clusters of commodity categories in the HIN. To solve the problem, we propose a non-negative matrix factorization based method, SMEC (Scene Mining for E-Commerce), and theoretically prove its convergence. Using six real-world E-commerce datasets, we finally conduct an extensive experimental study to evaluate SMEC against 13 other methods, and show that SMEC consistently outperforms its competitors with regard to various evaluation measures.
Abstract: Rapidly identifying protein complexes is important for elucidating the mechanisms of macromolecular interactions and for further investigating the overlapping clinical manifestations of diseases. To date, existing computational methods mainly focus on developing unsupervised graph clustering algorithms, sometimes in combination with prior biological insights, to detect protein complexes from protein-protein interaction (PPI) networks. However, the outputs of these methods are potentially structural or functional modules within PPI networks. These modules do not necessarily correspond to the actual protein complexes that are formed via spatiotemporal aggregation of subunits. In this study, we propose a computational framework that combines supervised learning and dense subgraph discovery to predict protein complexes. The proposed framework consists of two steps. The first step reconstructs genome-scale protein co-complex networks by training an L2-regularized logistic regression model on experimentally derived co-complexed protein pairs. The second step infers hierarchical and balanced clusters as complexes from the co-complex networks, using either the effective but computationally intensive k-clique graph clustering method or the efficient maximum modularity clustering (MMC) algorithm. Empirical studies with cross-validation and independent tests show that both steps achieve encouraging performance. The proposed framework is fundamentally novel and improves on existing methods in that the complexes inferred from protein co-complex networks are more biologically relevant than those inferred from PPI networks, providing a new avenue for identifying novel protein complexes.
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 61271346, 61172098, and 91335112, the Specialized Research Fund for the Doctoral Program of Higher Education of China under Grant No. 20112302110040, and the Fundamental Research Funds for the Central Universities of China under Grant No. HIT.KISTP.201418.
Abstract: Proteins usually bind together to form complexes, which play an important role in cellular activities. Many graph clustering methods have been proposed to identify protein complexes by finding dense regions in protein-protein interaction networks. We present a novel framework (CPL) that detects protein complexes by propagating labels through interactions in a network, in which labels denote complex identifiers. With proper propagation in CPL, proteins in the same complex are assigned the same labels. Unlike previous methods, CPL does not make any strong assumptions about the topological structure of the complexes. The CPL algorithm is tested on several publicly available yeast protein-protein interaction networks and compared with several state-of-the-art methods. The results suggest that CPL performs better than the existing methods. An analysis of functional homogeneity based on gene ontology shows that the complexes detected by CPL are highly biologically relevant.
Funding: Funded by the National Natural Science Foundation of China under Grant Nos. 60873245, 60933005, 60873243, 60903139, and 60803123, the National High Technology Research and Development 863 Program of China under Grant No. 2010AA012503, and the Beijing Natural Science Foundation under Grant No. 4122077.
Abstract: Graph clustering has been widely applied in exploring regularities emerging in relational data. Recently, the rapid development of network theory has linked graph clustering with the detection of community structure, a common and important topological characteristic of networks. Most existing methods investigate the community structure at a single topological scale. However, as shown by empirical studies, the community structure of real-world networks often exhibits multiple topological descriptions, corresponding to clustering at different resolutions. Furthermore, the detection of multiscale community structure is heavily affected by the heterogeneous distribution of node degree. It is very challenging to detect multiscale community structure in heterogeneous networks. In this paper, we propose a novel, unified framework for detecting community structure from the perspective of dimensionality reduction. Based on the framework, we first prove that the well-known Laplacian matrix for network partition and the widely used modularity matrix for community detection are two kinds of covariance matrices used in dimensionality reduction. We then propose a novel method to detect communities at multiple topological scales within our framework. We further show that existing algorithms fail to deal with heterogeneous node degrees. We develop a novel method to handle heterogeneity of networks by introducing a rescaling transformation into the covariance matrices in our framework. Extensive tests on real-world and artificial networks demonstrate that the proposed correlation matrices significantly outperform Laplacian and modularity matrices in terms of their ability to identify multiscale community structure in heterogeneous networks.
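For readers unfamiliar with the two matrices this abstract compares, the sketch below builds them from an adjacency matrix using their standard textbook definitions: the Laplacian L = D - A and the modularity matrix B = A - k kᵀ / (2m), where k is the degree vector and m the number of edges. It does not reproduce the paper's covariance interpretation or its rescaling transformation; the function name and the toy graph are assumptions.

```python
def laplacian_and_modularity(adj):
    """Return (L, B) for an undirected graph given as a 0/1 adjacency matrix."""
    n = len(adj)
    deg = [sum(row) for row in adj]           # degree vector k
    two_m = sum(deg)                          # 2m: every edge is counted twice
    lap = [[(deg[i] if i == j else 0) - adj[i][j] for j in range(n)]
           for i in range(n)]                 # L = D - A
    mod = [[adj[i][j] - deg[i] * deg[j] / two_m for j in range(n)]
           for i in range(n)]                 # B = A - k k^T / (2m)
    return lap, mod


# Hypothetical toy graph: a path 0-1-2 plus the edge 1-3.
a = [
    [0, 1, 0, 0],
    [1, 0, 1, 1],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
]
lap, mod = laplacian_and_modularity(a)
print(lap[1])  # [-1, 3, -1, -1]
```

Every row of both matrices sums to zero, which is one simple way to sanity-check an implementation.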
Funding: Supported by the National Natural Science Foundation of China (Nos. 61573299, 61174140, 61472127, and 61272395), the Social Science Foundation of Hunan Province (No. 16ZDA07), the China Postdoctoral Science Foundation (Nos. 2013M540628 and 2014T70767), the Natural Science Foundation of Hunan Province (Nos. 14JJ3107 and 2017JJ5064), and the Excellent Youth Scholars Project of Hunan Province (No. 15B087).
Abstract: The distance dynamics model is an excellent tool for uncovering the community structure of a complex network. However, one issue that must be addressed is its very long computation time on large-scale networks. To identify the community structure of a large-scale network with high speed and high quality, we propose a fast community detection algorithm, the F-Attractor, which is based on the distance dynamics model. The main contributions of the F-Attractor are as follows. First, we propose two prejudgment rules from two different perspectives: node and edge. Based on these two rules, we develop an internal edge prejudgment strategy for predicting the internal edges of the network. Internal edge prejudgment reduces the number of edges and neighbors that participate in the distance dynamics model. Second, we introduce a triangle distance to further speed up the interaction process in the distance dynamics model. This triangle distance uses two known distances to measure a third distance without any extra computation. We combine these techniques to improve the distance dynamics model and then describe the community detection process of the F-Attractor. The results of an extensive series of experiments demonstrate that the F-Attractor offers high-speed community detection and high partition quality.