Abstract: Communities can reveal the structure and dynamics of real-world networks, and they support applications such as recommendation systems. Social media platforms were initially developed for communication, but they are now widely used by businesses to extend their reach and generate profit. Many companies profit from the large volume of data generated on these platforms. In a large social network, people are grouped into communities based on shared properties. Community detection has attracted recent research interest because of the growing use of online social networks. Community structure is a significant property of a network, and a network may contain many communities whose members are similar to one another. Community detection techniques play a vital role in discovering similarities among nodes and keeping similar nodes strongly connected: similar nodes are grouped into a single community, and communities with many edges between them can be merged to avoid an excessive number of groups. Machine learning algorithms use community detection to identify groups with common properties, for example in recommendation systems and health care assistance systems. With this in mind, this paper presents an alternative community detection method, SimEdge-CD (Similarity and Edge Betweenness based Community Detection). SimEdge-CD has two stages. The first stage finds the similarity among nodes and groups similar nodes into one community. The second stage uses edge betweenness to identify the exact affiliations of boundary nodes and so create well-defined communities. Evaluation on synthetic and real datasets shows that the proposed method achieves a better accuracy-efficiency trade-off than existing methods. SimEdge-CD achieves the ideal value of 1, which is higher than that of existing techniques such as LPA, Attractor, Leiden, and Walktrap.
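The abstract above names edge betweenness as the signal used in the second stage but does not say how SimEdge-CD applies it. As a minimal sketch of the quantity itself only (not of the authors' method), the following computes edge betweenness on an unweighted, undirected graph with a Brandes-style breadth-first accumulation. The function name `edge_betweenness` and the example graph are illustrative assumptions.

```python
from collections import deque

def edge_betweenness(adj):
    """Number of shortest paths passing through each edge of an
    unweighted, undirected graph given as {node: [neighbours]}.
    Illustrative helper; not taken from the SimEdge-CD paper."""
    bet = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        dist = {s: 0}
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating dependencies.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                share = sigma[v] / sigma[w] * (1.0 + delta[w])
                bet[frozenset((v, w))] += share
                delta[v] += share
    # Every unordered node pair was counted from both of its end points.
    return {edge: value / 2.0 for edge, value in bet.items()}

# Hypothetical example: two triangles joined by the single edge 2-3.
graph = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
}
scores = edge_betweenness(graph)
bridge = max(scores, key=scores.get)
print(sorted(bridge), scores[bridge])
```

In this example the edge between the two triangles carries every shortest path that crosses from one triangle to the other, so it receives the highest score. High scores of this kind are what mark the boundary between densely connected groups.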
Funding: Supported by the National Natural Science Foundation of China (Grant No. 61370073) and the China Scholarship Council, China (Grant No. 201306070037).
Abstract: Community detection is a fundamental task in analysing the structural and functional properties of complex networks. The label propagation algorithm (LPA) is a near-linear-time algorithm that finds a good community structure. Despite various subsequent advances, one important issue with this algorithm has not yet been properly addressed: the random update order within the algorithm severely hampers the stability of the identified community structure. In this paper, we execute the basic label propagation algorithm on a network multiple times to obtain a set of consensus partitions. From these partitions we build a consensus weighted graph, in which the weight of an edge is the number of partitions that assign its two end nodes to the same cluster, divided by the total number of partitions. We then use this consensus weight to indicate the direction of label propagation. In each label update step, a node computes a mixing value that combines consensus weight and label frequency, and adopts the label with the maximum mixing value instead of the most frequent one. To extend the method to different networks, we introduce a proportion parameter that adjusts the relative contribution of consensus weight and label frequency to the mixing value. We call the resulting approach the label propagation algorithm with consensus weight (LPAcw). Experimental results show that LPAcw considerably enhances both the stability and the accuracy of community partitions.
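The consensus weight defined in the abstract above can be sketched directly from that definition. In the sketch below, the function name `consensus_weights` and the four hard-coded partitions are illustrative assumptions: the partitions stand in for the outcomes of repeated LPA runs, and the mixing value is not shown because the abstract does not give its formula.

```python
def consensus_weights(edges, partitions):
    """For each edge, the fraction of partitions that place both of its
    end nodes in the same cluster. Each partition maps node -> label."""
    total = len(partitions)
    weights = {}
    for u, v in edges:
        together = sum(1 for labels in partitions if labels[u] == labels[v])
        weights[(u, v)] = together / total
    return weights

# Hypothetical stand-ins for four independent LPA runs on a 4-node path.
edges = [("a", "b"), ("b", "c"), ("c", "d")]
partitions = [
    {"a": 0, "b": 0, "c": 1, "d": 1},
    {"a": 0, "b": 0, "c": 0, "d": 1},
    {"a": 5, "b": 5, "c": 7, "d": 7},
    {"a": 1, "b": 1, "c": 2, "d": 2},
]
weights = consensus_weights(edges, partitions)
print(weights)
```

Only agreement between the two end nodes within a run matters, so label values do not need to match across runs. An edge whose end nodes always end up together gets weight 1, while an edge that is usually cut gets a weight near 0.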
Funding: Supported by the Major Program of the National Natural Science Foundation of China (No. 61432006).
Abstract: Analytics on customers' load curves (LC) is the foundation of demand response, which requires a better understanding of customers' consumption patterns (CP). However, previous widely used LC clustering methods perform poorly in two respects: they produce a large number of clusters, and the variance within a cluster is large (a CP is extracted from each cluster). Both make it difficult to understand customers' electricity consumption patterns. To improve the performance of LC clustering, this paper proposes a clustering framework that incorporates community detection. The framework has three parts: network construction, community detection, and CP extraction. According to the cluster validity index (CVI), the integrated approach outperforms the previous state-of-the-art method with the same number of clusters, and it needs fewer clusters to achieve the same CVI performance.
Funding: Project supported by the National Natural Science Foundation of China (Grant Nos 70671079, 60674050, 60736022 and 60528007), the National 973 Program (Grant No 2002CB312200), the National 863 Program (Grant No 2006AA04Z258), and the 11-5 project (Grant No A2120061303).
Abstract: This paper studies the evolutionary prisoner's dilemma game on a highly clustered community network in which the clustering coefficient and the community size can be tuned. It finds that the clustering coefficient in such a degree-homogeneous network inhibits the emergence of cooperation over the entire range of the payoff parameter. Moreover, it finds that the community size can also have a marked influence on the evolution of cooperation: a larger community size leads not only to a lower cooperation level but also to a smaller threshold of the payoff parameter above which cooperators become extinct.
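The abstract above refers to a single payoff parameter without giving the payoff matrix. As a hedged sketch, the following assumes the commonly used weak prisoner's dilemma parameterization, in which mutual cooperation pays 1, a defector facing a cooperator earns the temptation `b`, and every other outcome pays 0. This parameterization, the function names, and the example network are assumptions and may differ from the paper's setup.

```python
def payoff(mine, other, b):
    """Payoff to a player using strategy `mine` against `other`.
    Assumed weak prisoner's dilemma: R = 1, T = b, S = P = 0."""
    if mine == "C":
        return 1.0 if other == "C" else 0.0
    return b if other == "C" else 0.0

def accumulated_payoffs(adj, strategy, b):
    """Each node plays one round with every neighbour and sums its payoffs."""
    return {
        node: sum(payoff(strategy[node], strategy[nb], b) for nb in neighbours)
        for node, neighbours in adj.items()
    }

# Hypothetical example: a triangle with two cooperators and one defector.
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
strategy = {0: "C", 1: "C", 2: "D"}
totals = accumulated_payoffs(triangle, strategy, b=1.5)
print(totals)
```

In evolutionary versions of the game, accumulated payoffs like these drive strategy updates, and raising `b` makes defection more attractive. That dependence is why results are reported as a function of the payoff parameter.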
Abstract: Recommender systems (RS) have become an important component of many e-commerce sites. In daily life, we rely on recommendations from other people: word of mouth, recommendation letters, and reviews of movies, books, and other items printed in newspapers. Recommender systems are software tools and techniques that support people by identifying interesting products and services in an online store, and they also provide recommendations to users who search for them. The most important open challenge in collaborative filtering recommender systems is the cold-start problem: when sufficient information is not available for a new item or a new user, the recommender system runs into the cold-start problem. Eliminating challenges such as the cold-start problem would increase the usefulness of collaborative recommender systems. Revealing community structure is crucial for understanding networks, and it has become more important with the increasing popularity of online social networks. Community detection is a key issue in social network analysis: nodes within a community are tightly connected to each other and loosely connected to nodes in other communities. Many algorithms, such as the Girvan-Newman algorithm, modularity maximization, leading eigenvector, and Walktrap, are used to detect communities in networks. To test whether a community division is meaningful, we use a quality function called modularity, which measures how far the number of links within communities exceeds the expected number of such links. In this paper, we propose a solution to the cold-start problem based on a community detection algorithm that extracts communities from a social network and identifies similar users in that network. The proposed work uses several intrinsic details as rules of thumb to improve the results. A simulation experiment was carried out on the cold-start problem.
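The abstract above describes modularity in words only. As a sketch, the following uses the standard Newman-Girvan form, Q = sum over communities c of (L_c / m - (d_c / 2m)^2), where m is the number of edges, L_c the number of edges inside community c, and d_c the total degree of its nodes. It is an assumption that the paper uses this exact form, and the function name and example graph are illustrative.

```python
def modularity(edges, community):
    """Newman-Girvan modularity of a partition of an undirected,
    unweighted graph. `community` maps node -> community label."""
    m = len(edges)
    degree = {}
    internal = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
        if community[u] == community[v]:
            label = community[u]
            internal[label] = internal.get(label, 0) + 1
    degree_sum = {}
    for node, k in degree.items():
        label = community[node]
        degree_sum[label] = degree_sum.get(label, 0) + k
    # Observed fraction of internal edges minus the expected fraction.
    return sum(
        internal.get(label, 0) / m - (total / (2 * m)) ** 2
        for label, total in degree_sum.items()
    )

# Hypothetical example: two triangles joined by the single edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
two_groups = {0: "x", 1: "x", 2: "x", 3: "y", 4: "y", 5: "y"}
one_group = {node: "x" for node in range(6)}
q_split = modularity(edges, two_groups)
q_single = modularity(edges, one_group)
print(q_split, q_single)
```

Splitting the graph into its two triangles gives a positive score (5/14), while putting every node in one community gives 0. This matches the reading that a meaningful division has more internal links than expected.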
Funding: Supported by the National Science and Technology Support Program of China (No. 2012BAH45B01), the National Natural Science Foundation of China (Nos. 61100189, 61370215, 61370211, 61402137), and the National "242" Project of China (No. 2016A104).
Abstract: Community detection can deviate in content when the accuracy of resource relevance is low. To address this, an algorithm based on the topology of sites and the similarity between their topics is proposed. By fully considering topic content factors, the algorithm can search for topically similar site clusters on the basis of inter-site topology. Experimental results show that the algorithm produces more accurate detection results on a real network.
Funding: Supported by the National Natural Science Foundation of China (Nos. 61272422, 61572260, 61373017, and 61572261).
Abstract: Random walks are a standard tool for modeling spreading processes in social and biological systems. For large-scale networks, however, the iterative calculation of the transition matrix needed to reach convergence in random walk methods consumes a lot of time. In this paper, we propose a three-stage hierarchical community detection algorithm based on Partial Matrix Approximation Convergence (PMAC) using random walks. First, the algorithm identifies the initial core nodes in a network by classical measurement. It then uses the error function of the partial transition matrix convergence of the core nodes to determine the number of random walk steps, so that the PMAC of the core nodes replaces the final convergence of all nodes in the whole matrix. Finally, based on the approximately converged transition matrix, we cluster the communities around the core nodes and use a closeness index to merge pairs of communities. By recursively repeating the process, a dendrogram of the communities is eventually constructed. We validated the performance of PMAC by comparing its results with those of two representative methods on three real-world networks of different scales.
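The cost described in the abstract above comes from repeatedly applying the transition matrix of the random walk. The sketch below shows that baseline iteration on a small graph. It does not implement PMAC, whose core-node selection and error function are not specified in the abstract, and the function names and example graph are illustrative assumptions.

```python
def transition_matrix(adj):
    """Row-stochastic matrix P with P[i][j] = 1 / degree(i) for each
    neighbour j of i, for a graph given as {node: [neighbours]}."""
    nodes = sorted(adj)
    index = {node: i for i, node in enumerate(nodes)}
    matrix = [[0.0] * len(nodes) for _ in nodes]
    for u in nodes:
        for v in adj[u]:
            matrix[index[u]][index[v]] = 1.0 / len(adj[u])
    return nodes, matrix

def walk(matrix, start, steps):
    """Probability distribution of a walker after `steps` steps,
    starting from the node with row index `start`."""
    n = len(matrix)
    dist = [0.0] * n
    dist[start] = 1.0
    for _ in range(steps):
        # One step: multiply the row vector by the transition matrix.
        dist = [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]
    return dist

# Hypothetical example: the path graph 0 - 1 - 2.
path = {0: [1], 1: [0, 2], 2: [1]}
nodes, P = transition_matrix(path)
after_one = walk(P, 0, 1)
after_two = walk(P, 0, 2)
print(after_one, after_two)
```

Each step costs on the order of n squared operations with a dense matrix, and many steps may be needed before the distributions stop changing. That cost is the motivation for tracking convergence on only part of the matrix.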
Abstract: Graph clustering, i.e., partitioning nodes or data points into non-overlapping clusters, can be beneficial in a large variety of computer vision and machine learning applications. However, mainstream graph clustering schemes, such as spectral clustering, cannot be applied to large networks because of their prohibitive computational complexity. While there exist methods applicable to large networks, these methods do not offer convincing comparisons against known ground truth. For the first time, this work evaluates clustering algorithm performance on large networks (consisting of one million nodes) with ground truth information. Ideas and concepts from game theory are applied to graph clustering to formulate a new algorithm, the Game Theoretical Approach for Clustering (GTAC). This theoretical framework is shown to be a generalization of both the Label Propagation and Louvain methods, offering an additional means of derivation and analysis. GTAC introduces a tuning parameter that allows algorithm performance to be varied according to application needs. Experiments show that the GTAC algorithms offer scalability and tunability for big data applications.
Funding: Supported by the National Natural Science Foundation of China (Nos. 61573299, 61174140, 61472127, and 61272395), the Social Science Foundation of Hunan Province (No. 16ZDA07), the China Postdoctoral Science Foundation (Nos. 2013M540628 and 2014T70767), the Natural Science Foundation of Hunan Province (Nos. 14JJ3107 and 2017JJ5064), and the Excellent Youth Scholars Project of Hunan Province (No. 15B087).
Abstract: The distance dynamics model is an excellent tool for uncovering the community structure of a complex network. However, one issue that must be addressed is its very long computation time on large-scale networks. To identify the community structure of a large-scale network with high speed and high quality, we propose a fast community detection algorithm, the F-Attractor, which is based on the distance dynamics model. The main contributions of the F-Attractor are as follows. First, we propose two prejudgment rules from two different perspectives: node and edge. Based on these two rules, we develop a strategy of internal edge prejudgment for predicting the internal edges of the network. Internal edge prejudgment reduces the number of edges, and of their neighbors, that participate in the distance dynamics model. Second, we introduce a triangle distance to further speed up the interaction process in the distance dynamics model. This triangle distance uses two known distances to measure a third distance without any extra computation. We combine these techniques to improve the distance dynamics model and then describe the community detection process of the F-Attractor. The results of an extensive series of experiments demonstrate that the F-Attractor offers high-speed community detection and high partition quality.
Funding: Supported in part by JSPS Grant-in-Aid under Grant No. 22300049 and an IBM Ph.D. Fellowship.
Abstract: In social tagging systems such as Delicious and Flickr, users collaboratively manage tags to annotate resources. Naturally, a social tagging system can be modeled as a (user, tag, resource) hypernetwork with three different types of nodes, namely users, resources, and tags, in which each hyperedge has three end nodes: a user, a resource, and a tag that the user employs to annotate the resource. How, then, can we automatically cluster related users, resources, and tags, respectively? This is a problem of community detection in a 3-partite, 3-uniform hypernetwork. More generally, given a K-partite K-uniform (hyper)network, where each (hyper)edge is a K-tuple composed of nodes of K different types, how can we automatically detect communities for nodes of different types? In this paper, by turning this problem into one of finding an efficient compression of the (hyper)network's structure, we propose a quality function for measuring the goodness of partitions of a K-partite K-uniform (hyper)network into communities, and we develop a fast community detection method based on optimization. Our method overcomes the limitations of state-of-the-art techniques and has several desirable properties: it is comprehensive, parameter-free, and scalable. We compare our method with existing methods on both synthetic and real-world datasets.