Funding: This work was supported in part by the National Natural Science Foundation of China (61471101) and the National Natural Science Foundation of China (U1736205).
Abstract: Quickly identifying the amount of information that can be gained by measuring a network via shortest paths is one of the fundamental problems in network exploration and monitoring. However, the existing methods are time-consuming even for moderate-scale networks. In this paper, we present a method for fast shortest-path cover identification in both exact and approximate scenarios, based on the relationship between the identification and shortest-distance queries. The effectiveness of the proposed method is validated on synthetic and real-world networks. The experimental results show that our method is 105 times faster than the existing methods and can solve shortest-path cover identification in a few seconds for large-scale networks with millions of nodes and edges.
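One way to read the link between cover identification and distance queries: in an unweighted, undirected graph, an edge (u, v) lies on some shortest path from s to t exactly when d(s, u) + 1 + d(v, t) = d(s, t) for one orientation of the edge. The Python sketch below applies that test with plain BFS. The function names and the toy graph are illustrative assumptions, and this is a brute-force illustration of the idea, not the method proposed in the paper.

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist

def covered_edges(adj, pairs):
    """Edges lying on at least one shortest path between any (s, t) in pairs."""
    dist_cache = {}

    def dist_from(node):
        # One BFS per endpoint, reused across pairs.
        if node not in dist_cache:
            dist_cache[node] = bfs_distances(adj, node)
        return dist_cache[node]

    covered = set()
    for s, t in pairs:
        ds, dt = dist_from(s), dist_from(t)
        if t not in ds:
            continue  # s and t are disconnected
        for u in adj:
            for v in adj[u]:
                # Distance test for the oriented edge u -> v.
                if u in ds and v in dt and ds[u] + 1 + dt[v] == ds[t]:
                    covered.add(frozenset((u, v)))
    return covered

# Hypothetical toy graph: a 4-cycle a-b-c-d-a with a pendant node e on c.
adj = {
    "a": ["b", "d"],
    "b": ["a", "c"],
    "c": ["b", "d", "e"],
    "d": ["a", "c"],
    "e": ["c"],
}
cover = covered_edges(adj, [("a", "e")])
```

Measuring only the pair (a, e) already covers every edge of the toy graph, because both sides of the cycle are shortest routes; measuring (b, d) instead leaves the pendant edge c-e unobserved.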
Funding: Funded by the Enterprise Ireland Innovation Partnership Programme with Ericsson under grant agreement IP/2011/0135 [6]; supported by the National Natural Science Foundation of China (Nos. 61373131, 61303039, 61232016, 61501247) and the PAPD and CICAEET funds.
Abstract: The rapid development of network technology and its evolution toward heterogeneous networks have increased the demand for automatic monitoring and management of heterogeneous wireless communication networks. This paper presents a multilevel pattern mining architecture that supports automatic network management by discovering interesting patterns in telecom network monitoring data. The architecture combines existing techniques for frequent itemset discovery over data streams, association rule deduction, frequent sequential pattern mining, and frequent temporal pattern mining, and uses distributed processing platforms to achieve high-volume throughput.
Funding: Supported by the National High-Tech Development (863) Program of China (No. 2015AA015407) and the National Natural Science Foundation of China (Nos. 61632011 and 61370164).
Abstract: Truth discovery aims to resolve conflicts among multiple sources and find the truth. Conventional methods for truth discovery mainly investigate the mutual effect between the reliability of sources and the credibility of statements. These methods represent reliability with real numbers, which have a lower representation capability than vectors. In addition, neural networks have not been used for truth discovery. In this work, we propose memory-network-based models to address truth discovery. Our proposed models use feedforward and feedback memory networks to learn the representation of the credibility of statements. Specifically, our models adopt a memory mechanism to learn the reliability of sources for truth prediction. The proposed models use categorical and continuous data during model learning by automatically assigning different weights to the loss function on the basis of their own effects. Experimental results show that our proposed models outperform state-of-the-art methods for truth discovery.
Funding: Supported by the National Natural Science Foundation of China (61433003, 61273150) and the Beijing Higher Education Young Elite Teacher Project (YETP1192).
Abstract: Traffic congestion occurs frequently in urban areas, while most existing solutions only take effect after congestion has formed. In this paper, a congestion warning method is proposed based on the Internet of Vehicles (IOV) and community discovery in complex networks. The communities in a complex network model of traffic flow reflect the local aggregation of vehicles in the traffic system and are used to predict upcoming congestion. Real-time information on the vehicles on the roads, including their locations, speeds, and orientations, is obtained from the IOV. The vehicles are then mapped to nodes of a network, and the links between nodes are determined by the correlations between vehicles in terms of location and speed. The complex network model of traffic flow is thereby established. The communities in this network are discovered by the fast Newman (FN) algorithm, and congestion warnings are generated from the communities selected by scale and density. This method can detect the tendency of traffic to aggregate and provide warnings before congestion occurs. Simulations show that the proposed method is effective and practicable, and makes it possible to take action before traffic congestion occurs.
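The pipeline described above (vehicles to nodes, links from location and speed similarity, groups filtered by scale and density) can be sketched in Python as follows. This is a minimal illustration under loud assumptions: the thresholds, field layout, and function names are hypothetical, and connected components are used as a crude stand-in for the FN community algorithm, which the sketch does not implement.

```python
from itertools import combinations
from math import hypot

def build_vehicle_graph(vehicles, max_gap_m, max_speed_diff):
    """Link two vehicles when they are close in both position and speed."""
    adj = {vid: set() for vid in vehicles}
    for a, b in combinations(vehicles, 2):
        xa, ya, va = vehicles[a]
        xb, yb, vb = vehicles[b]
        if hypot(xa - xb, ya - yb) <= max_gap_m and abs(va - vb) <= max_speed_diff:
            adj[a].add(b)
            adj[b].add(a)
    return adj

def connected_groups(adj):
    """Connected components (stand-in for FN communities, not FN itself)."""
    seen, groups = set(), []
    for start in adj:
        if start in seen:
            continue
        stack, group = [start], set()
        seen.add(start)
        while stack:
            node = stack.pop()
            group.add(node)
            for nbr in adj[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    stack.append(nbr)
        groups.append(group)
    return groups

def warning_groups(adj, min_size, min_density):
    """Keep groups that are both large (scale) and densely linked (density)."""
    flagged = []
    for group in connected_groups(adj):
        n = len(group)
        if n < max(min_size, 2):
            continue  # density is undefined for a single vehicle
        edges = sum(len(adj[v] & group) for v in group) // 2
        density = edges / (n * (n - 1) / 2)
        if density >= min_density:
            flagged.append(group)
    return flagged

# Made-up snapshot: vehicle id -> (x in metres, y in metres, speed in m/s).
vehicles = {
    "v1": (0.0, 0.0, 5.0),
    "v2": (8.0, 0.0, 6.0),
    "v3": (15.0, 0.0, 4.0),
    "v4": (500.0, 0.0, 20.0),
    "v5": (900.0, 0.0, 22.0),
}
adj = build_vehicle_graph(vehicles, max_gap_m=20.0, max_speed_diff=3.0)
flagged = warning_groups(adj, min_size=3, min_density=0.5)
```

In this snapshot the three slow, tightly spaced vehicles form one fully linked group and are flagged, while the two fast, widely separated vehicles are not.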
Abstract: IP geolocation determines the geographical location of Internet hosts from their IP addresses. It is widely used in targeted advertising, online fraud detection, cyber-attack attribution, and so on, and has gained much more attention in recent years as more and more physical devices are connected to cyberspace. Most geolocation methods cannot achieve good accuracy for devices with few landmarks nearby. In this paper, we propose a novel geolocation approach that uses common routers as secondary landmarks (Common Routers-based Geolocation, CRG). We find a large number of common routers by topology discovery among web server landmarks. We use statistical learning to study the localized (delay, hop)-distance correlation and locate these common routers. We then convert the accurately located common routers into secondary landmarks to improve the feasibility of our geolocation system in areas where landmarks are sparsely distributed. We improve the geolocation accuracy and decrease the maximum geolocation error compared to one of the state-of-the-art geolocation methods. At the end of this paper, we discuss the reasons for the efficiency of our method and our future research.
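As a rough illustration of learning a delay-distance correlation from landmarks, the Python sketch below fits an ordinary least-squares line from round-trip delay to distance and uses it to estimate the distance to a new target. It is a deliberately simplified assumption: the abstract mentions (delay, hop) features and statistical learning without specifying the model, so the single-feature line, the function names, and the calibration numbers here are all hypothetical.

```python
def fit_delay_to_distance(delays_ms, distances_km):
    """Least-squares fit of distance = slope * delay + intercept."""
    n = len(delays_ms)
    mean_d = sum(delays_ms) / n
    mean_km = sum(distances_km) / n
    sxx = sum((d - mean_d) ** 2 for d in delays_ms)
    sxy = sum((d - mean_d) * (km - mean_km)
              for d, km in zip(delays_ms, distances_km))
    slope = sxy / sxx
    intercept = mean_km - slope * mean_d
    return slope, intercept

def estimate_distance(delay_ms, slope, intercept):
    """Predicted distance in km for a measured delay."""
    return slope * delay_ms + intercept

# Made-up calibration measurements from landmarks with known positions.
delays = [2.0, 4.0, 6.0, 8.0]            # milliseconds
distances = [110.0, 190.0, 310.0, 390.0]  # kilometres

slope, intercept = fit_delay_to_distance(delays, distances)
estimate = estimate_distance(5.0, slope, intercept)
```

Fitting the line separately per region is one plausible reading of "localized" correlation, since the delay-to-distance ratio can differ between networks.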
Funding: This work was supported by the National Natural Science Foundation of China (No. 61572180) and the Scientific and Technological Research Project of the Education Department of Jiangxi Province (No. GJJ170383).
Abstract: Background: The frequency of small subtrees in biological, social, and other types of networks could shed light on the structure, function, and evolution of such networks. However, counting all possible subtrees of a prescribed size can be computationally expensive because of their potentially large number, even in small, sparse networks. Moreover, most existing algorithms for subtree counting are subtree-centric: they search for a single subtree type at a time, potentially taking more time by searching the same network repeatedly. Methods: In this paper, we propose a network-centric algorithm (MTMO) to efficiently count k-size subtrees. Our algorithm is based on the enumeration of all connected sets of k-1 edges, incorporates a labeled rooted tree data structure in the enumeration process to reduce the number of isomorphism tests required, and uses an array-based indexing scheme to simplify the subtree counting method. Results: Experiments on three representative undirected complex networks show that our algorithm is roughly an order of magnitude faster than existing subtree-centric approaches and than a base network-centric algorithm that does not use rooted trees, allowing larger subtrees to be counted in larger networks than previously possible. We also show major differences between unicellular and multicellular organisms. In addition, our algorithm is applied to find network motifs using a pattern growth approach. Conclusions: A network-centric algorithm that allows faster counting of non-induced subtrees is proposed. This enables us to count larger motifs in larger networks than previously possible.
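To make the counting target concrete: a non-induced subtree on k nodes is a set of k-1 edges that is connected and acyclic. The brute-force Python sketch below checks every set of k-1 edges with a union-find structure. It assumes "k-size" means k nodes, returns only the total (it does not group subtrees by isomorphism class), and is far slower than an enumeration that grows connected edge sets; it is not MTMO, and the names are illustrative.

```python
from itertools import combinations

def is_tree(edge_set):
    """True when the edges form a single connected, acyclic piece."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edge_set:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False  # this edge would close a cycle
        parent[ru] = rv
    roots = {find(x) for x in list(parent)}
    return len(roots) == 1  # one component means the forest is a tree

def count_subtrees(edges, k):
    """Count non-induced subtrees with k nodes (k >= 2) by brute force."""
    return sum(1 for combo in combinations(edges, k - 1) if is_tree(combo))

# Hypothetical toy network: a triangle 1-2-3 with a pendant node 4 on node 3.
edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
three_node = count_subtrees(edges, 3)
four_node = count_subtrees(edges, 4)
```

On the toy network, five of the six edge pairs share a node and form 3-node paths, and three of the four edge triples are trees (the fourth is the triangle itself).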