Visual fire detection technologies can detect fire and raise alarms earlier than conventional fire detectors. This study proposes an effective visual fire detection method that combines a statistical fire color model with sequential pattern mining to detect fire in an image. The proposed method also supports real-time fire detection by integrating adaptive background subtraction. Experimental results show that the proposed method can effectively detect fire in test images and videos, and that the detection accuracy of the hybrid method is better than that of Celik's method.
To reduce the computational and space cost of rerunning a mining algorithm for every sequential pattern query, this paper proposes FISP, a fast interactive sequential pattern mining algorithm based on previously mined sequential patterns and projection databases. Because the mining builds on the previously mined sequences, the number of frequent items in the projection databases it constructs is reduced. Furthermore, using a global threshold greatly reduces the number of iterations the algorithm runs. Experimental results show that FISP outperforms PrefixSpan in interactive mining.
Sequential pattern mining is an important data mining problem with broad applications. However, it is also a challenging problem since the mining may have to generate or examine a combinatorially explosive number of intermediate subsequences. Recent studies have developed two major classes of sequential pattern mining methods: (1) a candidate generation-and-test approach, represented by (i) GSP, a horizontal format-based sequential pattern mining method, and (ii) SPADE, a vertical format-based method; and (2) a pattern-growth method, represented by PrefixSpan and its further extensions, such as gSpan for mining structured patterns. In this study, we give a systematic introduction to the pattern-growth methodology and study its principles and extensions. We first introduce two interesting pattern-growth algorithms, FreeSpan and PrefixSpan, for efficient sequential pattern mining. Then we introduce gSpan for mining structured patterns using the same methodology. Their relative performance in large databases is presented and analyzed. Several extensions of these methods are also discussed in the paper, including mining multi-level, multi-dimensional patterns and mining constraint-based patterns.
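The pattern-growth idea named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each sequence is a list of single items (no itemsets), and the names `prefixspan`, `grow`, and `min_support` are chosen here for illustration only. Each frequent prefix is extended by items that are frequent in its projected database, which is stored as suffix pointers instead of copied sequences.

```python
from collections import defaultdict


def prefixspan(sequences, min_support):
    """Return {pattern: support} for sequences made of single items.

    Simplified PrefixSpan-style sketch (assumption: no itemsets, no
    constraints). Support is the number of sequences containing the pattern.
    """
    results = {}

    def grow(prefix, projected):
        # projected holds at most one (sequence_id, start_offset) pointer per
        # sequence; the suffix from start_offset is the projected sequence.
        counts = defaultdict(set)
        for seq_id, start in projected:
            for item in sequences[seq_id][start:]:
                counts[item].add(seq_id)

        for item, seq_ids in sorted(counts.items()):
            if len(seq_ids) < min_support:
                continue  # infrequent in this projection, so prune
            pattern = prefix + (item,)
            results[pattern] = len(seq_ids)

            # Project on the first occurrence of item in each suffix.
            new_projected = []
            for seq_id, start in projected:
                seq = sequences[seq_id]
                for pos in range(start, len(seq)):
                    if seq[pos] == item:
                        new_projected.append((seq_id, pos + 1))
                        break
            grow(pattern, new_projected)

    grow((), [(i, 0) for i in range(len(sequences))])
    return results


if __name__ == "__main__":
    db = [list("abc"), list("ac"), list("bc")]
    for pattern, sup in sorted(prefixspan(db, min_support=2).items()):
        print("".join(pattern), sup)
```

No candidate sequences are generated and tested against the whole database here; each recursive call only looks at the suffixes that follow the current prefix, which is the property that distinguishes pattern growth from the generation-and-test class.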
Data mining is a powerful emerging technology that helps to extract hidden information from a huge volume of historical data. This paper is concerned with finding the frequent trajectories of moving objects in spatio-temporal data by a novel method that adopts the concepts of clustering and sequential pattern mining. The algorithms logically split the trajectory span area into clusters and then apply the k-means algorithm over these clusters until the squared error is minimized. The new method applies a threshold to obtain active clusters and arranges them in descending order of the number of trajectories passing through them. From these active clusters, inter-cluster patterns are found by a sequential pattern mining technique. The process is repeated until all the active clusters are linked. The clusters thus linked in sequence are the frequent trajectories. A set of experiments conducted on real datasets shows that the proposed method performs about five times better than existing ones. A comparison is made with the results of other algorithms, and their variation is analyzed by statistical methods. Further, tests of significance are conducted with ANOVA to find an efficient threshold value for the optimum plot of frequent trajectories. The results are analyzed and found to be superior to those of existing methods. This approach may be relevant to finding alternate paths in busy networks (congestion control), finding the frequent paths of migratory birds, finding the frequent path of balls in certain games, or, with minor alterations, predicting the next level of pattern characteristics in time series data.
Mining sequential patterns from large databases has been recognized by many researchers as an attractive task in data mining and knowledge discovery. Previous algorithms scan the database many times, which is often prohibitively expensive because the databases are very large. In this paper, the authors introduce an effective algorithm for mining sequential patterns from large databases. In the algorithm, the original database is not used at all for counting the support of sequences after the first pass. Instead, a tidlist structure generated in the previous pass is used for this purpose, based on set intersection operations, which avoids multiple scans of the database.
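The abstract does not specify the tidlist structure, so the following is only a sketch of the general vertical-format idea under stated assumptions: each item keeps an id-list mapping sequence ids to the positions where it occurs, and longer patterns are counted by intersecting id-lists with a position check. The names `build_item_idlists`, `temporal_join`, and `support` are hypothetical, and sequences are assumed to be lists of single items.

```python
from collections import defaultdict


def build_item_idlists(sequences):
    """Single pass over the database: item -> {sequence_id: [positions]}."""
    idlists = defaultdict(lambda: defaultdict(list))
    for sid, seq in enumerate(sequences):
        for pos, item in enumerate(seq):
            idlists[item][sid].append(pos)
    return idlists


def temporal_join(prefix_idlist, item_idlist):
    """Id-list of 'prefix followed later by item', built from id-lists only.

    prefix_idlist maps sequence_id -> end positions of the prefix pattern.
    """
    joined = {}
    # Intersect the sequence ids, then keep only item positions that come
    # after the earliest position at which the prefix can end.
    for sid in prefix_idlist.keys() & item_idlist.keys():
        first_end = min(prefix_idlist[sid])
        later = [p for p in item_idlist[sid] if p > first_end]
        if later:
            joined[sid] = later
    return joined


def support(idlist):
    """Support is the number of distinct sequences in the id-list."""
    return len(idlist)


if __name__ == "__main__":
    db = [list("abcb"), list("ba"), list("acb")]
    item_lists = build_item_idlists(db)
    ab = temporal_join(item_lists["a"], item_lists["b"])
    abc = temporal_join(ab, item_lists["c"])
    print(support(ab), support(abc))
```

After `build_item_idlists` has run, `db` is never read again; every later support count comes from the id-lists produced in the previous step, which mirrors the single-scan property described in the abstract.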
Finding correlated sequential patterns in large sequence databases is an essential data mining task: a huge number of sequential patterns are usually mined, but it is hard to find those that are correlated. The data analysis needed also differs according to the requirements of real applications. In previous approaches, sequential patterns with weak affinity are found even with a high minimum support. This paper suggests a new framework for mining weighted support affinity patterns, in which an objective measure, sequential ws-confidence, is developed to detect correlated sequential patterns. To prune weak affinity patterns efficiently, the ws-confidence measure is proved to satisfy the anti-monotone and cross weighted support properties, which can be applied to eliminate sequential patterns with dissimilar weighted support levels. Based on this framework, a weighted support affinity pattern mining algorithm (WSMiner) is suggested. The performance study shows that WSMiner is efficient and scalable for mining weighted support affinity patterns.
A holistic understanding of wind behaviour over space, time and height is essential for wind energy harvesting applications. This study presents a novel approach for mapping frequent wind profile patterns using multidimensional sequential pattern mining (MDSPM). The approach is illustrated with a 24-year time series of European Centre for Medium-Range Weather Forecasts European Reanalysis-Interim gridded (0.125°×0.125°) wind data for the Netherlands, sampled every 6 h at six height levels. The wind data were first transformed into two spatio-temporal sequence databases (for speed and direction, respectively). Then, the Linear time Closed Itemset Miner Sequence algorithm was used to extract the multidimensional sequential patterns, which were visualized using a 3D wind rose, a circular histogram and a geographical map. These patterns were further analysed to determine their wind shear coefficients and turbulence intensities, as well as their spatial overlap with areas that currently have wind turbines. Our analysis identified four frequent wind profile patterns. One of them is highly suitable for harvesting wind energy at a height of 128 m, and 68.97% of the geographical area covered by this pattern already contains wind turbines. This study shows that the proposed approach is capable of efficiently extracting meaningful patterns from complex spatio-temporal datasets.
The rapid development of network technology and its evolution toward heterogeneous networks has increased the demand for automatic monitoring and management of heterogeneous wireless communication networks. This paper presents a multilevel pattern mining architecture to support automatic network management by discovering interesting patterns from telecom network monitoring data. This architecture leverages and combines existing techniques for frequent itemset discovery over data streams, association rule deduction, frequent sequential pattern mining, and frequent temporal pattern mining, while also making use of distributed processing platforms to achieve high-volume throughput.
This paper expounds the basic principles and structures of the whole petroleum system to reveal the pattern of sequential accumulation of conventional oil/gas, tight oil/gas, and shale oil/gas, and the hydrocarbon accumulation models and mechanisms of the whole petroleum system. It delineates the geological model, flow model, and production mechanism of shale and tight reservoirs, and proposes future research orientations. The main structure of the whole petroleum system includes three fluid dynamic fields, three types of oil and gas reservoirs/resources, and two types of reservoir-forming processes. Conventional oil/gas, tight oil/gas, and shale oil/gas are orderly in generation time and spatial distribution, and sequentially rational in genetic mechanism, showing the pattern of sequential accumulation. The whole petroleum system involves two categories of hydrocarbon accumulation models: hydrocarbon accumulation in the detrital basin and hydrocarbon accumulation in the carbonate basin/formation. The accumulation of unconventional oil/gas is self-contained and is microscopically driven by the intermolecular force (van der Waals force). Unconventional oil/gas production has shown that the geological model, flow model, and production mechanism of shale and tight reservoirs represent a new and complex field that needs further study. Shale oil/gas is bound to be the most important replacement resource for oil and gas in China. Future research efforts include: (1) the characteristics of the whole petroleum system in carbonate basins and the source-reservoir coupling patterns in the evolution of composite basins; (2) flow mechanisms in migration, accumulation, and production of shale oil/gas and tight oil/gas; (3) geological characteristics and enrichment of deep and ultra-deep shale oil/gas, tight oil/gas and coalbed methane; (4) resource evaluation and a new generation of basin simulation technology for the whole petroleum system; (5) research on the earth system, the earth organic rock and fossil fuel system, and the whole petroleum system.
Pedestrian group detection is a challenging but significant issue in pedestrian flow control and public safety management. To address the issue that most conventional pedestrian grouping models (PGMs) can only identify a pedestrian group at a limited distance of less than 2 m, this study extended the pedestrian distance constraint of conventional PGMs by reconstructing the normal group detection criterion and developing a novel group detection criterion suitable for long-span space. To measure the similarity of movement behavior at normal distance, five necessary constraints (velocity difference, moving direction offset, distance limitation, distance fluctuation, and group-keeping duration) were studied quantitatively to form the criterion for detecting normal groups. Meanwhile, a long-span group detection criterion was proposed with extended distance and direction consistency constraints. On this basis, this study proposed an improved PGM that considers long-span spaces (PGMLS). In the PGMLS workflow, the MMTrack algorithm was used to obtain pedestrian trajectories. A difference measurement method based on sequential pattern analysis (SPA) was adopted to analyze the velocity similarity of pedestrians. To validate the proposed grouping model, experiments based on pedestrian movement videos in the exit hall of the Shanghai Hongqiao International Airport were conducted. The results indicate that the proposed model can detect both normal and widely separated pedestrian groups, with a long span range of 2-12 m.
Recommender systems, as one of the most efficient information filtering techniques, have been widely studied in recent years. However, traditional recommender systems only utilize the user-item rating matrix for recommendations, and social connections and item sequential patterns are ignored. In real life, however, we often turn to our friends for recommendations and often select items that have similar sequential patterns. To overcome these limitations, many studies have taken social connections and sequential information into account to enhance recommender systems. Although these studies have achieved good results, most of them regard social influence and sequential information as regularization terms, and the deep structure hidden in social networks and rating patterns has not been fully explored. On the other hand, neural network based embedding methods have shown their power in many recommendation tasks through their ability to extract high-level representations from raw data. Motivated by these observations, we take advantage of network embedding techniques and propose an embedding-based recommendation method, which is composed of an embedding model and a collaborative filtering model. Specifically, to exploit the deep structure hidden in social networks and rating patterns, a neural network based embedding model is first pre-trained, from which the external user and item representations are extracted. Then, we incorporate these extracted factors into a collaborative filtering model by fusing them linearly with latent factors, so that our method can both leverage the external information to enhance recommendation and exploit the advantages of collaborative filtering techniques. Experimental results on two real-world datasets demonstrate the effectiveness of our proposed method and the importance of these external extracted factors.
This paper proposes an enhanced cascading failure model that integrates data mining techniques. To better simulate the process of cascading failure propagation and to further analyze the relationships between failure chains, the paper starts from a basic cascading failure framework and makes significant improvements to the emerging prevention and control measures, the subsequent failure search strategy, and the statistical analysis of the failure chains. In particular, a sequential pattern mining model is employed to find associations among the obtained failure chains. In addition, a cluster analysis model is applied to evaluate the relationship between the intermediate data and the consequence of each obtained failure chain, which can help predict the potential propagation path of a cascading failure and so reduce the risk of catastrophic events. Finally, case studies are conducted on the IEEE 10-machine 39-bus test system as a benchmark to demonstrate the validity and effectiveness of the proposed enhanced cascading failure model. Some preliminary conclusions and comments are drawn.
In the era of global Internet security threats, there is an urgent need for different organizations to cooperate and jointly fight against cyber attacks. We present an algorithm that combines a privacy-preserving technique and a multi-step attack-correlation method to better balance the privacy and availability of alarm data. This algorithm is used to construct multi-step attack scenarios by discovering sequential attack-behavior patterns. It analyzes the time-sequential characteristics of attack behaviors and implements a support-evaluation method. Optimized candidate attack-sequence generation is applied to solve the problem of pre-defined association-rule complexity, as well as expert-knowledge dependency. An enhanced k-anonymity method is applied to this algorithm to preserve privacy. Experimental results indicate that the algorithm has better performance and accuracy for multi-step attack correlation than other methods, and reaches a good balance between efficiency and privacy.
From a data mining perspective, sequence classification means building a classifier using frequent sequential patterns. However, mining a complete set of sequential patterns on a large dataset can be extremely time-consuming, and the large number of patterns discovered also makes pattern selection and classifier building very time-consuming. In fact, in sequence classification it is much more important to discover discriminative patterns than a complete pattern set. In this paper, we propose a novel hierarchical algorithm to build sequential classifiers using discriminative sequential patterns. Firstly, we mine for the sequential patterns that are the most strongly correlated to each target class. In this step, an aggressive strategy is employed to select a small set of sequential patterns. Secondly, pattern pruning and a serial coverage test are performed on the mined patterns. The patterns that pass the serial test are used to build the sub-classifier at the first level of the final classifier. Thirdly, the training samples that cannot be covered are fed back to the sequential pattern mining stage with updated parameters. This process continues until predefined interestingness measure thresholds are reached or all samples are covered. The patterns generated in each loop form the sub-classifier at each level of the final classifier. Within this framework, the search space can be reduced dramatically while good classification performance is achieved. The proposed algorithm is tested in a real-world business application for debt prevention in the social security area. The novel sequence classification algorithm proves effective and efficient for predicting debt occurrences based on customer activity sequence data.
Funding (visual fire detection study): Supported by the National Science Council under Grant No. NSC98-2221-E-218-046.
Funding (FISP interactive mining study): Supported by the National Natural Science Foundation of China (70371015) and the Natural Science Foundation of Jiangsu Province (BK2004058).
Funding (frequent trajectory mining study): Research supported by a scholarship from TATA Consultancy Services.
Funding (wind profile pattern study): This work was supported by the Malaysian Ministry of Education (SLAI) and Universiti Teknologi Malaysia (UTM).
Funding (multilevel pattern mining for network management study): Funded by the Enterprise Ireland Innovation Partnership Programme with Ericsson under grant agreement IP/2011/0135 [6]; supported by the National Natural Science Foundation of China (Nos. 61373131, 61303039, 61232016, 61501247) and the PAPD and CICAEET funds.
Funding (whole petroleum system study): Supported by the National Natural Science Foundation of China (U22B6002) and the PetroChina Science Research and Technology Development Project (2021DJ0101).
Funding (pedestrian group detection study): Supported by the National Natural Science Foundation of China (No. 72074170).
Funding: This work was supported by the National Natural Science Foundation of China under Grant Nos. 61602282, 61772321, 61472231 and 71301086, and the China Postdoctoral Science Foundation under Grant No. 2016M602181.
Abstract: Recommender systems, as one of the most efficient information filtering techniques, have been widely studied in recent years. However, traditional recommender systems use only the user-item rating matrix for recommendations and ignore social connections and item sequential patterns. In real life, people often turn to friends for recommendations and tend to select items that have similar sequential patterns. To address these limitations, many studies have taken social connections and sequential information into account to enhance recommender systems. Although these studies have achieved good results, most of them treat social influence and sequential information as regularization terms, and the deep structure hidden in social networks and rating patterns has not been fully explored. On the other hand, neural network based embedding methods have shown their power in many recommendation tasks through their ability to extract high-level representations from raw data. Motivated by these observations, we take advantage of network embedding techniques and propose an embedding-based recommendation method composed of an embedding model and a collaborative filtering model. Specifically, to exploit the deep structure hidden in social networks and rating patterns, a neural network based embedding model is first pre-trained to extract external user and item representations. We then incorporate these extracted factors into a collaborative filtering model by fusing them linearly with latent factors, so that the method can both leverage the external information to enhance recommendation and exploit the advantages of collaborative filtering techniques. Experimental results on two real-world datasets demonstrate the effectiveness of the proposed method and the importance of the external extracted factors.
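The linear fusion step described in the abstract can be pictured with a small Python sketch. It assumes that the pre-trained external representation and the latent factor vector share the same dimension, and that the fused vectors are scored with a dot product; the names `fuse` and `predict_rating` and the weights `alpha` and `beta` are hypothetical, since the abstract does not state the exact fusion formula.

```python
def fuse(latent, external, weight):
    """Linearly combine a latent factor vector with an external embedding."""
    if len(latent) != len(external):
        raise ValueError("vectors must have the same dimension")
    return [l + weight * e for l, e in zip(latent, external)]


def predict_rating(p_u, e_u, q_i, f_i, alpha=0.5, beta=0.5):
    """Score a user-item pair from fused user and item vectors.

    p_u, q_i: latent factors learned by collaborative filtering.
    e_u, f_i: external representations from a pre-trained embedding model.
    alpha, beta: fusion weights (illustrative defaults).
    """
    user = fuse(p_u, e_u, alpha)
    item = fuse(q_i, f_i, beta)
    return sum(u * v for u, v in zip(user, item))
```

Setting `alpha` and `beta` to zero reduces this sketch to plain matrix factorization scoring, which makes the contribution of the external factors easy to isolate.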
Funding: Supported by the National Basic Research Program of China (973 Program, 2013CB228203).
Abstract: This paper proposes an enhanced cascading failure model that integrates data mining techniques. To better simulate the propagation of cascading failures and to analyze the relationships between failure chains, the paper starts from a basic cascading failure framework and improves the emergency prevention and control measures, the subsequent failure search strategy, and the statistical analysis of failure chains. In particular, a sequential pattern mining model is employed to find associations among the obtained failure chains. In addition, a cluster analysis model is applied to evaluate the relationship between the intermediate data and the consequences of each obtained failure chain, which helps predict the potential propagation path of a cascading failure and reduce the risk of catastrophic events. Finally, case studies are conducted on the IEEE 10-machine 39-bus test system as a benchmark to demonstrate the validity and effectiveness of the proposed model, and some preliminary concluding remarks are drawn.
Funding: This work is supported by the Ordinary University Innovation Project of Guangdong Province (Nos. 2014KTSCX212, 2014KQNCX24).
Abstract: In the era of global Internet security threats, there is an urgent need for different organizations to cooperate and jointly fight against cyber attacks. We present an algorithm that combines a privacy-preserving technique and a multi-step attack-correlation method to better balance the privacy and availability of alarm data. This algorithm constructs multi-step attack scenarios by discovering sequential attack-behavior patterns. It analyzes the time-sequential characteristics of attack behaviors and implements a support-evaluation method. Optimized candidate attack-sequence generation is applied to address the complexity of pre-defined association rules and the dependency on expert knowledge. An enhanced k-anonymity method is applied to preserve privacy. Experimental results indicate that the algorithm achieves better performance and accuracy for multi-step attack correlation than other methods, and reaches a good balance between efficiency and privacy.
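To make the idea of support evaluation for sequential attack-behavior patterns concrete, here is a minimal Python sketch. It assumes each alarm is a `(timestamp, alert_type)` pair, that alarms are grouped into one list per attack session, and that support is the fraction of sessions containing the candidate pattern as an ordered subsequence. These definitions and the function names are illustrative assumptions; the abstract does not specify the paper's support measure, candidate generation, or k-anonymity procedure.

```python
def alert_sequence(alerts):
    """Order (timestamp, alert_type) pairs by time and keep only the types."""
    return [alert_type for _, alert_type in sorted(alerts, key=lambda a: a[0])]


def is_subsequence(pattern, sequence):
    """True if the pattern's steps occur in the sequence in the same order."""
    it = iter(sequence)
    return all(step in it for step in pattern)


def support(pattern, sessions):
    """Fraction of sessions whose alert sequence contains the pattern."""
    if not sessions:
        return 0.0
    hits = sum(is_subsequence(pattern, alert_sequence(s)) for s in sessions)
    return hits / len(sessions)
```

A candidate attack sequence would then be kept only if its support reaches a chosen minimum, which is the usual pruning rule in sequential pattern mining.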
Funding: Supported by the Australian Research Council Linkage Project under Grant No. LP0775041 and the Early Career Researcher Grant under Grant No. 2007002448 from the University of Technology, Sydney, Australia.
Abstract: From a data mining perspective, sequence classification means building a classifier from frequent sequential patterns. However, mining a complete set of sequential patterns on a large dataset can be extremely time-consuming, and the large number of patterns discovered also makes pattern selection and classifier building very time-consuming. In sequence classification, discovering discriminative patterns is much more important than discovering a complete pattern set. In this paper, we propose a novel hierarchical algorithm to build sequential classifiers using discriminative sequential patterns. First, we mine the sequential patterns that are most strongly correlated with each target class. In this step, an aggressive strategy is employed to select a small set of sequential patterns. Second, pattern pruning and a serial coverage test are performed on the mined patterns. The patterns that pass the serial test are used to build the sub-classifier at the first level of the final classifier. Third, the training samples that cannot be covered are fed back to the sequential pattern mining stage with updated parameters. This process continues until predefined interestingness measure thresholds are reached or all samples are covered. The patterns generated in each loop form the sub-classifier at the corresponding level of the final classifier. Within this framework, the search space can be reduced dramatically while good classification performance is achieved. The proposed algorithm is tested in a real-world business application for debt prevention in the social security area. The results show that the algorithm is effective and efficient at predicting debt occurrences from customer activity sequence data.
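The mine, cover, and feed-back loop described in the abstract can be sketched in Python as follows. This is a simplified illustration and not the paper's algorithm: the toy miner `mine_pairs` only considers ordered pairs of events, the parameter update is a plain multiplicative relaxation of the support threshold, and the names `build_hierarchy`, `classify`, `floor`, and `decay` are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def is_subsequence(pattern, sequence):
    """True if the pattern's events occur in the sequence in the same order."""
    it = iter(sequence)
    return all(step in it for step in pattern)


def mine_pairs(samples, min_support, min_confidence):
    """Toy miner: ordered event pairs that are frequent and class-specific."""
    counts = defaultdict(Counter)              # pattern -> label counts
    for seq, label in samples:
        for pair in set(combinations(seq, 2)):  # pairs keep sequence order
            counts[pair][label] += 1
    rules = {}
    for pattern, by_label in counts.items():
        label, hits = by_label.most_common(1)[0]
        total = sum(by_label.values())
        if hits / len(samples) >= min_support and hits / total >= min_confidence:
            rules[pattern] = label
    return rules


def build_hierarchy(samples, min_support=0.5, min_confidence=0.9,
                    floor=0.1, decay=0.5):
    """Build one sub-classifier per loop from the still-uncovered samples."""
    levels, remaining = [], list(samples)
    while remaining and min_support >= floor:
        rules = mine_pairs(remaining, min_support, min_confidence)
        if rules:
            levels.append(rules)
        # Coverage test: keep only samples that no mined pattern matches.
        remaining = [(s, y) for s, y in remaining
                     if not any(is_subsequence(p, s) for p in rules)]
        min_support *= decay                   # relax the threshold and retry
    return levels


def classify(levels, seq, default=None):
    """Return the label of the first matching pattern, level by level."""
    for rules in levels:
        for pattern, label in rules.items():
            if is_subsequence(pattern, seq):
                return label
    return default
```

Because each loop only mines the samples left uncovered by earlier levels, later levels work on progressively smaller data, which is the source of the search space reduction the abstract refers to.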