Funding: the Russian Foundation for Basic Research (Grant No. 20-07-00140) and the Ministry of Science and Higher Education of the Russian Federation (Government Order FENU-2020-0022).
Abstract: A discord is a refinement of the concept of an anomalous subsequence of a time series. As one of the topical issues in time series mining, discord discovery is applied in a wide range of real-world areas (medicine, astronomy, economics, climate modeling, predictive maintenance, energy consumption, etc.). In this article, we propose a novel parallel algorithm for discord discovery on a high-performance cluster with nodes based on many-core accelerators, for the case when the time series cannot fit in main memory. We assume that the time series is partitioned across the cluster nodes and achieve parallelization among the cluster nodes as well as within a single node. Within a cluster node, the algorithm employs a set of matrix data structures to store and index the subsequences of the time series and to provide efficient vectorization of computations on the accelerator. At each node, the algorithm processes its own partition in two phases, namely candidate selection and discord refinement, with each phase requiring one linear scan through the partition. The local discords found are then combined into a global candidate set, which is transmitted to each cluster node. Next, each node refines the global candidate set over its own partition, resulting in a local true discord set. Finally, the global true discord set is constructed as the intersection of the local true discord sets. The experimental evaluation on a real computer cluster with real and synthetic time series shows high scalability of the proposed algorithm.
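The two-phase scheme and the final intersection step can be illustrated with a minimal single-process sketch. It is not the authors' implementation: it assumes a range-discord definition (a subsequence whose nearest non-overlapping neighbor is at least a threshold `r` away), uses plain Euclidean distance, simulates the cluster nodes as list partitions, and omits the matrix data structures and accelerator vectorization entirely. All function names, the threshold `r`, and the toy series are hypothetical.

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length subsequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def select_candidates(part, m, r):
    """Phase 1 (assumed form): one scan that keeps every subsequence
    which has not yet met a non-overlapping neighbor closer than r."""
    cands = {}
    for i, s in part:
        is_cand = True
        for j in list(cands):
            if abs(i - j) >= m and dist(s, cands[j]) < r:
                del cands[j]      # the candidate has a close neighbor
                is_cand = False   # and so does the current subsequence
        if is_cand:
            cands[i] = s
    return cands

def refine(cands, part, m, r):
    """Phase 2 (assumed form): one scan that drops every candidate
    with a non-overlapping neighbor closer than r in this partition."""
    alive = dict(cands)
    for i, s in part:
        for j in list(alive):
            if abs(i - j) >= m and dist(s, alive[j]) < r:
                del alive[j]
    return alive

def distributed_discords(series, m, r, n_nodes):
    """Simulate the per-node processing and the global merge."""
    n_sub = len(series) - m + 1
    step = math.ceil(n_sub / n_nodes)
    # Each simulated node owns the subsequences starting in its index range.
    parts = [
        [(i, series[i:i + m]) for i in range(k * step, min((k + 1) * step, n_sub))]
        for k in range(n_nodes)
    ]
    # Local discords from every node form the global candidate set.
    global_cands = {}
    for part in parts:
        global_cands.update(refine(select_candidates(part, m, r), part, m, r))
    # Every node refines the global candidates over its own partition.
    local_true = [set(refine(global_cands, part, m, r)) for part in parts]
    # A global discord must survive refinement on every node.
    return sorted(set.intersection(*local_true))

# Toy data: a periodic signal with a single spike at index 17.
series = [0, 1, 0, -1] * 8
series[17] = 5
print(distributed_discords(series, m=4, r=3.0, n_nodes=3))
```

In this toy run, only the four windows that contain the spike should be reported, regardless of how many simulated nodes are used, because a subsequence that looks anomalous inside one partition is discarded as soon as any other partition holds a close match.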
Funding: the Creative Research Team Foundation of the National Natural Science Foundation of China (No. 50221903).
Abstract: This paper describes a parallel computing platform for the digital watershed model that uses existing facilities. A distributed multi-layered structure is applied to the computer cluster system, and MPI-2 is adopted as a mature parallel programming standard. An agent is introduced that enables multi-level fault tolerance in software development. A communication protocol based on a checkpointing and rollback recovery mechanism enables transaction reprocessing. Compared with the conventional platform, the new system makes better use of computing resources. Experimental results show that the speedup ratio of the platform is almost 4 times that of the conventional one, which demonstrates the high efficiency and good performance of the new approach.
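The general idea of checkpointing with rollback recovery and transaction reprocessing can be sketched as follows. This is only an illustration of the technique named in the abstract, not the platform's protocol: it runs in a single process without MPI, and the class `CheckpointedWorker`, its methods, and the simulated failure are all hypothetical.

```python
import copy

class CheckpointedWorker:
    """Hypothetical worker that checkpoints its state after each task."""

    def __init__(self):
        self.state = {"done": [], "total": 0.0}
        self._checkpoint = copy.deepcopy(self.state)

    def checkpoint(self):
        # Save a consistent copy of the current state.
        self._checkpoint = copy.deepcopy(self.state)

    def rollback(self):
        # Discard partial updates and restore the last saved state.
        self.state = copy.deepcopy(self._checkpoint)

    def run(self, tasks, compute, max_retries=3):
        for task in tasks:
            for _ in range(max_retries):
                try:
                    # The state is updated before the risky call on purpose,
                    # so that a failure leaves a partial update to undo.
                    self.state["done"].append(task)
                    self.state["total"] += compute(task)
                    self.checkpoint()
                    break
                except RuntimeError:
                    self.rollback()  # then reprocess the same task
            else:
                raise RuntimeError(f"task {task!r} failed {max_retries} times")
        return self.state

# Simulated failure: task 2 fails once, then succeeds when reprocessed.
failures = {2: 1}

def flaky(task):
    if failures.get(task, 0) > 0:
        failures[task] -= 1
        raise RuntimeError("simulated node failure")
    return task * 10.0

print(CheckpointedWorker().run([1, 2, 3], flaky))
```

The point of the sketch is that a failed task leaves no trace in the recorded state: the partial update is rolled back to the last checkpoint before the task is retried.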
Funding: National Basic Research Program of China (973 Program), No. 2010CB950904; National Key Technology Research & Development Program, No. 2008BAH31B04; Swedish Science Foundation, No. 348-2006-6638.
Abstract: Urban clusters are the expected products of high levels of industry and urbanization in a country, as well as being the basic units of participation in global competition. In China, urban clusters are regarded as the dominant formation for boosting the urbanization process. However, to date, there is no consistent, efficient, and credible methodological system or set of techniques for identifying Chinese urban clusters. This research investigates the potential of a computerized identification method supported by geographic information techniques to provide a better understanding of the distribution of Chinese urban clusters. The identification method is based on a geographic information database, a digital elevation model, and socio-economic data, and is implemented with ArcInfo Macro Language programming. In the method, preliminary boundaries are identified according to transportation accessibility, and final identifications are obtained by applying limits on the number of cities, population, and GDP in a region with the aid of the rasterized socio-economic dataset. The results show that the method identifies nine Chinese urban clusters, i.e., the Pearl River Delta, Lower Yangtze River Valley, Beijing-Tianjin-Hebei Region, Northeast China Plain, Middle Yangtze River Valley, Central China Plains, Western Taiwan Strait, Guanzhong, and Chengdu-Chongqing urban clusters. This research represents the first study involving the computerized identification of Chinese urban clusters. Moreover, compared to other related studies, the study's approach, which combines transportation accessibility and socio-economic characteristics, is shown to be a distinct, effective, and reliable way of identifying urban clusters.