Funding: Project (KC18071) supported by the Application Foundation Research Program of Xuzhou, China; Projects (2017YFC0804401, 2017YFC0804409) supported by the National Key R&D Program of China.
Abstract: The sharp increase in the amount of Chinese text data on the Internet has significantly prolonged the time needed to classify it. To solve this problem, this paper proposes and implements a parallel naive Bayes algorithm (PNBA) for Chinese text classification based on Spark, a parallel in-memory computing platform for big data. The algorithm parallelizes the entire training and prediction process of the naive Bayes classifier, mainly by adopting the programming model of resilient distributed datasets (RDD). For comparison, a PNBA based on Hadoop is also implemented. The test results show that, in the same computing environment and for the same text sets, the Spark PNBA is clearly superior to the Hadoop PNBA in terms of key indicators such as speedup ratio and scalability. Therefore, Spark-based parallel algorithms can better meet the requirements of large-scale Chinese text data mining.
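The abstract does not give the PNBA itself, so the following is only a minimal sketch of the general idea: naive Bayes training reduces to counting, and counting splits naturally into a per-partition map step and a merging reduce step. Plain Python `map` and `functools.reduce` stand in for the RDD operations here; the function names, the pre-tokenized input format, and the add-one smoothing are all assumptions made for illustration (Chinese text would first need word segmentation).

```python
import math
from collections import Counter
from functools import reduce


def count_partition(docs):
    """Map step: count classes and (class, word) pairs in one partition."""
    class_counts, word_counts = Counter(), Counter()
    for label, words in docs:
        class_counts[label] += 1
        for w in words:
            word_counts[(label, w)] += 1
    return class_counts, word_counts


def merge(a, b):
    """Reduce step: add up the counters produced by two partitions."""
    return a[0] + b[0], a[1] + b[1]


def train(partitions):
    """Combine per-partition counts into one model (hypothetical layout)."""
    class_counts, word_counts = reduce(merge, map(count_partition, partitions))
    vocab = {w for (_, w) in word_counts}
    totals = Counter()  # total number of word occurrences per class
    for (label, _), n in word_counts.items():
        totals[label] += n
    return class_counts, word_counts, totals, vocab


def predict(model, words):
    """Return the class with the highest log-posterior for a tokenized text."""
    class_counts, word_counts, totals, vocab = model
    n_docs = sum(class_counts.values())

    def score(label):
        s = math.log(class_counts[label] / n_docs)
        for w in words:
            # Add-one (Laplace) smoothing, an assumption of this sketch.
            s += math.log((word_counts[(label, w)] + 1)
                          / (totals[label] + len(vocab)))
        return s

    return max(class_counts, key=score)
```

Because each partition is counted independently and the merge is associative, the same two functions could be handed to a distributed map/reduce engine without changing the model that results.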
Funding: Supported by the China State Key Science and Technology Project on Marine Carbonate Reservoir Characterization (No. 2008ZX05004-006).
Abstract: Frequency-domain wave equation forward modeling requires solving large-scale sparse linear systems, which is often limited by computational efficiency and memory storage. Conventional Gaussian elimination cannot handle the parallel computation of huge data. Therefore, we use the Gaussian elimination with static pivoting (GESP) method for sparse matrix decomposition and multi-source finite-difference modeling. The GESP method not only improves computational efficiency but also benefits the distributed parallel computation of matrix decomposition within a single frequency point. We test the proposed method using the classic Marmousi model. Both the single-frequency wave field and the time-domain seismic section show that the proposed method improves simulation accuracy and computational efficiency, and saves and makes full use of memory. This method can lay the basis for waveform inversion.
Funding: This paper was supported by the National Natural Science Foundation of China under Grants No. 61001091 and No. 61271118.
Abstract: Wireless Sensor Network (WSN) nodes are severely limited by their power, communication bandwidth, and storage space, and traditional signature algorithms are not suitable for WSN environments. In this paper, we present a ring signature scheme designed for WSNs. In this scheme, all of the wireless sensor nodes are divided into several sub-groups, and the sub-group nodes are used to generate the signature instead of the WSN cluster nodes. This scheme can effectively avoid the single-node failure problem, and it also has high availability. All nodes are free to sign their own messages, and the nodes that generate signatures can simultaneously calculate their own part of the signature, meeting the requirements of distributed parallel computing. Compared with the traditional ring signature, this scheme reduces energy consumption and is therefore very suitable for WSNs.
Funding: This work is supported by the National Natural Science Foundation of China under Grants No. 61370091 and No. 61170200, the Jiangsu Province Science and Technology Support Program (Industry) Project under Grant No. BE2012179, and the Program Sponsored for Scientific Innovation Research of College Graduate in Jiangsu Province under Grant No. CXZZ12_0229.
Abstract: In this paper, we propose a novel spatial data index based on Hadoop: HQ-Tree. In HQ-Tree, we use a PR QuadTree to solve the problem of poor efficiency in parallel processing, which is caused by data insertion order and space overlapping. Because HDFS cannot support random writes, we propose an updating mechanism, called "Copy Write", to support index updates. Additionally, HQ-Tree employs a two-level index caching mechanism to reduce the cost of network transfer and I/O operations. Finally, we develop MapReduce-based algorithms, which significantly enhance the efficiency of index creation and query. Experimental results demonstrate the effectiveness of our methods.
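To make the role of the PR quadtree concrete, here is a minimal in-memory sketch, not the HQ-Tree itself. A point-region quadtree always splits a cell into four equal, non-overlapping quadrants, so the resulting decomposition depends only on the set of points and not on the order in which they arrive. The class name, the `capacity` and `max_depth` parameters, and the half-open bounds convention are assumptions made for illustration.

```python
class PRQuadTree:
    """Point-region quadtree over the half-open box [x0, x1) x [y0, y1)."""

    def __init__(self, x0, y0, x1, y1, capacity=1, depth=0, max_depth=16):
        self.bounds = (x0, y0, x1, y1)
        self.capacity = capacity      # points a leaf holds before it splits
        self.depth = depth
        self.max_depth = max_depth    # guard against unbounded subdivision
        self.points = []
        self.children = None          # None while this node is a leaf

    def insert(self, x, y):
        x0, y0, x1, y1 = self.bounds
        if not (x0 <= x < x1 and y0 <= y < y1):
            return False              # point lies outside this cell
        if self.children is not None:
            return self._child_for(x, y).insert(x, y)
        if (x, y) not in self.points:  # ignore exact duplicates
            self.points.append((x, y))
            if len(self.points) > self.capacity and self.depth < self.max_depth:
                self._split()
        return True

    def _child_for(self, x, y):
        x0, y0, x1, y1 = self.bounds
        mx, my = (x0 + x1) / 2, (y0 + y1) / 2
        return self.children[(1 if x >= mx else 0) + (2 if y >= my else 0)]

    def _split(self):
        x0, y0, x1, y1 = self.bounds
        mx, my = (x0 + x1) / 2, (y0 + y1) / 2
        boxes = [(x0, y0, mx, my), (mx, y0, x1, my),
                 (x0, my, mx, y1), (mx, my, x1, y1)]
        self.children = [PRQuadTree(*b, self.capacity, self.depth + 1,
                                    self.max_depth) for b in boxes]
        points, self.points = self.points, []
        for px, py in points:         # push stored points down one level
            self._child_for(px, py).insert(px, py)

    def query(self, qx0, qy0, qx1, qy1):
        """Return all points inside the half-open query box."""
        x0, y0, x1, y1 = self.bounds
        if qx1 <= x0 or qx0 >= x1 or qy1 <= y0 or qy0 >= y1:
            return []                 # query box misses this cell entirely
        if self.children is None:
            return [(x, y) for x, y in self.points
                    if qx0 <= x < qx1 and qy0 <= y < qy1]
        return [p for c in self.children for p in c.query(qx0, qy0, qx1, qy1)]

    def leaves(self):
        """List (bounds, sorted points) for every leaf, in a fixed order."""
        if self.children is None:
            return [(self.bounds, sorted(self.points))]
        return [leaf for c in self.children for leaf in c.leaves()]
```

Inserting the same points in two different orders and comparing `leaves()` gives identical results, which is the property that lets separate workers build parts of such an index independently.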
Funding: Supported by the Sharing and Diffusion of National R&D Outcome, funded by the Korea Institute of Science and Technology Information.
Abstract: A distributed/parallel-processing system such as Sun Grid Engine (SGE), which utilizes multiple nodes/cores, is proposed for faster processing of large satellite image data. Verification shows that the distributed processing environment can improve pre-processing performance by up to 560.65% compared with a single processing system. Through this, analysis performance in various fields can be improved and, moreover, near-real-time service can be achieved in the near future.
Abstract: This paper provides an overview of the main recommendations and approaches of a methodology for developing parallel computation applications for hybrid structures. This methodology was developed within the master's thesis project "Optimization of complex tasks' computation on hybrid distributed computational structures" carried out by Orekhov, in which the main research objective was to determine patterns in the behavior of scaling efficiency and other parameters that define the performance of different algorithm implementations executed on hybrid distributed computational structures. The major outcomes and dependencies obtained within the master's thesis project were formed into a methodology that covers the problems of applications based on parallel computation and describes their development process in detail, offering easy ways of avoiding potentially crucial problems. The paper is backed by real-life examples, such as clustering algorithms, instead of artificial benchmarks.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 10972029 and 40906044) and the Youth Scientific Research Foundation, PLA University of Science and Technology (Grant No. 20110510).
Abstract: The effective propagation constants of plane longitudinal and shear waves in nanoporous material with randomly distributed parallel cylindrical nanoholes are studied. Surface elasticity theory is used to account for surface stress effects and to derive the nontraditional boundary condition on the surface of the nanoholes. The plane wave expansion method is used to obtain the waves scattered from a single nanohole. Multiple scattering effects are taken into consideration by summing the scattered waves from all scatterers and performing configuration averaging over the randomly distributed scatterers. The effective propagation constants of coherent waves, along with the associated dynamic effective elastic modulus, are numerically evaluated. The influences of surface stress are discussed based on the numerical results.
Funding: Project supported by Sama Technical and Vocational Training College, Islamic Azad University, Shoushtar Branch, Shoushtar, Iran.
Abstract: Optimized task scheduling is one of the most important challenges in achieving high performance in multiprocessor environments such as parallel and distributed systems. Most task-scheduling algorithms introduced so far are based on the so-called list scheduling technique. The basic idea behind list scheduling is to prepare a sequence of nodes in the form of a list by assigning them priorities, and then to repeatedly remove the node with the highest priority from the list and allocate it to the processor that provides the earliest start time (EST). It can therefore be inferred that the makespans obtained are dominated by two major factors: (1) the order in which tasks are selected (sequence subproblem); (2) how the selected tasks are assigned to the processors (assignment subproblem). A number of good approaches for overcoming the task sequence dilemma have been proposed in the literature, while the task assignment problem has not been studied much. The results of this study prove that assigning tasks to processors using the traditional EST method is not optimal; in addition, a novel approach based on the ant colony optimization algorithm is introduced, which can find far better solutions.
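The list scheduling loop with the EST rule described above can be sketched as follows. This is the traditional baseline the abstract refers to, not the ant colony approach the paper proposes. The function name and argument layout are hypothetical, and the sketch assumes identical processors and zero communication cost between tasks.

```python
def list_schedule(durations, preds, priority, num_procs):
    """List scheduling with the earliest-start-time (EST) assignment rule.

    durations: task -> execution time
    preds:     task -> set of tasks that must finish first (may omit tasks)
    priority:  task -> number, higher values are scheduled first
    Returns (placement, makespan), placement being task -> (processor, start).
    """
    finish = {}                       # task -> finish time
    proc_free = [0.0] * num_procs     # time at which each processor is idle
    placement = {}
    remaining = set(durations)

    while remaining:
        # A task is ready once all of its predecessors have been scheduled.
        ready = [t for t in sorted(remaining)
                 if all(p in finish for p in preds.get(t, ()))]
        if not ready:
            raise ValueError("dependency cycle detected")

        # Sequence subproblem: take the ready task with the highest priority.
        task = max(ready, key=lambda t: priority[t])

        # The task cannot start before its last predecessor has finished.
        data_ready = max((finish[p] for p in preds.get(task, ())), default=0.0)

        # Assignment subproblem: pick the processor with the earliest start.
        proc = min(range(num_procs),
                   key=lambda p: max(proc_free[p], data_ready))
        start = max(proc_free[proc], data_ready)

        placement[task] = (proc, start)
        finish[task] = start + durations[task]
        proc_free[proc] = finish[task]
        remaining.remove(task)

    return placement, max(finish.values())
```

The EST choice is greedy: it looks only at the task in hand, which is why a different assignment rule can, in principle, produce a shorter makespan for the same task order.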
Funding: Supported in part by the National Natural Science Foundation of China under Grant Nos. 61375105 and 61403334, and by the Chinese Postdoctoral Science Foundation under Grant No. 2015M581318.
Abstract: In this paper, the output consensus problem of general heterogeneous nonlinear multi-agent systems subject to different disturbances is considered. A Takagi-Sugeno fuzzy modeling method is used to describe the nonlinear agents' dynamics. Based on this model, a distributed fuzzy observer and controller are designed using the parallel distributed compensation scheme and internal reference models, such that the heterogeneous nonlinear multi-agent systems can achieve output consensus. A necessary and sufficient condition is then presented for the output consensus problem. It is shown that the consensus trajectory of the global fuzzy model is determined by the network topology and the initial states of the internal reference models. Finally, some simulations are given to illustrate and verify the effectiveness of the proposed scheme.
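For orientation, the textbook single-system form of a Takagi-Sugeno fuzzy model with a parallel distributed compensation (PDC) controller is written out below. It is not the paper's distributed observer and controller design; the symbols (r rules, premise variable z, membership weights h_i, local matrices A_i and B_i, local gains K_j) are the usual notation and are assumptions here, since the abstract defines none of them.

```latex
% T-S fuzzy model: a convex blend of r local linear models
\dot{x}(t) = \sum_{i=1}^{r} h_i\bigl(z(t)\bigr)\,\bigl(A_i x(t) + B_i u(t)\bigr),
\qquad h_i(z) \ge 0, \quad \sum_{i=1}^{r} h_i(z) = 1

% PDC control law: each local gain K_j shares the weights of the model rules
u(t) = -\sum_{j=1}^{r} h_j\bigl(z(t)\bigr)\, K_j\, x(t)

% Resulting closed loop
\dot{x}(t) = \sum_{i=1}^{r} \sum_{j=1}^{r}
  h_i\bigl(z(t)\bigr)\, h_j\bigl(z(t)\bigr)\,\bigl(A_i - B_i K_j\bigr)\, x(t)
```

The closed-loop expression follows by substituting the control law into the model, using the fact that the weights h_j sum to one.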