Funding: National Science Foundation of China (No. 60173031)
Abstract: This paper presents the idea of replacing traditionally expensive parallel machines with a heterogeneous cluster of workstations. To emphasise the usability of the cluster-of-workstations platform for parallel and distributed computing, the paper also gives a status report on the effort and experience of implementing dynamic load balancing for a parallel depth-first search (DFS) tree computation on a cluster of workstations. It compares the speedup obtained on this platform with that obtained on a traditional parallel machine. The speedup results show that a cluster of workstations can be a serious alternative to expensive parallel machines.
Funding: Natural Science Foundation of China (No. 60173031)
Abstract: The real problem in a cluster of workstations is that changes in workstation power, in the number of workstations, or in the run-time behavior of the application hamper the efficient use of resources. Dynamic load balancing is a technique for the parallel implementation of problems that generate unpredictable workloads: it migrates work units from heavily loaded processors to lightly loaded processors at run time. This paper proposes an efficient load balancing method for parallel depth-first search (DFS) tree computations, which generate unpredictable, highly imbalanced workloads and move through different phases detectable at run time; a dynamic load balancing strategy is applied in each phase. The method runs under MPI (Message Passing Interface) and the Unix operating system on a cluster-of-workstations parallel computing platform.
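The migration of work units from heavily loaded to lightly loaded processors can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions, not the paper's MPI implementation: the tree shape (`children`), the round-based scheduling, and the rule that an idle worker takes half of the busiest worker's stack are all hypothetical choices made for illustration.

```python
from collections import deque


def children(node, depth_limit=6):
    """Hypothetical irregular tree; a node is a (depth, label) pair."""
    depth, label = node
    if depth >= depth_limit:
        return []
    # The fan-out depends on the label, so subtrees differ widely in size.
    fanout = (label % 3) + (1 if depth < 2 else 0)
    return [(depth + 1, label * 3 + i + 1) for i in range(fanout)]


def balanced_dfs(root, num_workers=4):
    """Simulate a parallel DFS in rounds, with work migration to idle workers.

    Returns the number of nodes each worker expanded and the number of
    migrations that took place.
    """
    stacks = [deque() for _ in range(num_workers)]
    stacks[0].append(root)  # all work starts on worker 0
    visited = [0] * num_workers
    migrations = 0

    while any(stacks):
        for w in range(num_workers):
            if not stacks[w]:
                # Idle worker: pick the most heavily loaded worker as donor.
                donor = max(range(num_workers), key=lambda i: len(stacks[i]))
                if len(stacks[donor]) < 2:
                    continue  # nothing worth migrating in this round
                # Take half of the donor's work from the bottom of its stack
                # (the oldest entries, which sit closest to the root).
                for _ in range(len(stacks[donor]) // 2):
                    stacks[w].append(stacks[donor].popleft())
                migrations += 1
            # Depth-first step: expand the most recently pushed node.
            node = stacks[w].pop()
            visited[w] += 1
            stacks[w].extend(children(node))
    return visited, migrations


if __name__ == "__main__":
    per_worker, moved = balanced_dfs((0, 1))
    print("nodes expanded per worker:", per_worker)
    print("migrations:", moved)
```

In a real cluster the stacks would live in separate processes, and the "take half" step would become a request/reply message exchange between an idle process and a busy one.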
Funding: This work is supported by the National Natural Science Foundation of China (1067108), the Scientific and Technological Project of Hubei Province (2006AA412C27), and the Science Foundation of Three Gorges University (604401).
Abstract: E-mail, WWW, FTP, BT, QQlive, and similar services are used ever more widely thanks to the advantages of the Internet, but the data-omitting phenomenon is a troublesome problem. In this paper, we consider the problem of allocating a large number of independent, unequal-sized loads exchanged between servers and clients, or among themselves, when data-omitting occurs. We describe the dynamic load balancing problem by introducing parameters αij, and we use an undirected graph to model the platform, where servers (CPU time, disk memory) can have different speeds of computation and communication. Because the number of loads is large, we focus on determining the optimal dynamic load balancing scheduling strategy (a splittable strategy) for each processor, that is, the fraction of time spent computing and the fraction of time spent communicating with each neighbor. We show that the optimal dynamic load balancing state can be found with a linear programming approach by adding more constraints, and thus in polynomial time, and that it minimizes the execution time.
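Abstract: To address the long running time, high memory usage, and unbalanced node load of the parallel frequent itemset mining algorithm MRPrePost (parallel PrePost algorithm based on MapReduce) in big data environments, a parallel frequent itemset mining algorithm based on DiffNodeset (parallel frequent itemsets mining using DiffNodeset, PFIMD) is proposed. The algorithm first adopts the DiffNodeset data structure, which effectively avoids the problem of excessively large N-list cardinality. It then introduces a 2-way comparison strategy (T-wcs) to reduce invalid computation when two DiffNodesets are joined, which greatly reduces the time complexity of the algorithm. Finally, considering the effect of cluster load on the efficiency of the parallel algorithm, it proposes a load balancing strategy based on dynamic grouping (LBSBDG). This strategy evenly groups the items of the frequent 1-itemset list (F-list), which reduces the size of the PPC-Tree on each compute node and, in turn, the time needed for the pre-order and post-order traversal of the PPC-Tree. Experimental results show that the algorithm performs well for frequent itemset mining in big data environments.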
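The idea of spreading F-list items evenly across groups can be sketched as follows. This is not LBSBDG itself, whose grouping rule the abstract does not spell out. It is a generic greedy heuristic (largest load first, into the currently lightest group), and the per-item load estimates in `item_loads` are hypothetical numbers chosen for illustration.

```python
import heapq


def group_flist(item_loads, num_groups):
    """Greedily assign F-list items to groups so that group loads stay even.

    item_loads maps each frequent item to an assumed load estimate.
    Returns (groups, totals): the items in each group and each group's load.
    """
    # Min-heap of (current total load, group index).
    heap = [(0, g) for g in range(num_groups)]
    heapq.heapify(heap)
    groups = [[] for _ in range(num_groups)]

    # Place heavy items first; ties are broken by item name for determinism.
    for item, load in sorted(item_loads.items(), key=lambda kv: (-kv[1], kv[0])):
        total, g = heapq.heappop(heap)  # lightest group so far
        groups[g].append(item)
        heapq.heappush(heap, (total + load, g))

    totals = [0] * num_groups
    for total, g in heap:
        totals[g] = total
    return groups, totals


if __name__ == "__main__":
    loads = {"a": 9, "b": 7, "c": 6, "d": 5, "e": 4, "f": 2}
    groups, totals = group_flist(loads, 3)
    print(groups)  # [['a', 'f'], ['b', 'e'], ['c', 'd']]
    print(totals)  # [11, 11, 11]
```

Each group would then be handed to one compute node, so a smaller and more even group load corresponds to a smaller per-node tree to build and traverse.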
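Abstract: To address the excessive redundant network parameters, poor parameter optimization ability, and low parallel efficiency of the DCNN (deep convolutional neural network) algorithm in big data environments, a deep convolutional neural network algorithm based on feature graph and parallel computing entropy using MapReduce (MR-FPDCNN) is proposed. The algorithm designs a feature map pruning strategy based on Taylor loss (FMPTL) to pre-train the network and obtain a compressed DCNN, which effectively reduces redundant parameters and lowers the computational cost of DCNN training. It proposes an improved firefly algorithm based on an information sharing strategy (IFAS, built on the information sharing search strategy ISS) and uses IFAS to initialize the DCNN parameters, which enables parallel training of the DCNN and improves the optimization ability of the network. In the Reduce phase, it proposes a dynamic load balancing strategy based on parallel computing entropy (DLBPCE) to obtain the global training result; this strategy groups data quickly and evenly and thereby improves the parallel efficiency of the cluster. Experimental results show that the algorithm not only reduces the computational cost of DCNN training in big data environments but also improves the parallel performance of the parallel system.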