Optimization Techniques for GPU-Based Parallel Programming Models in High-Performance Computing
1
Authors: Shuntao Tang, Wei Chen. 《信息工程期刊(中英文版)》, 2024, Issue 1, pp. 7-11 (5 pages)
This study embarks on a comprehensive examination of optimization techniques within GPU-based parallel programming models, pivotal for advancing high-performance computing (HPC). Emphasizing the transition of GPUs from graphic-centric processors to versatile computing units, it delves into the nuanced optimization of memory access, thread management, algorithmic design, and data structures. These optimizations are critical for exploiting the parallel processing capabilities of GPUs, addressing both the theoretical frameworks and practical implementations. By integrating advanced strategies such as memory coalescing, dynamic scheduling, and parallel algorithmic transformations, this research aims to significantly elevate computational efficiency and throughput. The findings underscore the potential of optimized GPU programming to revolutionize computational tasks across various domains, highlighting a pathway towards achieving unparalleled processing power and efficiency in HPC environments. The paper not only contributes to the academic discourse on GPU optimization but also provides actionable insights for developers, fostering advancements in computational sciences and technology.
Keywords: Optimization Techniques; GPU-Based Parallel Programming Models; High-Performance Computing
Parallel Image Processing: Taking Grayscale Conversion Using OpenMP as an Example
2
Authors: Bayan AlHumaidan, Shahad Alghofaily, Maitha Al Qhahtani, Sara Oudah, Naya Nagy. 《Journal of Computer and Communications》, 2024, Issue 2, pp. 1-10 (10 pages)
In recent years, the widespread adoption of parallel computing, especially in multi-core processors and high-performance computing environments, ushered in a new era of efficiency and speed. This trend was particularly noteworthy in the field of image processing, which witnessed significant advancements. This parallel computing project explored the field of parallel image processing, with a focus on the grayscale conversion of colorful images. Our approach involved integrating OpenMP into our framework for parallelization to execute a critical image processing task: grayscale conversion. By using OpenMP, we strategically enhanced the overall performance of the conversion process by distributing the workload across multiple threads. The primary objectives of our project revolved around optimizing computation time and improving overall efficiency, particularly in the task of grayscale conversion of colorful images. Utilizing OpenMP for concurrent processing across multiple cores significantly reduced execution times through the effective distribution of tasks among these cores. The speedup values for various image sizes highlighted the efficacy of parallel processing, especially for large images. However, a detailed examination revealed a potential decline in parallelization efficiency with an increasing number of cores. This underscored the importance of a carefully optimized parallelization strategy, considering factors like load balancing and minimizing communication overhead. Despite challenges, the overall scalability and efficiency achieved with parallel image processing underscored OpenMP's effectiveness in accelerating image manipulation tasks.
Keywords: Parallel Computing; Image Processing; OpenMP; Parallel Programming; High Performance Computing; GPU (Graphic Processing Unit)
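The idea of distributing a grayscale conversion across workers, as described in the abstract above, can be illustrated with a minimal sketch. It is written in Python with a thread pool rather than the authors' OpenMP code, and the BT.601 luma weights, the function names, and the row-per-task split are assumptions for illustration, not details taken from the paper.

```python
from concurrent.futures import ThreadPoolExecutor

# Assumed luma weights (ITU-R BT.601); the abstract does not state the formula used.
WEIGHTS = (0.299, 0.587, 0.114)

def gray_row(row):
    """Convert one row of (R, G, B) tuples to a row of integer gray levels."""
    return [round(WEIGHTS[0] * r + WEIGHTS[1] * g + WEIGHTS[2] * b) for r, g, b in row]

def to_grayscale(image, workers=4):
    """Convert an image (a list of rows) by handing rows to a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(gray_row, image))  # map returns rows in their original order

# Tiny 2x2 test image: red, green, blue, white.
image = [[(255, 0, 0), (0, 255, 0)], [(0, 0, 255), (255, 255, 255)]]
gray = to_grayscale(image)
```

The sketch only shows the decomposition: each row is independent, so no synchronization is needed beyond collecting the results. In standard CPython the global interpreter lock prevents pure-Python arithmetic from running on several cores at once, so an actual speedup would need compiled code such as the OpenMP loop the abstract describes.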
Grid Service Framework: Supporting Multi-Models Parallel Grid Programming
3
Authors: 邓倩妮, 陆鑫达. 《Journal of Shanghai Jiaotong University (Science)》 EI, 2004, Issue 1, pp. 56-59 (4 pages)
Web service is a grid computing technology that promises greater ease-of-use and interoperability than previous distributed computing technologies. This paper proposed Grid Service Framework, a grid computing platform based on Microsoft .NET that uses web services to: (1) locate and harness volunteer computing resources for different applications, (2) support multiple parallel programming paradigms in the Grid environment, such as Master/Slave, Divide and Conquer, Phase Parallel and so forth, and (3) allocate data and balance load dynamically and transparently for grid computing applications. The Grid Service Framework based on Microsoft .NET was used to implement several simple parallel computing applications. The results show that the proposed Grid Service Framework is suitable for generic parallel numerical computing.
Keywords: web service; volunteer computing; grid computing; parallel programming paradigm
The parallel 3D magnetotelluric forward modeling algorithm (Cited by: 28)
4
Authors: Tan Handong, Tong Tuo, Lin Changhong. 《Applied Geophysics》 SCIE CSCD, 2006, Issue 4, pp. 197-202 (6 pages)
The workload of the 3D magnetotelluric forward modeling algorithm is so large that the traditional serial algorithm costs an extremely large compute time. However, the 3D forward modeling algorithm can process the data in the frequency domain, which is very suitable for parallel computation. With the advantage of MPI and based on an analysis of the flow of the 3D magnetotelluric serial forward algorithm, we suggest the idea of parallel computation and apply it. Three theoretical models are tested and the execution efficiency is compared in different situations. The results indicate that the parallel 3D forward modeling computation is correct and the efficiency is greatly improved. This method is suitable for large size geophysical computations.
Keywords: Magnetotelluric; 3D forward modeling; MPI; parallel programming design; 3D staggered-grid finite difference method; parallel algorithm
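The abstract above notes that frequency-domain processing suits parallel computation but does not say how the work is divided. One plausible reading, sketched here under the assumption that each frequency is an independent task, is a round-robin assignment of frequencies to processes. The function names and the placeholder solver are hypothetical, and plain Python lists stand in for MPI ranks.

```python
def partition_frequencies(frequencies, n_procs):
    """Assign frequencies round-robin: rank r gets items r, r + n_procs, r + 2 * n_procs, ..."""
    return [frequencies[rank::n_procs] for rank in range(n_procs)]

def forward_model(freq):
    """Hypothetical stand-in for one per-frequency forward solve."""
    return freq * 2.0

freqs = [0.01, 0.1, 1.0, 10.0, 100.0]
parts = partition_frequencies(freqs, 2)

# Each "rank" would solve only its own share; here the shares are processed in turn.
results = {f: forward_model(f) for part in parts for f in part}
```

Round-robin assignment keeps the shares within one task of each other in size, which matters when the per-frequency cost is roughly uniform; a real MPI code would run each share on its own rank and gather the results at the end.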
An Approach to Parallelization of SIFT Algorithm on GPUs for Real-Time Applications (Cited by: 4)
5
Authors: Raghu Raj Prasanna Kumar, Suresh Muknahallipatna, John McInroy. 《Journal of Computer and Communications》, 2016, Issue 17, pp. 18-50 (33 pages)
Scale Invariant Feature Transform (SIFT) algorithm is a widely used computer vision algorithm that detects and extracts local feature descriptors from images. SIFT is computationally intensive, making it infeasible for single threaded implementation to extract local feature descriptors for high-resolution images in real time. In this paper, an approach to parallelization of the SIFT algorithm is demonstrated using NVIDIA's Graphics Processing Unit (GPU). The parallelization design for SIFT on GPUs is divided into two stages: a) algorithm design, generic design strategies which focus on data, and b) implementation design, architecture specific design strategies which focus on optimally using GPU resources for maximum occupancy. Increasing memory latency hiding, eliminating branches and data blocking achieve a significant decrease in average computational time. Furthermore, it is observed via Paraver tools that our approach to parallelization while optimizing for maximum occupancy allows GPU to execute memory bound SIFT algorithm at optimal levels.
Keywords: Scale Invariant Feature Transform (SIFT); Parallel Computing; GPU; GPU Occupancy; Portable Parallel Programming; CUDA
Asynchronous Nested Optimization Algorithms and Their Parallel Implementation
6
Authors: Hans W. Moritsch, G. Ch. Pflug, M. Siomak (Department of Statistics and Decision Support Systems, University of Vienna, Universitaetsstrasse 5, A-1090 Vienna, Austria). 《Wuhan University Journal of Natural Sciences》 CAS, 2001, Issue Z1, pp. 560-567 (8 pages)
Large scale optimization problems can only be solved in an efficient way if their special structure is taken as the basis of algorithm design. In this paper we consider a very broad class of large-scale problems with special structure, namely tree structured problems. We show how the exploitation of the structure leads to efficient decomposition algorithms and how it may be implemented in a parallel environment.
Keywords: financial management; stochastic optimization; tree structured problems; parallel programming; Java
Improvements in the score matrix calculation method using parallel score estimating algorithm
7
Authors: Geraldo F. D. Zafalon, Evandro A. Marucci, Julio C. Momente, Jose R. A. Amazonas, Liria M. Sato, Jose M. Machado. 《Journal of Biophysical Chemistry》, 2013, Issue 2, pp. 47-51 (5 pages)
The increasing amount of sequences stored in genomic databases has made sequential analysis unfeasible. Parallel computing has therefore brought its power to bioinformatics through parallel algorithms to align and analyze the sequences, providing improvements mainly in the running time of these algorithms. In many situations, the parallel strategy contributes to reducing the computational complexity of the big problems. This work shows some results obtained by an implementation of a parallel score estimating technique for the score matrix calculation stage, which is the first stage of a progressive multiple sequence alignment. The performance and quality of the parallel score estimating are compared with the results of a dynamic programming approach also implemented in parallel. This comparison shows a significant reduction of running time. Moreover, the quality of the final alignment, using the new strategy, is analyzed and compared with the quality of the approach with dynamic programming.
Keywords: Algorithms; Scoring Matrix; Parallel Programming; Alignment Quality
Parallel Dispatch: A New Paradigm of Electrical Power System Dispatch (Cited by: 5)
8
Authors: Jun Jason Zhang, Fei-Yue Wang, Qiang Wang, Dazhi Hao, Xiaojing Yang, David Wenzhong Gao, Xiangyang Zhao, Yingchen Zhang. 《IEEE/CAA Journal of Automatica Sinica》 SCIE EI CSCD, 2018, Issue 1, pp. 311-319 (9 pages)
Modern power systems are evolving into sociotechnical systems with massive complexity, whose real-time operation and dispatch go beyond human capability. Thus, the need for developing and applying new intelligent power system dispatch tools is of great practical significance. In this paper, we introduce the overall business model of power system dispatch, the top level design approach of an intelligent dispatch system, and the parallel intelligent technology with its dispatch applications. We expect that a new dispatch paradigm, namely the parallel dispatch, can be established by incorporating various intelligent technologies, especially the parallel intelligent technology, to enable secure operation of complex power grids, extend system operators' capabilities, suggest optimal dispatch strategies, and provide decision-making recommendations according to power system operational goals.
Keywords: ACP; knowledge automation; power dispatch; parallel dynamic programming; parallel intelligence; parallel learning; situational awareness
Rapid Optimization of Tension Distribution for Cable-Driven Parallel Manipulators with Redundant Cables (Cited by: 8)
9
Authors: OUYANG Bo, SHANG Weiwei. 《Chinese Journal of Mechanical Engineering》 SCIE EI CAS CSCD, 2016, Issue 2, pp. 231-238 (8 pages)
The solution of tension distributions is infinite for cable-driven parallel manipulators (CDPMs) with redundant cables. A rapid optimization method for determining the optimal tension distribution is presented. The new optimization method is primarily based on the geometry properties of a polyhedron and convex analysis. The computational efficiency of the optimization method is improved by the designed projection algorithm, and a fast algorithm is proposed to determine which two of the lines are intersected at the optimal point. Moreover, a method for avoiding the operating point on the lower tension limit is developed. Simulation experiments are implemented on a six degree-of-freedom (6-DOF) CDPM with eight cables, and the results indicate that the new method is one order of magnitude faster than the standard simplex method. The optimal tension distribution is thus rapidly established in real time by the proposed method.
Keywords: cable-driven parallel manipulator; tension distribution; redundant cable; linear programming
Parallel Implementations of Modeling Dynamical Systems by Using System of Ordinary Differential Equations
10
Authors: Cao Hong-qing, Kang Li-shan, Yu Jing-xian (State Key Laboratory of Software Engineering, Wuhan University, Wuhan 430072, Hubei, China; College of Chemistry and Molecular Sciences, Wuhan University, Wuhan 430072, Hubei, China). 《Wuhan University Journal of Natural Sciences》 CAS, 2003, Issue S1, pp. 229-233 (5 pages)
First, an asynchronous distributed parallel evolutionary modeling algorithm (PEMA) for building the model of system of ordinary differential equations for dynamical systems is proposed in this paper. Then a series of parallel experiments have been conducted to systematically test the influence of some important parallel control parameters on the performance of the algorithm. A lot of experimental results are obtained and we make some analysis and explanations to them.
Keywords: parallel genetic programming; evolutionary modeling; system of ordinary differential equations
A Trace-state Based Approach to Specification and Design of Parallel Programs
11
Authors: He Jifeng (Oxford University Computing Laboratory, Programming Research Group, Parks Road, Oxford OX1 3QD, England). 《计算机工程》 CAS CSCD PKU Core, 1996, Issue S1, pp. 91-105 (15 pages)
In this paper they deal with the issue of specification and design of parallel communicating processes. A trace-state based model is introduced to describe the behaviour of concurrent programs. They present a formal system based on that model to achieve hierarchical and modular development and verification methods. A number of refinement rules are used to decompose the specification into smaller ones and calculate program from the ...
Keywords: COMM; A Trace-state Based Approach to Specification and Design of Parallel Programs
An Imbalanced Dataset and Class Overlapping Classification Model for Big Data (Cited by: 1)
12
Authors: Mini Prince, P. M. Joe Prathap. 《Computer Systems Science & Engineering》 SCIE EI, 2023, Issue 2, pp. 1009-1024 (16 pages)
Most modern technologies, such as social media, smart cities, and the internet of things (IoT), rely on big data. When big data is used in real-world applications, two data challenges arise: class overlap and class imbalance. When dealing with large datasets, most traditional classifiers are stuck in the local optimum problem. As a result, it's necessary to look into new methods for dealing with large data collections. Several solutions have been proposed for overcoming this issue. The rapid growth of the available data threatens to limit the usefulness of many traditional methods. Methods such as oversampling and undersampling have shown great promise in addressing the issues of class imbalance. Among all of these techniques, Synthetic Minority Oversampling TechniquE (SMOTE) has produced the best results by generating synthetic samples for the minority class in creating a balanced dataset. The issue is that their practical applicability is restricted to problems involving tens of thousands or fewer instances of each. In this paper, we have proposed a parallel mode method using SMOTE and a MapReduce strategy, which distributes the operation of the algorithm among a group of computational nodes for addressing the aforementioned problem. Our proposed solution has been divided into three stages. The first stage involves the process of splitting the data into different blocks using a mapping function, followed by a pre-processing step for each mapping block that employs a hybrid SMOTE algorithm for solving the class imbalanced problem. On each map block, a decision tree model would be constructed. Finally, the decision tree blocks would be combined for creating a classification model. We have used numerous datasets with up to 4 million instances in our experiments for testing the proposed scheme's capabilities. As a result, the Hybrid SMOTE appears to have good scalability within the framework proposed, and it also cuts down the processing time.
Keywords: Imbalanced dataset; class overlapping; SMOTE; MapReduce; parallel programming; oversampling
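The first two stages described in the abstract above (splitting the data into blocks, then oversampling the minority class inside each block) can be sketched as follows. This is plain SMOTE with a single nearest neighbour, not the paper's hybrid SMOTE variant, and the decision-tree stage is omitted. The function names, the block split, and the toy data are all assumptions made for illustration.

```python
import random

def smote_block(minority, n_new, rng):
    """Create n_new synthetic points, each on the segment between a sample and its nearest neighbour."""
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        # Nearest neighbour by squared Euclidean distance (k = 1 for simplicity).
        neighbour = min(others, key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic

def split_blocks(data, n_blocks):
    """Map step: split the data into n_blocks blocks of nearly equal size."""
    return [data[i::n_blocks] for i in range(n_blocks)]

rng = random.Random(0)
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5), (0.2, 0.8)]
blocks = split_blocks(minority, 2)

# Each block is oversampled independently, so the blocks could run on separate nodes.
oversampled = [p for block in blocks for p in smote_block(block, 3, rng)]
```

Because every synthetic point is an interpolation between two samples of the same block, no block needs data from another one, which is what makes the per-block step easy to distribute. Each block must hold at least two minority samples for the neighbour search to work.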
An Approach to Parallel Simulation of Ordinary Differential Equations
13
Authors: Joshua D. Carl, Gautam Biswas. 《Journal of Software Engineering and Applications》, 2016, Issue 5, pp. 250-290 (41 pages)
Cyber-physical systems (CPS) represent a class of complex engineered systems where functionality and behavior emerge through the interaction between the computational and physical domains. Simulation provides design engineers with quick and accurate feedback on the behaviors generated by their designs. However, as systems become more complex, simulating their behaviors becomes computationally complex. But most modern simulation environments still execute on a single thread, which does not take advantage of the processing power available on modern multi-core CPUs. This paper investigates methods to partition and simulate differential equation-based models of cyber-physical systems using multiple threads on multi-core CPUs that can share data across threads. We describe model partitioning methods using fixed step and variable step numerical integration methods that consider the multi-layer cache structure of these CPUs to avoid simulation performance degradation due to cache conflicts. We study the effectiveness of each parallel simulation algorithm by calculating the relative speedup compared to a serial simulation applied to a series of large electric circuit models. We also develop a series of guidelines for maximizing performance when developing parallel simulation software intended for use on multi-core CPUs.
Keywords: Parallel and Multi-Thread Programming; Ordinary Differential Equations; Simulation
Programming bare-metal accelerators with heterogeneous threading models: a case study of Matrix-3000 (Cited by: 1)
14
Authors: Jianbin FANG, Peng ZHANG, Chun HUANG, Tao TANG, Kai LU, Ruibo WANG, Zheng WANG. 《Frontiers of Information Technology & Electronic Engineering》 SCIE EI CSCD, 2023, Issue 4, pp. 509-520 (12 pages)
As the hardware industry moves toward using specialized heterogeneous many-core processors to avoid the effects of the power wall, software developers are finding it hard to deal with the complexity of these systems. In this paper, we share our experience of developing a programming model and its supporting compiler and libraries for Matrix-3000, which is designed for next-generation exascale supercomputers but has a complex memory hierarchy and processor organization. To assist its software development, we have developed a software stack from scratch that includes a low-level programming interface and a high-level OpenCL compiler. Our low-level programming model offers native programming support for using the bare-metal accelerators of Matrix-3000, while the high-level model allows programmers to use the OpenCL programming standard. We detail our design choices and highlight the lessons learned from developing system software to enable the programming of bare-metal accelerators. Our programming models have been deployed in the production environment of an exascale prototype system.
Keywords: Heterogeneous computing; Parallel programming models; Programmability; Compilers; Runtime systems
Applying Non-Local Means Filter on Seismic Exploration
15
Authors: Mustafa Youldash, Saleh Al-Dossary, Lama AlDaej, Farah AlOtaibi, Asma AlDubaikil, Noora AlBinali, Maha AlGhamdi. 《Computer Systems Science & Engineering》 SCIE EI, 2022, Issue 2, pp. 619-628 (10 pages)
The seismic reflection method is one of the most important methods in geophysical exploration. There are three stages in a seismic exploration survey: acquisition, processing, and interpretation. This paper focuses on a pre-processing tool, the Non-Local Means (NLM) filter algorithm, which is a powerful technique that can significantly suppress noise in seismic data. However, the domain of the NLM algorithm is the whole dataset, and because 3D seismic data are very large, often exceeding one terabyte (TB), it is impossible to store all the data in Random Access Memory (RAM). Furthermore, the NLM filter would require a considerably long runtime. These factors make a straightforward implementation of the NLM algorithm on real geophysical exploration data infeasible. This paper redesigned and implemented the NLM filter algorithm to fit the challenges of seismic exploration. The optimized implementation of the NLM filter is capable of processing production-size seismic data on modern clusters and is 87 times faster than the straightforward implementation of NLM.
Keywords: Seismic exploration; parallel programming; seismic processing; optimizing methods
Analysis of Simulation Biases of the Pacific North Equatorial Countercurrent in the LICOM3.0 Ocean Model (Cited by: 1)
16
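Authors: 孙志阔, 刘海龙, 林鹏飞, 于子棚, 李逸文. 《大气科学》 CSCD PKU Core, 2020, Issue 3, pp. 591-600 (10 pages)
In this paper, the CORE-IAF (Coordinated Ocean-ice Reference Experiments–Interannual Forcing) external forcing fields are used to force two ocean models, LICOM3 (LASG/IAP Climate System Ocean Model Version 3) and POP2 (Parallel Ocean Program version 2), and the simulations of the Pacific North Equatorial Countercurrent (NECC) in the two models are analyzed. We find that the NECC simulated by both LICOM3 and POP2 is weaker than observed. This agrees with the results of Sun et al. (2019) and further shows that the weak NECC in ocean models is caused by the CORE-IAF forcing fields: the sea surface wind stress and the corresponding wind stress curl are the most important factors for an ocean model to simulate the NECC accurately. We also analyze how the NECC simulations differ in their dynamical mechanisms, where the dynamical forcing terms comprise the wind stress term, the advection term, and the residual term. We find that, although the external forcing fields are the same, the effects of the individual dynamical forcing terms (wind stress, advection, and residual) on the NECC simulation are not identical in the two models.
Keywords: Pacific North Equatorial Countercurrent; LICOM3 (LASG/IAP Climate System Ocean Model Version 3) ocean model; POP2 (Parallel Ocean Program version 2) ocean model; CORE-IAF forcing (Coordinated Ocean-ice Reference Experiments–Interannual Forcing)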
CAREFUL NUMERICAL SIMULATION AND ANALYSIS OF MIGRATION-ACCUMULATION OF TANHAI REGION
17
Authors: 袁益让, 杜宁, 韩玉笈. 《Applied Mathematics and Mechanics (English Edition)》 SCIE EI, 2005, Issue 6, pp. 741-752 (12 pages)
Numerical simulation of careful parallel arithmetic of oil resources migration-accumulation of Tanhai Region (three-layer) was done. Careful parallel operator splitting-up implicit iterative scheme, parallel arithmetic program, parallel arithmetic information and alternating-direction mesh subdivision were put forward. Parallel arithmetic and analysis of different CPU combinations were done. This numerical simulation test and the actual conditions are basically coincident. The convergence estimation of the model problem has successfully solved the difficult problem in the fields of permeation fluid mechanics, computational mathematics and petroleum geology.
Keywords: migration-accumulation; Tanhai region; careful numerical simulation; parallel arithmetic program; numerical analysis
Performance of Text-Independent Automatic Speaker Recognition on a Multicore System
18
Authors: Rand Kouatly, Talha Ali Khan. 《Tsinghua Science and Technology》 SCIE EI CAS CSCD, 2024, Issue 2, pp. 447-456 (10 pages)
This paper studies a high-speed text-independent Automatic Speaker Recognition (ASR) algorithm based on a multicore system's Gaussian Mixture Model (GMM). The high speed is achieved using parallel implementation of the feature extraction and aggregation methods during training and testing procedures. Shared memory parallel programming techniques using both OpenMP and PThreads libraries are developed to accelerate the code and improve the performance of the ASR algorithm. The experimental results show speed-up improvements of around 3.2 on a personal laptop with Intel i5-6300HQ (2.3 GHz, four cores without hyper-threading, and 8 GB of RAM). In addition, a remarkable 100% speaker recognition accuracy is achieved.
Keywords: Automatic Speaker Recognition (ASR); Gaussian Mixture Model (GMM); shared memory parallel programming; PThreads; OpenMP
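To put the reported figures in context, the speed-up of about 3.2 on four cores can be turned into a parallel efficiency and, if one assumes Amdahl's law applies, an implied serial fraction. The short sketch below does that arithmetic. The Amdahl model is an assumption added here, not an analysis reported in the abstract, and the function names are illustrative.

```python
def parallel_efficiency(speedup, cores):
    """Efficiency = speedup divided by the number of cores."""
    return speedup / cores

def amdahl_serial_fraction(speedup, cores):
    """Serial fraction f implied by Amdahl's law, S = 1 / (f + (1 - f) / p), solved for f."""
    return (cores / speedup - 1) / (cores - 1)

efficiency = parallel_efficiency(3.2, 4)          # 3.2 / 4 = 0.8
serial_fraction = amdahl_serial_fraction(3.2, 4)  # (1.25 - 1) / 3, roughly 0.083
```

Under these assumptions the four cores are used at about 80% efficiency, and roughly 8% of the single-core run time would be non-parallelizable work. Real overheads such as thread creation and memory contention are lumped into that figure.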
Study on Parallel Computing (Cited by: 6)
19
Authors: 陈国良, 孙广中, 张云泉, 莫则尧. 《Journal of Computer Science & Technology》 SCIE EI CSCD, 2006, Issue 5, pp. 665-673 (9 pages)
In this paper, we present a general survey on parallel computing. The main contents include parallel computer system which is the hardware platform of parallel computing, parallel algorithm which is the theoretical base of parallel computing, parallel programming which is the software support of parallel computing. After that, we also introduce some parallel applications and enabling technologies. We argue that parallel computing research should form an integrated methodology of "architecture algorithm programming application". Only in this way can parallel computing research develop continuously and become more realistic.
Keywords: parallel computing; parallel architecture; parallel programming; parallel algorithm; parallel application
Approach of generating parallel programs from parallelized algorithm design strategies (Cited by: 4)
20
Authors: WAN Jian-yi, LI Xiao-ying. 《The Journal of China Universities of Posts and Telecommunications》 EI CSCD, 2008, Issue 3, pp. 128-132 (5 pages)
Today, parallel programming is dominated by message passing libraries, such as message passing interface (MPI). This article intends to simplify parallel programming by generating parallel programs from parallelized algorithm design strategies. It uses skeletons to abstract parallelized algorithm design strategies, as well as parallel architectures. Starting from problem specification, an abstract parallel abstract programming language+ (Apla+) program is generated from parallelized algorithm design strategies and problem-specific function definitions. By combining with parallel architectures, implicity of parallelism inside the parallelized algorithm design strategies is exploited. With implementation and transformation, C++ and parallel virtual machine (CPPVM) parallel program is finally generated. Parallelized branch and bound (B&B) algorithm design strategy and parallelized divide and conquer (D&C) algorithm design strategy are studied in this article as examples. And it also illustrates the approach with a case study.
Keywords: parallel programming; skeletons; algorithm design strategy; parallel architecture
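The notion of a skeleton that captures a design strategy, with problem-specific functions supplied separately, can be illustrated by a small divide-and-conquer example. The sketch below is sequential Python, not the Apla+ or C++/PVM code the article generates, and the function and parameter names are assumptions chosen for illustration.

```python
def divide_and_conquer(problem, is_trivial, solve, divide, combine):
    """Generic divide-and-conquer skeleton; the four function arguments are problem-specific."""
    if is_trivial(problem):
        return solve(problem)
    # The subproblems are independent, so a parallel skeleton could evaluate them concurrently.
    subresults = [divide_and_conquer(p, is_trivial, solve, divide, combine)
                  for p in divide(problem)]
    return combine(subresults)

def merge(parts):
    """Combine step for merge sort: merge two sorted lists into one."""
    left, right = parts
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    return out + left[i:] + right[j:]

def merge_sort(xs):
    """Merge sort expressed as an instance of the skeleton."""
    return divide_and_conquer(
        xs,
        is_trivial=lambda p: len(p) <= 1,
        solve=lambda p: list(p),
        divide=lambda p: [p[: len(p) // 2], p[len(p) // 2:]],
        combine=merge,
    )

sorted_values = merge_sort([5, 2, 9, 1, 5, 6])
```

The point of the design is that the recursion pattern lives in one place: swapping the list comprehension for a concurrent map would parallelize every algorithm written against the skeleton without touching the problem-specific functions.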