期刊文献+
共找到1,173篇文章
< 1 2 59 >
每页显示 20 50 100
Optimization Techniques for GPU-Based Parallel Programming Models in High-Performance Computing
1
作者 Shuntao Tang Wei Chen 《信息工程期刊(中英文版)》 2024年第1期7-11,共5页
This study embarks on a comprehensive examination of optimization techniques within GPU-based parallel programming models,pivotal for advancing high-performance computing(HPC).Emphasizing the transition of GPUs from g... This study embarks on a comprehensive examination of optimization techniques within GPU-based parallel programming models,pivotal for advancing high-performance computing(HPC).Emphasizing the transition of GPUs from graphic-centric processors to versatile computing units,it delves into the nuanced optimization of memory access,thread management,algorithmic design,and data structures.These optimizations are critical for exploiting the parallel processing capabilities of GPUs,addressingboth the theoretical frameworks and practical implementations.By integrating advanced strategies such as memory coalescing,dynamic scheduling,and parallel algorithmic transformations,this research aims to significantly elevate computational efficiency and throughput.The findings underscore the potential of optimized GPU programming to revolutionize computational tasks across various domains,highlighting a pathway towards achieving unparalleled processing power and efficiency in HPC environments.The paper not only contributes to the academic discourse on GPU optimization but also provides actionable insights for developers,fostering advancements in computational sciences and technology. 展开更多
关键词 Optimization Techniques GPU-Based parallel programming Models High-Performance Computing
下载PDF
Parallel Control for Optimal Tracking via Adaptive Dynamic Programming 被引量:23
2
作者 Jingwei Lu Qinglai Wei Fei-Yue Wang 《IEEE/CAA Journal of Automatica Sinica》 SCIE EI CSCD 2020年第6期1662-1674,共13页
This paper studies the problem of optimal parallel tracking control for continuous-time general nonlinear systems.Unlike existing optimal state feedback control,the control input of the optimal parallel control is int... This paper studies the problem of optimal parallel tracking control for continuous-time general nonlinear systems.Unlike existing optimal state feedback control,the control input of the optimal parallel control is introduced into the feedback system.However,due to the introduction of control input into the feedback system,the optimal state feedback control methods can not be applied directly.To address this problem,an augmented system and an augmented performance index function are proposed firstly.Thus,the general nonlinear system is transformed into an affine nonlinear system.The difference between the optimal parallel control and the optimal state feedback control is analyzed theoretically.It is proven that the optimal parallel control with the augmented performance index function can be seen as the suboptimal state feedback control with the traditional performance index function.Moreover,an adaptive dynamic programming(ADP)technique is utilized to implement the optimal parallel tracking control using a critic neural network(NN)to approximate the value function online.The stability analysis of the closed-loop system is performed using the Lyapunov theory,and the tracking error and NN weights errors are uniformly ultimately bounded(UUB).Also,the optimal parallel controller guarantees the continuity of the control input under the circumstance that there are finite jump discontinuities in the reference signals.Finally,the effectiveness of the developed optimal parallel control method is verified in two cases. 展开更多
关键词 Adaptive dynamic programming(ADP) nonlinear optimal control parallel controller parallel control theory parallel system tracking control neural network(NN)
下载PDF
Scheduling Step-Deteriorating Jobs on Parallel Machines by Mixed Integer Programming 被引量:4
3
作者 郭鹏 程文明 +1 位作者 曾鸣 梁剑 《Journal of Donghua University(English Edition)》 EI CAS 2015年第5期709-714,719,共7页
Production scheduling has a major impact on the productivity of the manufacturing process. Recently, scheduling problems with deteriorating jobs have attracted increasing attentions from researchers. In many practical... Production scheduling has a major impact on the productivity of the manufacturing process. Recently, scheduling problems with deteriorating jobs have attracted increasing attentions from researchers. In many practical situations,it is found that some jobs fail to be processed prior to the pre-specified thresholds,and they often consume extra deteriorating time for successful accomplishment. Their processing times can be characterized by a step-wise function. Such kinds of jobs are called step-deteriorating jobs. In this paper,parallel machine scheduling problem with stepdeteriorating jobs( PMSD) is considered. Due to its intractability,four different mixed integer programming( MIP) models are formulated for solving the problem under consideration. The study aims to investigate the performance of these models and find promising optimization formulation to solve the largest possible problem instances. The proposed four models are solved by commercial software CPLEX. Moreover,the near-optimal solutions can be obtained by black-box local-search solver LocalS olver with the fourth one. The computational results show that the efficiencies of different MIP models depend on the distribution intervals of deteriorating thresholds, and the performance of LocalS olver is clearly better than that of CPLEX in terms of the quality of the solutions and the computational time. 展开更多
关键词 parallel machine step-deterioration mixed integer programming(MIP) scheduling models total completion time
下载PDF
Programming for scientific computing on peta-scale heterogeneous parallel systems 被引量:1
4
作者 杨灿群 吴强 +2 位作者 唐滔 王锋 薛京灵 《Journal of Central South University》 SCIE EI CAS 2013年第5期1189-1203,共15页
Peta-scale high-perfomlance computing systems are increasingly built with heterogeneous CPU and GPU nodes to achieve higher power efficiency and computation throughput. While providing unprecedented capabilities to co... Peta-scale high-perfomlance computing systems are increasingly built with heterogeneous CPU and GPU nodes to achieve higher power efficiency and computation throughput. While providing unprecedented capabilities to conduct computational experiments of historic significance, these systems are presently difficult to program. The users, who are domain experts rather than computer experts, prefer to use programming models closer to their domains (e.g., physics and biology) rather than MPI and OpenME This has led the development of domain-specific programming that provides domain-specific programming interfaces but abstracts away some performance-critical architecture details. Based on experience in designing large-scale computing systems, a hybrid programming framework for scientific computing on heterogeneous architectures is proposed in this work. Its design philosophy is to provide a collaborative mechanism for domain experts and computer experts so that both domain-specific knowledge and performance-critical architecture details can be adequately exploited. Two real-world scientific applications have been evaluated on TH-IA, a peta-scale CPU-GPU heterogeneous system that is currently the 5th fastest supercomputer in the world. The experimental results show that the proposed framework is well suited for developing large-scale scientific computing applications on peta-scale heterogeneous CPU/GPU systems. 展开更多
关键词 heterogeneous parallel system programming framework scientific computing GPU computing molecular dynamic
下载PDF
PDP: Parallel Dynamic Programming 被引量:15
5
作者 Fei-Yue Wang Jie Zhang +2 位作者 Qinglai Wei Xinhu Zheng Li Li 《IEEE/CAA Journal of Automatica Sinica》 SCIE EI CSCD 2017年第1期1-5,共5页
Deep reinforcement learning is a focus research area in artificial intelligence. The principle of optimality in dynamic programming is a key to the success of reinforcement learning methods. The principle of adaptive ... Deep reinforcement learning is a focus research area in artificial intelligence. The principle of optimality in dynamic programming is a key to the success of reinforcement learning methods. The principle of adaptive dynamic programming ADP is first presented instead of direct dynamic programming DP , and the inherent relationship between ADP and deep reinforcement learning is developed. Next, analytics intelligence, as the necessary requirement, for the real reinforcement learning, is discussed. Finally, the principle of the parallel dynamic programming, which integrates dynamic programming and analytics intelligence, is presented as the future computational intelligence. © 2014 Chinese Association of Automation. 展开更多
关键词 Artificial intelligence Neural networks Reinforcement learning
下载PDF
PARALLEL MULTIPLICATIVE ITERATIVE METHODS FOR CONVEX PROGRAMMING
6
作者 陈忠 费浦生 《Acta Mathematica Scientia》 SCIE CSCD 1997年第2期205-210,共6页
In this paper, we present two parallel multiplicative algorithms for convex programming. If the objective function has compact level sets and has a locally Lipschitz continuous gradient, we discuss convergence of the ... In this paper, we present two parallel multiplicative algorithms for convex programming. If the objective function has compact level sets and has a locally Lipschitz continuous gradient, we discuss convergence of the algorithms. The proofs are essentially based on the results of sequential methods shown by Eggermontt[1]. 展开更多
关键词 parallel algorithm convex programming
下载PDF
Grid Service Framework: Supporting Multi-Models Parallel Grid Programming
7
作者 邓倩妮 陆鑫达 《Journal of Shanghai Jiaotong university(Science)》 EI 2004年第1期56-59,共4页
Web service is a grid computing technology that promises greater ease-of-use and interoperability than previous distributed computing technologies. This paper proposed Group Service Framework, a grid computing platfor... Web service is a grid computing technology that promises greater ease-of-use and interoperability than previous distributed computing technologies. This paper proposed Group Service Framework, a grid computing platform based on Microsoft. NET that use web service to: (1) locate and harness volunteer computing resources for different applications, and (2) support multi-models such as Master/Slave, Divide and Conquer, Phase Parallel and so forth parallel programming paradigms in Grid environment, (3) allocate data and balance load dynamically and transparently for grid computing application. The Grid Service Framework based on Microsoft. NET was used to implement several simple parallel computing applications. The results show that the proposed Group Service Framework is suitable for generic parallel numerical computing. 展开更多
关键词 web service volunteer computing grid computing parallel programming garadigm
下载PDF
Stochastic Programming Model for Discrete Lotsizing and Scheduling Problem on Parallel Machines
8
作者 Kensuke Ishiwata Jun Imaizumi +1 位作者 Takayuki Shiina Susumu Morito 《American Journal of Operations Research》 2012年第3期374-381,共8页
In recent years, it has been difficult for manufactures and suppliers to forecast demand from a market for a given product precisely. Therefore, it has become important for them to cope with fluctuations in demand. Fr... In recent years, it has been difficult for manufactures and suppliers to forecast demand from a market for a given product precisely. Therefore, it has become important for them to cope with fluctuations in demand. From this viewpoint, the problem of planning or scheduling in production systems can be regarded as a mathematical problem with stochastic elements. However, in many previous studies, such problems are formulated without stochastic factors, treating stochastic elements as deterministic variables or parameters. Stochastic programming incorporates such factors into the mathematical formulation. In the present paper, we consider a multi-product, discrete, lotsizing and scheduling problem on parallel machines with stochastic demands. Under certain assumptions, this problem can be formulated as a stochastic integer programming problem. We attempt to solve this problem by a scenario aggregation method proposed by Rockafellar and Wets. The results from computational experiments suggest that our approach is able to solve large-scale problems, and that, under the condition of uncertainty, incorporating stochastic elements into the model gives better results than formulating the problem as a deterministic model. 展开更多
关键词 STOCHASTIC programming Lotsizing and Scheduling parallel MACHINES SCENARIO AGGREGATION Method
下载PDF
Optimal Redundancy Allocation in Hierarchical Series-Parallel Systems Using Mixed Integer Programming
9
作者 Mohsen Ziaee 《Applied Mathematics》 2013年第1期79-83,共5页
Reliability optimization plays an important role in design, operation and management of the industrial systems. System reliability can be easily enhanced by improving the reliability of unreliable components and/or by... Reliability optimization plays an important role in design, operation and management of the industrial systems. System reliability can be easily enhanced by improving the reliability of unreliable components and/or by using redundant configuration with subsystems/components in parallel. Redundancy Allocation Problem (RAP) was studied in this research. A mixed integer programming model was proposed to solve the problem, which considers simultaneously two objectives under several resource constraints. The model is only for the hierarchical series-parallel systems in which the elements of any subset of subsystems or components are connected in series or parallel and constitute a larger subsystem or total system. At the end of the study, the performance of the proposed approach was evaluated by a numerical example. 展开更多
关键词 HIERARCHICAL SERIES-parallel System Optimal REDUNDANCY ALLOCATION Mixed INTEGER programming FORMULATION Reliability Optimization
下载PDF
A Neuron-Oriented Programming System 被引量:3
10
作者 李涛 《High Technology Letters》 EI CAS 2001年第1期70-73,共4页
A neruon-oriented programming system based on parallel neural information processing has been presented. With the neural programming system built upon 4~8 process elements(TMS C30), the system has thus provided users... A neruon-oriented programming system based on parallel neural information processing has been presented. With the neural programming system built upon 4~8 process elements(TMS C30), the system has thus provided users high speed, general purpose and large scale neural network application development platforms etc. 展开更多
关键词 Neural networks parallel processing programming system
下载PDF
Programming System PARCS
11
作者 A. V. Anisimov A. V. Derevianchenko +1 位作者 P. P. Kuliabko O. M. Fedorus 《Journal of Computer and Communications》 2017年第9期129-139,共11页
PARCS (Parallel Asynchronous Recursive Control System) programming tools that allow unified add-on parallel extensions over traditional programming languages are described. The PARCS model is based on the conception o... PARCS (Parallel Asynchronous Recursive Control System) programming tools that allow unified add-on parallel extensions over traditional programming languages are described. The PARCS model is based on the conception of a control space, which is used to describe parallel interacting processes. Structurally, the control space consists of addressable “points” and “channels”. Executing modules are assigned to points and communicate through channels connecting points. Recursive embeddings of processes are allowed. The effective implementation of PARCS on cloud platforms Microsoft AZURE and Amazon EC2 is also presented. 展开更多
关键词 parallel COMPUTING programming LANGUAGE Control SPACE CLOUD COMPUTING
下载PDF
A Trace-state Based Approach to Specification and Design of Parallel Programs
12
作者 He Jifeng Oxford University Computing LaboratoryProgramming Research Group Parks Road, Oxford OXl 3QD, England 《计算机工程》 CAS CSCD 北大核心 1996年第S1期91-105,共15页
In this paper they deal with the issue of specification and design of parallel communicatingprocesses. A trace-state based model is introduced to describe the behaviour of concurrent programs. They presenta formal sys... In this paper they deal with the issue of specification and design of parallel communicatingprocesses. A trace-state based model is introduced to describe the behaviour of concurrent programs. They presenta formal system based on that model to achieve hierarchical and modular development and verification methods. Anumber of refinement rules are used to decompose the specification into smaller ones and calculate program fromthe 展开更多
关键词 COMM A Trace-state Based Approach to Specification and Design of parallel programs
下载PDF
Parallel Image Processing: Taking Grayscale Conversion Using OpenMP as an Example
13
作者 Bayan AlHumaidan Shahad Alghofaily +2 位作者 Maitha Al Qhahtani Sara Oudah Naya Nagy 《Journal of Computer and Communications》 2024年第2期1-10,共10页
In recent years, the widespread adoption of parallel computing, especially in multi-core processors and high-performance computing environments, ushered in a new era of efficiency and speed. This trend was particularl... In recent years, the widespread adoption of parallel computing, especially in multi-core processors and high-performance computing environments, ushered in a new era of efficiency and speed. This trend was particularly noteworthy in the field of image processing, which witnessed significant advancements. This parallel computing project explored the field of parallel image processing, with a focus on the grayscale conversion of colorful images. Our approach involved integrating OpenMP into our framework for parallelization to execute a critical image processing task: grayscale conversion. By using OpenMP, we strategically enhanced the overall performance of the conversion process by distributing the workload across multiple threads. The primary objectives of our project revolved around optimizing computation time and improving overall efficiency, particularly in the task of grayscale conversion of colorful images. Utilizing OpenMP for concurrent processing across multiple cores significantly reduced execution times through the effective distribution of tasks among these cores. The speedup values for various image sizes highlighted the efficacy of parallel processing, especially for large images. However, a detailed examination revealed a potential decline in parallelization efficiency with an increasing number of cores. This underscored the importance of a carefully optimized parallelization strategy, considering factors like load balancing and minimizing communication overhead. Despite challenges, the overall scalability and efficiency achieved with parallel image processing underscored OpenMP’s effectiveness in accelerating image manipulation tasks. 展开更多
关键词 parallel Computing Image Processing OPENMP parallel programming High Performance Computing GPU (Graphic Processing Unit)
下载PDF
面向国产异构众核系统的Parallel C语言设计与实现 被引量:10
14
作者 何王全 刘勇 +2 位作者 方燕飞 魏迪 漆锋滨 《软件学报》 EI CSCD 北大核心 2017年第4期764-785,共22页
异构众核架构具有超高的性能功耗比,已成为超级计算机体系结构的重要发展方向.但众核系统更为复杂的并行层次和存储层次,给编程和优化带来了极大的挑战.因此,研究面向众核系统的并行编程技术,对于降低国产众核系统并行应用的编程难度、... 异构众核架构具有超高的性能功耗比,已成为超级计算机体系结构的重要发展方向.但众核系统更为复杂的并行层次和存储层次,给编程和优化带来了极大的挑战.因此,研究面向众核系统的并行编程技术,对于降低国产众核系统并行应用的编程难度、提升并行程序的性能都具有重要的意义.提出统一架构的多模式并行编程模型,包括异构融合的加速运算模型和按同构方式编程的自主运算模型,根据编程模型设计了Parallel C语言,能够有效地描述国产众核系统的异构并行性.与其他众核系统上MPI+X的使用模式相比,编程和系统优化都具有全局视角,在多级局部性描述、单边消息、兼容已有多核应用等方面具有特色;基于Open64构建了Parallel C编译系统,全面支持加速运算模型和自主运算模型,提出并实现了数据布局与自动DMA、编译指导的线程代理和拓扑位置感知的集合通信等优化.Micro Benchmark和实际应用在神威太湖之光计算机系统上的测试数据结果表明:Parallel C语言和编译系统具有良好的性能和可扩展性,能够有效支撑大型应用. 展开更多
关键词 异构众核 编程模型 并行语言 parallel C 编译器 消息传递
下载PDF
The parallel 3D magnetotelluric forward modeling algorithm 被引量:28
15
作者 Tan Handong Tong Tuo Lin Changhong 《Applied Geophysics》 SCIE CSCD 2006年第4期197-202,共6页
The workload of the 3D magnetotelluric forward modeling algorithm is so large that the traditional serial algorithm costs an extremely large compute time. However, the 3D forward modeling algorithm can process the dat... The workload of the 3D magnetotelluric forward modeling algorithm is so large that the traditional serial algorithm costs an extremely large compute time. However, the 3D forward modeling algorithm can process the data in the frequency domain, which is very suitable for parallel computation. With the advantage of MPI and based on an analysis of the flow of the 3D magnetotelluric serial forward algorithm, we suggest the idea of parallel computation and apply it. Three theoretical models are tested and the execution efficiency is compared in different situations. The results indicate that the parallel 3D forward modeling computation is correct and the efficiency is greatly improved. This method is suitable for large size geophysical computations. 展开更多
关键词 Magnetotelluric 3D forward modeling MPI parallel programming design 3D staggered-grid finite difference method parallel algorithm.
下载PDF
Parallel Dispatch:A New Paradigm of Electrical Power System Dispatch 被引量:5
16
作者 Jun Jason Zhang Fei-Yue Wang +5 位作者 Qiang Wang Dazhi Hao Xiaojing Yang David Wenzhong Gao Xiangyang Zhao Yingchen Zhang 《IEEE/CAA Journal of Automatica Sinica》 SCIE EI CSCD 2018年第1期311-319,共9页
Modern power systems are evolving into sociotechnical systems with massive complexity, whose real-time operation and dispatch go beyond human capability. Thus,the need for developing and applying new intelligent power... Modern power systems are evolving into sociotechnical systems with massive complexity, whose real-time operation and dispatch go beyond human capability. Thus,the need for developing and applying new intelligent power system dispatch tools are of great practical significance. In this paper, we introduce the overall business model of power system dispatch, the top level design approach of an intelligent dispatch system, and the parallel intelligent technology with its dispatch applications. We expect that a new dispatch paradigm,namely the parallel dispatch, can be established by incorporating various intelligent technologies, especially the parallel intelligent technology, to enable secure operation of complex power grids,extend system operators' capabilities, suggest optimal dispatch strategies, and to provide decision-making recommendations according to power system operational goals. 展开更多
关键词 ACP knowledge automation power dispatch parallel dynamic programming parallel intelligence paralle learning situational awareness
下载PDF
Rapid Optimization of Tension Distribution for Cable-Driven Parallel Manipulators with Redundant Cables 被引量:8
17
作者 OUYANG Bo SHANG Weiwei 《Chinese Journal of Mechanical Engineering》 SCIE EI CAS CSCD 2016年第2期231-238,共8页
The solution of tension distributions is infinite for cable-driven parallel manipulators(CDPMs) with redundant cables. A rapid optimization method for determining the optimal tension distribution is presented. The n... The solution of tension distributions is infinite for cable-driven parallel manipulators(CDPMs) with redundant cables. A rapid optimization method for determining the optimal tension distribution is presented. The new optimization method is primarily based on the geometry properties of a polyhedron and convex analysis. The computational efficiency of the optimization method is improved by the designed projection algorithm, and a fast algorithm is proposed to determine which two of the lines are intersected at the optimal point. Moreover, a method for avoiding the operating point on the lower tension limit is developed. Simulation experiments are implemented on a six degree-of-freedom(6-DOF) CDPM with eight cables, and the results indicate that the new method is one order of magnitude faster than the standard simplex method. The optimal distribution of tension distribution is thus rapidly established on real-time by the proposed method. 展开更多
关键词 cable-driven parallel manipulator tension distribution redundant cable linear programming
下载PDF
Scheduling and Subcontracting under Parallel Machines 被引量:2
18
作者 陈荣军 唐国春 《Chinese Quarterly Journal of Mathematics》 CSCD 2012年第4期590-597,共8页
In this paper,we study a model on joint decisions of scheduling and subcontracting, in which jobs(orders) can be either processed by parallel machines at the manufacturer in-house or subcontracted to a subcontractor.T... In this paper,we study a model on joint decisions of scheduling and subcontracting, in which jobs(orders) can be either processed by parallel machines at the manufacturer in-house or subcontracted to a subcontractor.The manufacturer needs to determine which jobs should be produced in-house and which jobs should be subcontracted.Furthermore,it needs to determine a production schedule for jobs to be produced in-house.We discuss five classical scheduling objectives as production costs.For each problem with different objective functions,we give optimality conditions and propose dynamic programming algorithms. 展开更多
关键词 SCHEDULING SUBCONTRACTING dynamic programming parallel machines
下载PDF
An Approach to Parallelization of SIFT Algorithm on GPUs for Real-Time Applications 被引量:4
19
作者 Raghu Raj Prasanna Kumar Suresh Muknahallipatna John McInroy 《Journal of Computer and Communications》 2016年第17期18-50,共33页
Scale Invariant Feature Transform (SIFT) algorithm is a widely used computer vision algorithm that detects and extracts local feature descriptors from images. SIFT is computationally intensive, making it infeasible fo... Scale Invariant Feature Transform (SIFT) algorithm is a widely used computer vision algorithm that detects and extracts local feature descriptors from images. SIFT is computationally intensive, making it infeasible for single threaded im-plementation to extract local feature descriptors for high-resolution images in real time. In this paper, an approach to parallelization of the SIFT algorithm is demonstrated using NVIDIA’s Graphics Processing Unit (GPU). The parallel-ization design for SIFT on GPUs is divided into two stages, a) Algorithm de-sign-generic design strategies which focuses on data and b) Implementation de-sign-architecture specific design strategies which focuses on optimally using GPU resources for maximum occupancy. Increasing memory latency hiding, eliminating branches and data blocking achieve a significant decrease in aver-age computational time. Furthermore, it is observed via Paraver tools that our approach to parallelization while optimizing for maximum occupancy allows GPU to execute memory bound SIFT algorithm at optimal levels. 展开更多
关键词 Scale Invariant Feature Transform (SIFT) parallel Computing GPU GPU Occupancy Portable parallel programming CUDA
下载PDF
Parallelization of a Branch and Bound Algorithm on Multicore Systems 被引量:1
20
作者 Chia-Shin Chung James Flynn Janche Sang 《Journal of Software Engineering and Applications》 2012年第8期621-629,共9页
The general m-machine permutation flowshop problem with the total flow-time objective is known to be NP-hard for m ≥ 2. The only practical method for finding optimal solutions has been branch-and-bound algorithms. In... The general m-machine permutation flowshop problem with the total flow-time objective is known to be NP-hard for m ≥ 2. The only practical method for finding optimal solutions has been branch-and-bound algorithms. In this paper, we present an improved sequential algorithm which is based on a strict alternation of Generation and Exploration execution modes as well as Depth-First/Best-First hybrid strategies. The experimental results show that the proposed scheme exhibits improved performance compared with the algorithm in [1]. More importantly, our method can be easily extended and implemented with lightweight threads to speed up the execution times. Good speedups can be obtained on shared-memory multicore systems. 展开更多
关键词 parallel Branch and BOUND Multithreaded programming MULTICORE System PERMUTATION FLOWSHOP Software REUSE
下载PDF
上一页 1 2 59 下一页 到第
使用帮助 返回顶部