Journal Articles
20 articles found
Leaching from Heterogeneous Heck Catalysts: A Computational Approach
1
Authors: Peter M. Jenkins and Shik Chi Tsang, Surface Science and Catalysis Research Centre, Department of Chemistry, University of Reading, Whiteknights, Reading, RG6 6AD, UK. 《Chemical Research in Chinese Universities》 SCIE CAS CSCD, 2002, No. 2, pp. 175-177 (3 pages)
The possibility of carrying out a purely heterogeneous Heck reaction in practice without Pd leaching has been considered by a number of research groups, but no general consensus has yet been reached. Here, the reaction was, for the first time, evaluated by a simple computational approach. Modelling experiments were performed on one of the initial catalytic steps: phenyl halide attachment at the (111)/(100) and (111)/(111) ridges of a Pd crystal. Three resulting surface structures were identified as possible reactive intermediates. Following potential energy minimisation calculations based on a universal force field, the relative stabilities of these surface species were then determined. Results showed the most stable species to be one in which a Pd ridge atom is removed from the Pd crystal structure, suggesting that Pd leaching induced by phenyl halides is energetically favourable.
Keywords: heterogeneous Heck reaction; aryl halides; computational modelling; leaching of palladium
Federated Feature Concatenate Method for Heterogeneous Computing in Federated Learning
2
Authors: Wu-Chun Chung, Yung-Chin Chang, Ching-Hsien Hsu, Chih-Hung Chang, Che-Lun Hung. 《Computers, Materials & Continua》 SCIE EI, 2023, No. 4, pp. 351-371 (21 pages)
Federated learning is an emerging machine learning technique that enables clients to collaboratively train a deep learning model without uploading raw data to the aggregation server. Each client may be equipped with different computing resources for model training. A client with lower computing capability requires more time for model training, resulting in a prolonged training time in federated learning. Moreover, it may fail to train the entire model because of an out-of-memory issue. This study aims to tackle these problems and proposes the federated feature concatenate (FedFC) method for federated learning with heterogeneous clients. FedFC leverages model splitting and feature concatenation to offload a portion of the training load from clients to the aggregation server. Each client in FedFC can collaboratively train a model with a different cutting layer. Therefore, the specific features learned in the deeper layers of the server-side model are more identical for data class classification. Accordingly, FedFC can reduce the computation load for resource-constrained clients and accelerate the convergence time. The performance effectiveness is verified by considering different dataset scenarios, such as data and class imbalance among the participating clients in the experiments. The performance impacts of different cutting layers are evaluated during model training. The experimental results show that the co-adapted features have a critical impact on the adequate classification of the deep learning model. Overall, FedFC not only shortens the convergence time, but also improves the best accuracy by up to 5.9% and 14.5% when compared to conventional federated learning and splitfed, respectively. In conclusion, the proposed approach is feasible and effective for heterogeneous clients in federated learning.
Keywords: federated learning; deep learning; artificial intelligence; heterogeneous computing
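The cutting-layer idea behind FedFC — each client runs only the layers below its cut and ships the intermediate features to the server, which completes the forward pass — can be sketched as follows. This is a minimal NumPy illustration under assumed layer shapes, not the authors' FedFC implementation; all names and dimensions are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

# A tiny 4-layer MLP whose weights are shared by all participants.
layer_dims = [8, 16, 16, 4]
weights = [rng.standard_normal((m, n)) * 0.1
           for m, n in zip(layer_dims[:-1], layer_dims[1:])]

def forward(x, layers):
    """Run x through a list of weight matrices with ReLU activations."""
    for w in layers:
        x = relu(x @ w)
    return x

def client_forward(x, cut):
    """Client-side pass: only the first `cut` layers run on the device."""
    return forward(x, weights[:cut])

def server_forward(features, cut):
    """Server-side pass: the remaining layers run on the aggregation server."""
    return forward(features, weights[cut:])

# A weak client cuts early (1 layer locally); a strong client cuts late.
x = rng.standard_normal((2, 8))
for cut in (1, 3):
    feats = client_forward(x, cut)      # sent to the server
    out = server_forward(feats, cut)    # server completes the pass
    # Splitting at any layer reproduces the full forward pass exactly.
    assert np.allclose(out, forward(x, weights))
```

Because the network is purely sequential, splitting at any layer is exact; the resource trade-off comes from how much of the work each `cut` leaves on the client.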
FPGA Accelerators for Computing Interatomic Potential-Based Molecular Dynamics Simulation for Gold Nanoparticles: Exploring Different Communication Protocols
3
Authors: Ankitkumar Patel, Srivathsan Vasudevan, Satya Bulusu. 《Computers, Materials & Continua》 SCIE EI, 2024, No. 9, pp. 3803-3818 (16 pages)
Molecular Dynamics (MD) simulation for computing the Interatomic Potential (IAP) is a very important High-Performance Computing (HPC) application. MD simulation on particles of experimental relevance takes huge computation time, despite using an expensive high-end server. Heterogeneous computing, a combination of a Field Programmable Gate Array (FPGA) and a computer, is proposed as a solution to compute MD simulation efficiently. In such heterogeneous computation, communication between the FPGA and the computer is necessary. One such MD simulation, explained in the paper, is the Artificial Neural Network (ANN)-based IAP computation of gold (Au_(147) & Au_(309)) nanoparticles. MD simulation calculates the forces between atoms and the total energy of the chemical system. This work proposes a novel design and implementation of an ANN IAP-based MD simulation for Au_(147) & Au_(309) using communication protocols, such as the Universal Asynchronous Receiver-Transmitter (UART) and Ethernet, for communication between the FPGA and the host computer. To improve the latency of MD simulation through heterogeneous computing, the UART and Ethernet protocols were explored to conduct MD simulations of 50,000 cycles. In this study, computation times of 17.54 and 18.70 h were achieved with UART and Ethernet, respectively, compared to the conventional server time of 29 h for Au_(147) nanoparticles. The results pave the way for the development of a Lab-on-a-chip application.
Keywords: Ethernet; hardware accelerator; heterogeneous computing; interatomic potential (IAP); MD simulation; peripheral component interconnect express (PCIe); UART
Time Predictable Modeling Method for GPU Architecture with SIMT and Cache Miss Awareness
4
Authors: Shaojie Zhang. 《Journal of Electronic Research and Application》, 2024, No. 2, pp. 109-115 (7 pages)
Graphics Processing Units (GPUs) are used to accelerate computing-intensive tasks, such as neural networks, data analysis, and high-performance computing. In the past decade or so, researchers have done a great deal of work on GPU architecture and proposed a variety of theories and methods to study the microarchitectural characteristics of various GPUs. In this study, the GPU serves as a co-processor and works together with the CPU in an embedded real-time system to handle computationally intensive tasks. Building on prior work, the study models the GPU architecture and provides a more detailed analysis based on the SIMT mechanism and cache-miss behaviour. To verify the proposed GPU architecture model, experiments were performed with 10 GPU kernel tasks on an NVIDIA GPU device. The experimental results showed that the error between the kernel task execution time predicted by the proposed model and the actual measured execution time ranged from a minimum of 3.80% to a maximum of 8.30%.
Keywords: heterogeneous computing; GPU; architecture modeling; time predictability
Joint Resource Allocation Using Evolutionary Algorithms in Heterogeneous Mobile Cloud Computing Networks (Cited 10 times)
5
Authors: Weiwei Xia, Lianfeng Shen. 《China Communications》 SCIE CSCD, 2018, No. 8, pp. 189-204 (16 pages)
The problem of joint radio and cloud resource allocation is studied for heterogeneous mobile cloud computing networks. The objective of the proposed joint resource allocation schemes is to maximize the total utility of users as well as satisfy the required quality of service (QoS), such as the end-to-end response latency experienced by each user. We formulate the problem of joint resource allocation as a combinatorial optimization problem. Three evolutionary approaches are considered to solve it: the genetic algorithm (GA), ant colony optimization with genetic algorithm (ACO-GA), and the quantum genetic algorithm (QGA). To decrease the time complexity, we propose a mapping process between the resource allocation matrix and the chromosome of GA, ACO-GA, and QGA, search the available radio and cloud resource pairs based on the resource availability matrixes for ACO-GA, and encode the difference value between the allocated resources and the minimum resource requirement for QGA. Extensive simulation results show that our proposed methods greatly outperform the existing algorithms in terms of running time, the accuracy of final results, the total utility, resource utilization, and end-to-end response latency guarantees.
Keywords: heterogeneous mobile cloud computing networks; resource allocation; genetic algorithm; ant colony optimization; quantum genetic algorithm
Fast weighting method for plasma PIC simulation on GPU-accelerated heterogeneous systems (Cited 2 times)
6
Authors: 杨灿群, 吴强, 胡慧俐, 石志才, 陈娟, 唐滔. 《Journal of Central South University》 SCIE EI CAS, 2013, No. 6, pp. 1527-1535 (9 pages)
The particle-in-cell (PIC) method has benefited greatly from GPU-accelerated heterogeneous systems. However, the performance of PIC is constrained by the interpolation operations in the weighting process on the GPU (graphics processing unit). To address this problem, a fast weighting method for PIC simulation on GPU-accelerated systems was proposed to avoid atomic memory operations during the weighting process. The method was implemented by taking advantage of the GPU's thread synchronization mechanism and dividing the problem space properly. Moreover, software-managed shared memory on the GPU was employed to buffer the intermediate data. The experimental results show that the method achieves speedups of up to 3.5 times compared to previous works, and runs 20.08 times faster on one NVIDIA Tesla M2090 GPU than on a single core of an Intel Xeon X5670 CPU.
Keywords: GPU computing; heterogeneous computing; plasma physics simulations; particle-in-cell (PIC)
THUBrachy: A Fast Monte Carlo Dose Calculation Tool Accelerated by Heterogeneous Hardware for High-Dose-Rate Brachytherapy (Cited 1 time)
7
Authors: An-Kang Hu, Rui Qiu, Huan Liu, Zhen Wu, Chun-Yan Li, Hui Zhang, Jun-Li Li, Rui-Jie Yang. 《Nuclear Science and Techniques》 SCIE EI CAS CSCD, 2021, No. 3, pp. 107-119 (13 pages)
The Monte Carlo (MC) simulation is regarded as the gold standard for dose calculation in brachytherapy, but it consumes a large amount of computing resources. The development of heterogeneous computing makes it possible to substantially accelerate calculations with hardware accelerators. Accordingly, this study develops a fast MC tool, called THUBrachy, which can be accelerated by several types of hardware accelerators. THUBrachy can simulate photons with energies below 3 MeV and considers all photon interactions in that energy range. It was benchmarked against the American Association of Physicists in Medicine Task Group No. 43 Report using a water phantom and validated against Geant4 using a clinical case. A performance test conducted on the clinical case showed that a multicore central processing unit, Intel Xeon Phi, and graphics processing unit (GPU) can all efficiently accelerate the simulation. The GPU-accelerated version of THUBrachy is the fastest, being 200 times faster than the serial version and approximately 500 times faster than Geant4. The proposed tool shows great potential for fast and accurate dose calculations in clinical applications.
Keywords: high-dose-rate brachytherapy; Monte Carlo; heterogeneous computing; hardware accelerators
Resource Scheduling Strategy for Performance Optimization Based on Heterogeneous CPU-GPU Platform (Cited 1 time)
8
Authors: Juan Fang, Kuan Zhou, Mengyuan Zhang, Wei Xiang. 《Computers, Materials & Continua》 SCIE EI, 2022, No. 10, pp. 1621-1635 (15 pages)
In recent years, with the development of processor architecture, heterogeneous processors including the central processing unit (CPU) and graphics processing unit (GPU) have become mainstream. However, due to the differences between heterogeneous cores, heterogeneous systems now face many problems that need to be solved. To solve them, this paper focuses on the utilization and efficiency of heterogeneous cores and designs reasonable resource scheduling strategies. To improve system performance, this paper proposes a combination strategy for a single task and a multi-task scheduling strategy for multiple tasks. The combination strategy consists of two sub-strategies: the first improves the execution efficiency of tasks on the GPU by changing the thread organization structure; the second focuses on the working state of the efficient core and develops more reasonable workload balancing schemes to improve the resource utilization of heterogeneous systems. The multi-task scheduling strategy obtains the execution efficiency of heterogeneous cores and global task information through the processing of task samples. Based on this information, an improved ant colony algorithm is used to quickly obtain a reasonable task allocation scheme that fully utilizes the characteristics of heterogeneous cores. The experimental results show that the combination strategy reduces task execution time by 29.13% on average. When processing multiple tasks, the multi-task scheduling strategy reduces the execution time by up to a further 23.38% on top of the combination strategy. Both strategies make better use of the resources of heterogeneous systems and significantly reduce the execution time of tasks on them.
Keywords: heterogeneous computing; CPU-GPU; performance; workload balance
A new heuristic for task scheduling in heterogeneous computing environment
9
Authors: Ehsan Ullah MUNIR, Jian-zhong LI, Sheng-fei SHI, Zhao-nian ZOU, Qaisar RASOOL. 《Journal of Zhejiang University-Science A (Applied Physics & Engineering)》 SCIE EI CAS CSCD, 2008, No. 12, pp. 1715-1723 (9 pages)
A heterogeneous computing (HC) environment utilizes diverse resources with different computational capabilities to solve computing-intensive applications having diverse computational requirements and constraints. The task assignment problem in an HC environment can be formally defined as: for a given set of tasks and machines, assign tasks to machines so as to achieve the minimum makespan. In this paper we propose a new task scheduling heuristic, high standard deviation first (HSTDF), which uses the standard deviation of the expected execution time of a task as the selection criterion. The standard deviation of the expected execution time of a task represents the amount of variation in its execution time across different machines. Our conclusion is that tasks having a high standard deviation must be assigned first for scheduling. A large number of experiments were carried out to check the effectiveness of the proposed heuristic in different scenarios, and the comparison with existing heuristics (Max-min, Sufferage, Segmented Min-average, Segmented Min-min, and Segmented Max-min) clearly reveals that the proposed heuristic outperforms all of them in terms of average makespan.
Keywords: heterogeneous computing; task scheduling; greedy heuristics; high standard deviation first (HSTDF) heuristic
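The HSTDF selection rule summarized above can be sketched in a few lines: compute the standard deviation of each task's expected execution times across machines, schedule high-variation tasks first, and assign each to the machine giving the earliest completion time. The ETC matrix and the greedy machine choice below are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def hstdf_schedule(etc):
    """High Standard Deviation First: etc[i, j] is the expected
    execution time of task i on machine j. Returns the assignment
    (task -> machine) and the resulting makespan."""
    n_tasks, n_machines = etc.shape
    ready = np.zeros(n_machines)              # machine ready times
    # Tasks with the highest std dev across machines go first.
    order = np.argsort(etc.std(axis=1))[::-1]
    assignment = {}
    for task in order:
        completion = ready + etc[task]        # earliest-completion-time choice
        machine = int(np.argmin(completion))
        assignment[int(task)] = machine
        ready[machine] = completion[machine]
    return assignment, float(ready.max())

etc = np.array([[10.0, 50.0, 90.0],    # high variation -> scheduled first
                [20.0, 22.0, 21.0],    # low variation -> scheduled last
                [30.0, 60.0, 15.0]])
assignment, makespan = hstdf_schedule(etc)
# Tasks are placed in order 0, 2, 1, giving makespan 22.0 here.
```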
Efficient Data-parallel Computations on Distributed Systems
10
Authors: 曾志勇, LU Xinda. 《High Technology Letters》 EI CAS, 2002, No. 3, pp. 92-96 (5 pages)
Task scheduling determines the performance of NOW (network of workstations) computing to a large extent. However, the computer system architecture, computing capability, and system load are rarely considered together. In this paper, a biggest-heterogeneous scheduling algorithm is presented. It fully considers the system characteristics (from the application view), structure, and state, so it can always utilize all processing resources under a reasonable premise. The experimental results show that the algorithm can significantly shorten the response time of jobs.
Keywords: parallel algorithms; heterogeneous computing; message passing; load balancing
Programming bare-metal accelerators with heterogeneous threading models: a case study of Matrix-3000 (Cited 1 time)
11
Authors: Jianbin FANG, Peng ZHANG, Chun HUANG, Tao TANG, Kai LU, Ruibo WANG, Zheng WANG. 《Frontiers of Information Technology & Electronic Engineering》 SCIE EI CSCD, 2023, No. 4, pp. 509-520 (12 pages)
As the hardware industry moves toward using specialized heterogeneous many-core processors to avoid the effects of the power wall, software developers are finding it hard to deal with the complexity of these systems. In this paper, we share our experience of developing a programming model and its supporting compiler and libraries for Matrix-3000, which is designed for next-generation exascale supercomputers but has a complex memory hierarchy and processor organization. To assist its software development, we have developed a software stack from scratch that includes a low-level programming interface and a high-level OpenCL compiler. Our low-level programming model offers native programming support for using the bare-metal accelerators of Matrix-3000, while the high-level model allows programmers to use the OpenCL programming standard. We detail our design choices and highlight the lessons learned from developing system software to enable the programming of bare-metal accelerators. Our programming models have been deployed in the production environment of an exascale prototype system.
Keywords: heterogeneous computing; parallel programming models; programmability; compilers; runtime systems
Heterogeneous parallel computing accelerated iterative subpixel digital image correlation (Cited 10 times)
12
Authors: HUANG JianWen, ZHANG LingQi, JIANG ZhenYu, DONG ShouBin, CHEN Wei, LIU YiPing, LIU ZeJia, ZHOU LiCheng, TANG LiQun. 《Science China (Technological Sciences)》 SCIE EI CAS CSCD, 2018, No. 1, pp. 74-85 (12 pages)
Parallel computing techniques have been introduced into digital image correlation (DIC) in recent years and have led to a surge in computation speed. GPU (graphics processing unit)-based parallel computing has demonstrated a surprising effect on accelerating iterative subpixel DIC, compared with CPU-based parallel computing. In this paper, the performance of the two kinds of parallel computing techniques is compared for the previously proposed path-independent DIC method, in which the initial guess for the inverse compositional Gauss-Newton (IC-GN) algorithm at each point of interest (POI) is estimated through the fast Fourier transform-based cross-correlation (FFT-CC) algorithm. Based on this performance evaluation, a heterogeneous parallel computing (HPC) model with a hybrid mode of parallelism is proposed to combine the computing power of the GPU and the multicore CPU. A trial-computation scheme is developed to optimize the configuration of the HPC model on a specific computer. The proposed HPC model shows excellent performance on a middle-end desktop computer for real-time subpixel DIC at high resolution, with more than 10,000 POIs per frame.
Keywords: digital image correlation (DIC); inverse compositional Gauss-Newton (IC-GN) algorithm; heterogeneous parallel computing; graphics processing unit (GPU); multicore CPU; real-time DIC
Network simulation task partition method in heterogeneous computing environment (Cited 1 time)
13
Authors: Xiaofeng Wang, Wei Zhu, Yueming Dai. 《International Journal of Modeling, Simulation, and Scientific Computing》 EI, 2014, No. 3, pp. 136-152 (17 pages)
To reduce the running time of network simulation in a heterogeneous computing environment, a network simulation task partition method, named LBPHCE, is put forward. In this method, the network simulation task is partitioned in comprehensive consideration of the load balance of both routing computation simulation and packet forwarding simulation. First, through benchmark experiments, the computation ability and routing simulation ability of each simulation machine in the heterogeneous computing environment are measured. Second, based on the computation ability of each simulation machine, the network simulation task is initially partitioned to meet the load balance of packet forwarding simulation, and then, according to the routing computation ability, the scale of each partition is fine-tuned to satisfy the balance of routing computation simulation while preserving the load balance of packet forwarding simulation. Experiments based on PDNS indicate that, compared to the traditional uniform partition method, the LBPHCE method reduces the total simulation running time by 26.3% on average, and compared to the linear partition method, it reduces the running time by 18.3% on average.
Keywords: network simulation; distributed simulation; heterogeneous computing environments; task partition
Fast Parallel Cutoff Pair Interactions for Molecular Dynamics on Heterogeneous Systems
14
Authors: Qiang Wu, Canqun Yang, Tao Tang, Kai Lu. 《Tsinghua Science and Technology》 EI CAS, 2012, No. 3, pp. 265-277 (13 pages)
Heterogeneous systems with both Central Processing Units (CPUs) and Graphics Processing Units (GPUs) are frequently used to accelerate short-ranged Molecular Dynamics (MD) simulations. The most time-consuming task in short-ranged MD simulations is the computation of particle-to-particle interactions. Beyond a certain distance, these interactions decrease to zero. To minimize the number of distance examinations, previous works have tiled interactions by employing the spatial attribute, which increases memory accesses and GPU computations, hence decreasing performance. Other studies ignore the spatial attribute and construct an all-versus-all interaction matrix, which has poor scalability. This paper presents an improved algorithm. The algorithm first bins particles into voxels according to their spatial attributes, and then tiles the all-versus-all matrix into voxel-versus-voxel sub-matrixes. Only the sub-matrixes between neighboring voxels are computed on the GPU. The algorithm thus reduces the distance examination operations while limiting additional memory accesses and GPU computations. This paper also adopts a multi-level programming model to implement the algorithm on multiple nodes of Tianhe-1A. By employing (1) a patch design to exploit parallelism across the simulation domain, (2) a communication overlapping method to overlap the communications between CPUs and GPUs, and (3) a dynamic workload balancing method to adjust the workloads among compute nodes, the implementation achieves a speedup of 4.16x on one NVIDIA Tesla M2050 GPU compared to a 2.93 GHz six-core Intel Xeon X5670 CPU. In addition, it runs 2.41x faster on 256 compute nodes of Tianhe-1A (with two CPUs and one GPU per node) than on 256 GPU-excluded nodes.
Keywords: cutoff pair interactions; molecular dynamics; heterogeneous computing; GPU computing
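The voxel-binning step described above is independent of the GPU details and can be sketched serially: with a cell width equal to the cutoff radius, every interacting pair is guaranteed to lie in the same or an adjacent voxel, so only neighbouring voxel pairs need distance checks. A minimal sketch, assuming a non-periodic domain and illustrative names:

```python
import numpy as np
from itertools import product
from collections import defaultdict

def cutoff_pairs(pos, cutoff):
    """Return all particle pairs closer than `cutoff`, examining only
    neighbouring voxels instead of the all-versus-all matrix.
    pos: (N, 3) coordinates in a non-periodic domain."""
    voxels = defaultdict(list)
    for i, p in enumerate(pos):
        voxels[tuple((p // cutoff).astype(int))].append(i)

    pairs = set()
    for cell, members in voxels.items():
        # Visit the cell itself and its 26 neighbours.
        for offset in product((-1, 0, 1), repeat=3):
            other = tuple(c + o for c, o in zip(cell, offset))
            for i in members:
                for j in voxels.get(other, ()):
                    if i < j and np.linalg.norm(pos[i] - pos[j]) < cutoff:
                        pairs.add((i, j))
    return pairs

rng = np.random.default_rng(1)
pos = rng.uniform(0.0, 10.0, size=(200, 3))
fast = cutoff_pairs(pos, cutoff=2.0)

# Cross-check against the brute-force all-versus-all result.
brute = {(i, j) for i in range(len(pos)) for j in range(i + 1, len(pos))
         if np.linalg.norm(pos[i] - pos[j]) < 2.0}
assert fast == brute
```

The saving is that each particle is compared only against particles in 27 cells rather than all N particles, which is the same pruning the voxel-versus-voxel sub-matrixes achieve on the GPU.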
Cooperating CoScheduling: A Coscheduling Proposal Aimed at Non-Dedicated Heterogeneous NOWs
15
Authors: Francesc Giné, Francesc Solsona, Mauricio Hanzich, Porfidio Hernández, Emilio Luque. 《Journal of Computer Science & Technology》 SCIE EI CSCD, 2007, No. 5, pp. 695-710 (16 pages)
Implicit coscheduling techniques applied to non-dedicated homogeneous Networks Of Workstations (NOWs) have shown that they can perform well when many local users compete with a single parallel job. Implicit coscheduling deals with minimizing the communication waiting time of parallel processes by identifying the processes in need of coscheduling through gathering and analyzing implicit runtime information, basically communication events. Unfortunately, implicit coscheduling techniques do not guarantee the performance of local and parallel jobs when the number of parallel jobs competing against each other increases, leading to inefficient use of the idle computational resources. To solve these problems, a new technique, named Cooperating CoScheduling (CCS), is presented in this work. Unlike traditional implicit coscheduling techniques, under CCS each node takes its scheduling decisions from the occurrence of local events (basically communication, memory, input/output, and CPU events) together with foreign events received from cooperating nodes. This allows CCS to provide a social contract based on reserving a percentage of CPU and memory resources to ensure the progress of parallel jobs without disturbing the local users, while the coscheduling of communicating tasks is ensured. Besides, the CCS algorithm uses status information from the cooperating nodes to balance the resources across the cluster when necessary. Experimental results in a non-dedicated heterogeneous NOW reveal that CCS allows the idle resources to be exploited efficiently, obtaining a satisfactory speedup with an overhead that is imperceptible to the local user.
Keywords: job scheduling; non-dedicated and heterogeneous NOW computing; resource allocation
Numerical Study of Geometric Multigrid Methods on CPU–GPU Heterogeneous Computers
16
Authors: Chunsheng Feng, Shi Shu, Jinchao Xu, Chen-Song Zhang. 《Advances in Applied Mathematics and Mechanics》 SCIE, 2014, No. 1, pp. 1-23 (23 pages)
The geometric multigrid method (GMG) is one of the most efficient solution techniques for the discrete algebraic systems arising from elliptic partial differential equations. GMG utilizes a hierarchy of grids or discretizations and reduces the error at a number of frequencies simultaneously. Graphics processing units (GPUs) have recently burst onto the scientific computing scene as a technology that has yielded substantial performance and energy-efficiency improvements. A central challenge in implementing GMG on GPUs, though, is that computational work on coarse levels cannot fully utilize the capacity of a GPU. In this work, we perform numerical studies of GMG on CPU-GPU heterogeneous computers. Furthermore, we compare our implementation with an efficient CPU implementation of GMG and with the most popular fast Poisson solver, the fast Fourier transform, in the cuFFT library developed by NVIDIA.
Keywords: high-performance computing; CPU-GPU heterogeneous computers; multigrid method; fast Fourier transform; partial differential equations
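The grid hierarchy that GMG exploits can be illustrated with a minimal 1D Poisson V-cycle (damped-Jacobi smoothing, full-weighting restriction, linear prolongation). This serial sketch only shows the method's structure, not the CPU-GPU implementation studied in the paper; all parameters are illustrative.

```python
import numpy as np

def v_cycle(u, f, h, nu=3, w=2.0 / 3.0):
    """One V-cycle for -u'' = f on [0, 1] with zero Dirichlet BCs.
    Arrays u and f include the boundary points; h is the mesh width."""
    n = len(u) - 1
    if n <= 2:                          # coarsest grid: one unknown, solve exactly
        u[1] = f[1] * h * h / 2.0
        return u
    for _ in range(nu):                 # pre-smoothing: damped Jacobi
        u[1:-1] += w * 0.5 * (h * h * f[1:-1] + u[:-2] + u[2:] - 2.0 * u[1:-1])
    # Residual of the 3-point Laplacian, restricted by full weighting.
    r = np.zeros_like(u)
    r[1:-1] = f[1:-1] - (2.0 * u[1:-1] - u[:-2] - u[2:]) / (h * h)
    rc = np.zeros(n // 2 + 1)
    rc[1:-1] = 0.25 * r[1:-2:2] + 0.5 * r[2:-1:2] + 0.25 * r[3::2]
    ec = v_cycle(np.zeros_like(rc), rc, 2.0 * h, nu, w)
    u[::2] += ec                        # coarse-grid correction on even points
    u[1:-1:2] += 0.5 * (ec[:-1] + ec[1:])   # linear interpolation to odd points
    for _ in range(nu):                 # post-smoothing
        u[1:-1] += w * 0.5 * (h * h * f[1:-1] + u[:-2] + u[2:] - 2.0 * u[1:-1])
    return u

n = 64
h = 1.0 / n
x = np.linspace(0.0, 1.0, n + 1)
f = np.pi ** 2 * np.sin(np.pi * x)      # manufactured problem, solution sin(pi x)
u = np.zeros(n + 1)
for _ in range(10):
    u = v_cycle(u, f, h)
```

Each recursion level halves the grid, which is exactly why the coarse levels mentioned in the abstract carry too little work to saturate a GPU.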
Smart data deduplication for telehealth systems in heterogeneous cloud computing
17
Authors: GAI Keke, QIU Meikang, SUN Xiaotong, ZHAO Hui. 《Journal of Communications and Information Networks》, 2016, No. 4, pp. 93-104 (12 pages)
The widespread application of heterogeneous cloud computing has enabled enormous advances in the real-time performance of telehealth systems. A cloud-based telehealth system allows healthcare users to obtain medical data from various data sources supported by heterogeneous cloud providers. Employing data duplication in distributed cloud databases is an alternative approach for achieving data sharing among multiple data users. However, this approach consumes additional storage space, whereas reducing data duplication would decrease data acquisitions and real-time performance. To address this issue, this paper focuses on developing a dynamic data deduplication method that uses an intelligent blocker to determine the working mode of data duplication for each data package in heterogeneous cloud-based telehealth systems. The proposed approach is named the SD2M (Smart Data Deduplication Model), in which the main algorithm applies dynamic programming to produce optimal solutions that minimize the total cost of data usage. We implement experimental evaluations to examine the adaptability of the proposed approach.
Keywords: data deduplication; telehealth; heterogeneous cloud computing; optimal solution; dynamic programming
Resource scheduling techniques in cloud from a view of coordination: a holistic survey (Cited 1 time)
18
Authors: Yuzhao WANG, Junqing YU, Zhibin YU. 《Frontiers of Information Technology & Electronic Engineering》 SCIE EI CSCD, 2023, No. 1, pp. 1-40 (40 pages)
Nowadays, the management of resource contention in the shared cloud remains a pending problem. The evolution and deployment of new application paradigms (e.g., deep learning training and microservices) and custom hardware (e.g., the graphics processing unit (GPU) and tensor processing unit (TPU)) have posed new challenges in resource management system design. Current solutions tend to trade cluster efficiency for guaranteed application performance, e.g., through resource over-allocation, leaving many resources underutilized. Overcoming this dilemma is not easy, because different components across the software stack are involved. Nevertheless, massive efforts have been devoted to seeking effective performance isolation and highly efficient resource scheduling. The goal of this paper is to systematically cover the related aspects from a coordination perspective and to identify the corresponding trends they indicate. Briefly, four topics are involved. First, isolation mechanisms deployed at different levels (micro-architecture, system, and virtualization levels) are reviewed, including GPU multitasking methods. Second, resource scheduling techniques within an individual machine and at the cluster level are investigated, respectively. In particular, GPU scheduling for deep learning applications is described in detail. Third, adaptive resource management, including the latest microservice-related research, is thoroughly explored. Finally, future research directions are discussed in the light of advanced work. We hope that this review will help researchers establish a global view of the landscape of resource management techniques in the shared cloud and see technology trends more clearly.
Keywords: coordination; co-location; heterogeneous computing; microservice; resource scheduling techniques
Computing infrastructure for big data processing (Cited 7 times)
19
Authors: Ling LIU. 《Frontiers of Computer Science》 SCIE EI CSCD, 2013, No. 2, pp. 165-170 (6 pages)
With computing systems having undergone a fundamental transformation from single-processor devices at the turn of the century to today's ubiquitous, networked devices and warehouse-scale computing via the cloud, parallelism has become ubiquitous at many levels. At the micro level, parallelism is being explored from the underlying circuits, to pipelining and instruction-level parallelism, to multi-core or many-core chips within a machine. At the macro level, parallelism is being promoted from multiple machines on a rack, to many racks in a data center, to the globally shared infrastructure of the Internet. With the push of big data, we are entering a new era of parallel computing driven by novel and groundbreaking research innovation on elastic parallelism and scalability. In this paper, we give an overview of computing infrastructure for big data processing, focusing on the architectural, storage, and networking challenges of supporting big data processing. We briefly discuss emerging computing infrastructures and technologies that are promising for improving data parallelism and task parallelism, and for encouraging vertical and horizontal computation parallelism.
Keywords: big data; cloud computing; data analytics; elastic scalability; heterogeneous computing; GPU; PCM; big data processing
Task assignment for minimizing application completion time using honeybee mating optimization
20
Authors: Qinma KANG, Hong HE. 《Frontiers of Computer Science》 SCIE EI CSCD, 2013, No. 3, pp. 404-415 (12 pages)
Effective task assignment is essential for achieving high performance in heterogeneous distributed computing systems. This paper proposes a new technique for minimizing the parallel application completion time in task assignment, based on the honeybee mating optimization (HBMO) algorithm. The HBMO approach combines the power of simulated annealing, genetic algorithms, and an effective local search heuristic to find the best possible solution to the problem within an acceptable amount of computation time. The performance of the proposed HBMO algorithm is shown by comparing it with three existing task assignment techniques on a large number of randomly generated problem instances. Experimental results indicate that the proposed HBMO algorithm outperforms the competing algorithms.
Keywords: heterogeneous distributed computing; task assignment; task interaction graph; honeybee mating optimization; meta-heuristics