399 articles found
Multi-core optimization for conjugate gradient benchmark on heterogeneous processors
1
Authors: 邓林, 窦勇. Journal of Central South University (SCIE, EI, CAS), 2011, No. 2, pp. 490-498 (9 pages)
Developing parallel applications on heterogeneous processors faces the challenge of the "memory wall", caused by the limited capacity of local storage, limited bandwidth, and long memory-access latency. To address this problem, a parallelization approach with six memory optimization schemes for CG was proposed, four of which target all kinds of sparse matrix-vector multiplication (SPMV) operations. On an IBM QS20, the approach reaches speedups of up to 21 and 133 times for sizes A and B, respectively, compared with a single Power processor element. It is concluded that the peak memory-access bandwidth of the Cell BE can be obtained in SPMV, that simple computation is more efficient on heterogeneous processors, and that loop unrolling can hide local-storage access latency when executing scalar operations on SIMD cores.
Keywords: multi-core processor NAS parallelization CG memory optimization
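The SPMV kernel and the loop unrolling mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the common compressed sparse row (CSR) layout and an unroll factor of 4; neither choice is stated in the abstract, and the function name `spmv_csr` is hypothetical. Pure Python will not show the latency benefit, so the sketch only shows the structure of an unrolled inner loop.

```python
def spmv_csr(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a sparse matrix A stored in CSR form.

    values  : nonzero entries, row by row
    col_idx : column index of each nonzero
    row_ptr : row i occupies values[row_ptr[i]:row_ptr[i + 1]]
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        k, end = row_ptr[i], row_ptr[i + 1]
        acc = 0.0
        # Unrolled body (assumed factor 4): four independent products per
        # iteration, which on real hardware gives the scheduler room to
        # overlap loads with arithmetic.
        while k + 4 <= end:
            acc += (values[k] * x[col_idx[k]]
                    + values[k + 1] * x[col_idx[k + 1]]
                    + values[k + 2] * x[col_idx[k + 2]]
                    + values[k + 3] * x[col_idx[k + 3]])
            k += 4
        # Remainder loop for rows whose length is not a multiple of 4.
        while k < end:
            acc += values[k] * x[col_idx[k]]
            k += 1
        y[i] = acc
    return y


if __name__ == "__main__":
    # A = [[1, 0, 2], [0, 3, 0], [4, 5, 6]]
    print(spmv_csr([1, 2, 3, 4, 5, 6], [0, 2, 1, 0, 1, 2], [0, 2, 3, 6], [1, 2, 3]))
```

The remainder loop keeps the result identical to the non-unrolled version for any row length.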
System Support for Parallel Computing on Heterogeneous Networks of Workstations (Cited by: 2)
2
Author: Xiaodong Zhang (High Performance Computing and Software Laboratory, University of Texas at San Antonio, San Antonio, Texas 78249, U.S.A.). Wuhan University Journal of Natural Sciences (CAS), 1996, No. Z1, pp. 362-370 (9 pages)
In this paper, we introduce several ongoing research projects that support parallel and distributed computing on heterogeneous networks of workstations (NOW) in the High Performance Computing and Software Laboratory at the University of Texas at San Antonio. The projects aim to address three technical issues. First, heterogeneity and time-sharing effects make traditional performance models and metrics for homogeneous computing unsuitable for measuring and evaluating heterogeneous computing. We develop practical models and metrics that quantify the heterogeneity of networks and characterize its performance effects. Second, special system support is necessary to perform parallel computation effectively. We are developing system schemes for heterogeneity management, process scheduling, and efficient communication. Finally, to provide insight into system performance, we are developing two types of supporting tools: a graphical instrumentation monitor that helps users investigate performance problems and determine the most effective way of exploiting NOW systems, and a trace-driven simulator for testing and comparing different system management and scheduling schemes.
Keywords: parallel support system heterogeneous computing
Performance Enhancement of XML Parsing Using Regression and Parallelism
3
Authors: Muhammad Ali, Minhaj Ahmad Khan. Computer Systems Science & Engineering, 2024, No. 2, pp. 287-303 (17 pages)
Extensible Markup Language (XML) files, widely used for storing and exchanging information on the web, require efficient parsing mechanisms to improve application performance. With existing Document Object Model (DOM) based parsing, performance degrades because of sequential processing and large memory requirements, which calls for an efficient XML parser that mitigates these issues. In this paper, we propose a Parallel XML Tree Generator (PXTG) algorithm for accelerating the parsing of XML files and a Regression-based XML Parsing Framework (RXPF) that analyzes and predicts performance through profiling, regression, and code generation for efficient parsing. The PXTG algorithm divides the XML file into n parts and produces n trees in parallel. The profiling phase of the RXPF framework produces a dataset by measuring the performance of various parsing models, including StAX, SAX, DOM, JDOM, and PXTG, on different cores using multiple file sizes. The regression phase produces the prediction model, on the basis of which the code generation phase produces the final code for efficient parsing of XML files. The RXPF framework shows a performance improvement ranging from 9.54% to 32.34% over other existing models used for parsing XML files.
Keywords: Regression parallel parsing multi-cores XML
Programming for scientific computing on peta-scale heterogeneous parallel systems (Cited by: 1)
4
Authors: 杨灿群, 吴强, 唐滔, 王锋, 薛京灵. Journal of Central South University (SCIE, EI, CAS), 2013, No. 5, pp. 1189-1203 (15 pages)
Peta-scale high-performance computing systems are increasingly built with heterogeneous CPU and GPU nodes to achieve higher power efficiency and computation throughput. While providing unprecedented capabilities to conduct computational experiments of historic significance, these systems are presently difficult to program. The users, who are domain experts rather than computer experts, prefer programming models closer to their domains (e.g., physics and biology) to MPI and OpenMP. This has led to the development of domain-specific programming, which provides domain-specific programming interfaces but abstracts away some performance-critical architecture details. Based on experience in designing large-scale computing systems, a hybrid programming framework for scientific computing on heterogeneous architectures is proposed in this work. Its design philosophy is to provide a collaborative mechanism for domain experts and computer experts so that both domain-specific knowledge and performance-critical architecture details can be adequately exploited. Two real-world scientific applications have been evaluated on TH-1A, a peta-scale CPU-GPU heterogeneous system that is currently the 5th fastest supercomputer in the world. The experimental results show that the proposed framework is well suited for developing large-scale scientific computing applications on peta-scale heterogeneous CPU/GPU systems.
Keywords: heterogeneous parallel system programming framework scientific computing GPU computing molecular dynamic
A Survey on Task Scheduling of CPU-GPU Heterogeneous Cluster
5
Authors: ZHOU Yiheng, ZENG Wei, ZHENG Qingfang, LIU Zhilong, CHEN Jianping. ZTE Communications, 2024, No. 3, pp. 83-90 (8 pages)
This paper reviews task scheduling frameworks, methods, and evaluation metrics for central processing unit-graphics processing unit (CPU-GPU) heterogeneous clusters. Task scheduling for CPU-GPU heterogeneous clusters can be carried out at the system level, node level, and device level. Most task scheduling technologies are heuristics based on expert experience, while some are based on statistical methods using machine learning, deep learning, or reinforcement learning. Many metrics have been adopted to evaluate and compare task scheduling technologies that optimize different scheduling goals. Although statistic task scheduling has produced fewer research achievements than heuristic task scheduling, it still has significant research potential.
Keywords: CPU-GPU heterogeneous cluster task scheduling heuristic task scheduling statistic task scheduling parallelization
A New Hybrid Hierarchical Parallel Algorithm to Enhance the Performance of Large-Scale Structural Analysis Based on Heterogeneous Multicore Clusters
6
Authors: Gaoyuan Yu, Yunfeng Lou, Hang Dong, Junjie Li, Xianlong Jin. Computer Modeling in Engineering & Sciences (SCIE, EI), 2023, No. 7, pp. 135-155 (21 pages)
Heterogeneous multicore clusters are becoming more popular for high-performance computing because of their great computing power and cost-to-performance effectiveness. Nevertheless, parallel efficiency degradation is still a problem in large-scale structural analysis based on heterogeneous multicore clusters. To solve it, a hybrid hierarchical parallel algorithm (HHPA) is proposed on the basis of the conventional domain decomposition algorithm (CDDA) and the parallel sparse solver. In this new algorithm, a three-layer parallelization of the computational procedure is introduced to separate communication between nodes, between heterogeneous-core-groups (HCGs), and inside heterogeneous-core-groups by mapping computing tasks to the corresponding hardware layers. This approach not only achieves load balancing at different layers efficiently but also improves the communication rate significantly through hierarchical communication. Additionally, the proposed hybrid parallel approach reduces the interface equation size and thus the solution time, which compensates for the communication overhead that grows with the interface equation size when CDDA is employed. Moreover, distributed sparse storage of large amounts of data is introduced to improve memory access. Benchmark instances solved on the Shenwei-Taihuzhiguang supercomputer show that the proposed method obtains higher speedup and parallel efficiency than CDDA and better extensibility of parallel partition than the two-level parallel computing algorithm (TPCA).
Keywords: heterogeneous multicore hybrid parallel finite element analysis domain decomposition
Parallel scheduling strategy of web-based spatial computing tasks in multi-core environment
7
Authors: 郭明强, Huang Ying, Xie Zhong. High Technology Letters (EI, CAS), 2014, No. 4, pp. 395-400 (6 pages)
To improve the concurrent access performance of a web-based spatial computing system in a cluster, a parallel scheduling strategy for the multi-core environment is proposed, which includes two levels of parallel processing mechanisms. The first evenly allocates tasks to each server node in the cluster, and the second implements load balancing inside a server node. Based on this strategy, a new web-based spatial computing model is designed, focusing on a task response ratio calculation method, a request queue buffer mechanism, and a thread scheduling strategy. Experimental results show that the new model can fully use the multi-core computing advantage of each server node in a concurrent access environment and improves the average hits per second, average I/O hits, CPU utilization, and throughput. A speed-up ratio comparison of the traditional model and the new one shows that the new model performs better. The performance of the multi-core server nodes in the cluster is optimized, and resource utilization and parallel processing capability are enhanced. The more CPU cores are available, the higher the parallel processing capability obtained.
Keywords: parallel scheduling strategy the web-based spatial computing model multi-core environment load balancing
Parallel Processing Design for LTE PUSCH Demodulation and Decoding Based on Multi-Core Processor
8
Authors: Zhang Ziran, Li Jun, Li Changxiao (ZTE Corporation, Shenzhen 518057, P.R. China). ZTE Communications, 2009, No. 1, pp. 54-58 (5 pages)
The Long Term Evolution (LTE) system imposes strict requirements on dispatching delay. Moreover, the very high air interface rate of LTE requires strong processing capability from the devices that process the baseband signals. Consequently, a single-core processor cannot meet the requirements of the LTE system. This paper analyzes how to use multi-core processors to achieve parallel processing of uplink demodulation and decoding in LTE systems and designs an approach to parallel processing. The test results show that this approach works well.
Keywords: core LTE Parallel Processing Design for LTE PUSCH Demodulation and Decoding Based on Multi-Core Processor design
Design and Implementation of a Heterogeneous Database System over the Intranet
9
Authors: 姚领众, 邢艳辉, 宋瀚涛, 杨楠, 郭贵锁, 孙丹. Journal of Beijing Institute of Technology (EI, CAS), 1998, No. 1, pp. 92-99 (8 pages)
Aim: To develop a heterogeneous database united system (HDBUS) that combines local Oracle, Sybase, and SQL Server databases distributed on different servers into a global database and supports global transaction management and parallel query over the Intranet. Methods: In the design and implementation of HDBUS, two important concepts were applied: parallel bridge and heterogeneous tables join. Results and Conclusion: The first concept can be used to process parallel queries across multiple database servers; the second is the key technology of heterogeneous distributed databases.
Keywords: Intranet heterogeneous database parallel bridge heterogeneous tables join
Influence of heterogeneity on rock strength and stiffness using discrete element method and parallel bond model (Cited by: 8)
10
Authors: Spyridon Liakas, Catherine O'Sullivan, Charalampos Saroglou. Journal of Rock Mechanics and Geotechnical Engineering (SCIE, CSCD), 2017, No. 4, pp. 575-584 (10 pages)
The particulate discrete element method (DEM) can be employed to capture the response of rock, provided that appropriate bonding models are used to cement the particles to each other. Simulations of laboratory tests are important to establish the extent to which those models can capture realistic rock behaviors. Hitherto, the focus in such comparison studies has been either on homogeneous specimens or on two-dimensional (2D) models. In situ rock formations are often heterogeneous, so exploring the ability of this type of model to capture heterogeneous material behavior is important to facilitate its use in design analysis. In situ stress states are essentially three-dimensional (3D), and it is therefore important to develop 3D models for this purpose. This paper revisits an earlier experimental study on heterogeneous specimens in which the relative proportions of weaker material (siltstone) and stronger, harder material (sandstone) were varied in a controlled manner. Virtual heterogeneous specimens were created using a 3D DEM model with the parallel bond model. The overall responses, in terms of variations in strength and stiffness with different percentages of weaker material (siltstone), were shown to agree with the experimental observations. There was also good qualitative agreement between the failure patterns observed in the experiments and in the simulations, suggesting that the DEM data enable analysis of the initiation of localizations and micro-fractures in the specimens.
Keywords: Discrete element method (DEM) heterogeneous rocks Strength and stiffness parallel bond model
Scheduling algorithm based on critical tasks in heterogeneous environments (Cited by: 4)
11
Authors: Lan Zhou, Sun Shixin. Journal of Systems Engineering and Electronics (SCIE, EI, CSCD), 2008, No. 2, pp. 398-404, F0003 (8 pages)
Heterogeneous computing is an effective method of high performance computing with many advantages. Task scheduling is a critical issue in heterogeneous environments as well as in homogeneous environments. A number of task scheduling algorithms for homogeneous environments have been proposed, whereas only a few for heterogeneous environments can be found in the literature. A novel task scheduling algorithm for heterogeneous environments, called the heterogeneous critical task (HCT) scheduling algorithm, is presented. By means of the directed acyclic graph and the Gantt graph, the HCT algorithm defines the critical task and the idle time slot. After determining the critical tasks of a given task, the HCT algorithm tentatively duplicates the critical tasks into the idle time slot of the processor that holds the given task, in order to reduce the start time of the given task. To compare the performance of the HCT algorithm with several recently proposed algorithms, a large set of randomly generated applications and the Gaussian elimination application are used. The experimental results show that the HCT algorithm outperforms the other algorithms.
Keywords: list scheduling task duplication task graphs heterogeneous environment parallel processing
An MPI parallel DEM-IMB-LBM framework for simulating fluid-solid interaction problems (Cited by: 2)
12
Authors: Ming Xia, Liuhong Deng, Fengqiang Gong, Tongming Qu, Y.T. Feng, Jin Yu. Journal of Rock Mechanics and Geotechnical Engineering (SCIE, CSCD), 2024, No. 6, pp. 2219-2231 (13 pages)
The high-resolution DEM-IMB-LBM model can accurately describe pore-scale fluid-solid interactions, but its potential for use in geotechnical engineering analysis has not been fully unleashed because of its prohibitive computational cost. To overcome this limitation, a message passing interface (MPI) parallel DEM-IMB-LBM framework is proposed to enhance computational efficiency. The framework uses a static domain decomposition scheme in which the entire computation domain is decomposed into multiple subdomains according to predefined processors. A detailed parallel strategy is employed for both contact detection and hydrodynamic force calculation. In particular, a particle ID re-numbering scheme is proposed to handle particle transitions across subdomain interfaces. Two benchmarks are conducted to validate the accuracy and overall performance of the proposed framework. Subsequently, the framework is applied to simulate scenarios involving multi-particle sedimentation and submarine landslides. The numerical examples demonstrate the robustness and applicability of the MPI parallel DEM-IMB-LBM framework.
Keywords: Discrete element method (DEM) Lattice Boltzmann method (LBM) Immersed moving boundary (IMB) multi-cores parallelization Message passing interface (MPI) CPU Submarine landslides
An Improved Model for Computing-Intensive Tasks on Heterogeneous Workstations
13
Authors: 邬延辉, 陆鑫达. Journal of Shanghai Jiaotong University (Science) (EI), 2004, No. 2, pp. 6-9, 15 (5 pages)
An improved algorithm was proposed that solves cooperative concurrent computing tasks using the idle cycles of a number of high performance heterogeneous workstations interconnected through a high-speed network. To obtain better parallel computation performance, this paper gave a model and an algorithm for task scheduling among heterogeneous workstations in which the costs of loading data, computing, communication, and collecting results are considered. Using this algorithm, an optimal subset of heterogeneous workstations with the shortest parallel execution time can be selected.
Keywords: heterogeneous parallel computing cooperative concurrent computing scheduling
HOPE: a heterogeneity-oriented parallel execution engine for inference on mobiles
14
Authors: XIA Chunwei, ZHAO Jiacheng, CUI Huimin, FENG Xiaobing. High Technology Letters (EI, CAS), 2022, No. 4, pp. 363-372 (10 pages)
It is important to support artificial intelligence (AI) applications efficiently on heterogeneous mobile platforms, and in particular to execute a deep neural network (DNN) model in a coordinated way on multiple computing devices of one mobile platform. This paper proposes HOPE, an end-to-end heterogeneous inference framework running on mobile platforms that distributes the operators of a DNN model to different computing devices. The problem is formalized as an integer linear programming (ILP) problem, and a heuristic algorithm is proposed to determine a near-optimal heterogeneous execution plan. The experimental results demonstrate that HOPE reduces inference latency by up to 36.2% (22.0% on average) compared with MOSAIC, by up to 22.0% (10.2% on average) compared with StarPU, and by up to 41.8% (18.4% on average) compared with μLayer.
Keywords: deep neural network (DNN) mobile heterogeneous scheduler parallel computing
Heterogeneities of grain boundary contact for simulation of laboratory-scale mechanical behavior of granitic rocks
15
Authors: Xiongyu Hu, Marte Gutierrez, Zhiwei Yan. Journal of Rock Mechanics and Geotechnical Engineering (SCIE, CSCD), 2024, No. 7, pp. 2629-2644 (16 pages)
From a practical point of view, grain structure heterogeneities are key parameters that control the rock response, and incorporating them in a quantitative manner remains a challenge. One of the less discussed topics in the context of the grain-based model (GBM) in the particle flow code (PFC) is contact heterogeneity and the appropriate contact model to mimic grain boundary behavior. Generally, the smooth joint (SJ) model and the linear parallel bond (LPB) model are used to simulate grain boundary behavior. However, the literature does not document the suitability of different models for specific problems. Another challenge in implementing GBM in PFC is that only a single bonding parameter is used at the grain boundaries. The aim of this study is to investigate the responses of a laboratory-scale specimen with the SJ and LPB models, considering heterogeneous and homogeneous grain boundary contact parameters. Uniaxial and biaxial compression tests are performed to calibrate the response of Creighton granite. The stress-strain curves, volumetric dilation, inter-crack (crack in the grain boundary) and intra-crack (crack within the grain) development, and failure patterns associated with the different contact models are examined. It was found that both the SJ and LPB models can reproduce the pre-peak behavior observed for a granitic rock type. However, the LPB model is unable to reproduce the post-peak behavior. Because of the large interlocking effect originating from the balls in contact and the ball size in the LPB model, local dilation is induced at the grain boundaries. This overestimates the volumetric dilation and residual shear strength. The LPB model tends to produce discontinuous inter-cracks and stress localization in the rock specimen, resulting in fine fragments at the rock surface during failure.
Keywords: Grain boundary contact Smooth joint (SJ) model Linear parallel bond (LPB) model Contact heterogeneities Particle flow code (PFC) Granitic rock
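Design and Implementation of the Parallel C Language for Domestic Heterogeneous Many-Core Systems (Cited by: 10)
16
Authors: 何王全, 刘勇, 方燕飞, 魏迪, 漆锋滨. 软件学报 (Journal of Software) (EI, CSCD, PKU Core), 2017, No. 4, pp. 764-785 (22 pages)
Heterogeneous many-core architectures offer a very high performance-to-power ratio and have become an important direction in supercomputer architecture. However, the more complex parallelism and memory hierarchies of many-core systems pose great challenges for programming and optimization. Research on parallel programming techniques for many-core systems is therefore of great significance for reducing the programming difficulty of parallel applications on domestic many-core systems and for improving the performance of parallel programs. This paper proposes a multi-mode parallel programming model on a unified architecture, comprising a heterogeneous-fusion accelerated computing model and an autonomous computing model programmed in a homogeneous style. Based on this programming model, the Parallel C language is designed, which can effectively describe the heterogeneous parallelism of domestic many-core systems. Compared with the MPI+X usage pattern on other many-core systems, both programming and system optimization have a global view, with distinctive features in multi-level locality description, one-sided messaging, and compatibility with existing multi-core applications. A Parallel C compilation system is built on Open64 that fully supports both the accelerated and the autonomous computing models, and optimizations including data layout with automatic DMA, compiler-directed thread proxying, and topology-aware collective communication are proposed and implemented. Test results for micro benchmarks and real applications on the Sunway TaihuLight (神威太湖之光) computer system show that the Parallel C language and compilation system have good performance and scalability and can effectively support large-scale applications.
Keywords: heterogeneous many-core; programming model; parallel language; Parallel C; compiler; message passing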
PARALLEL IMPLEMENTATION AND OPTIMIZATION OF THE SEBVHOS ALGORITHM (Cited by: 2)
17
Authors: Li Wen, Guo Li, Yuan Hongxing, Wei Yifang, Guan Hua. Journal of Electronics (China), 2011, No. 3, pp. 277-283 (7 pages)
In this paper, a parallel Surface Extraction from Binary Volumes with Higher-Order Smoothness (SEBVHOS) algorithm is proposed to accelerate SEBVHOS execution. The original SEBVHOS algorithm is parallelized first, and then several performance optimization techniques (loop optimization, cache optimization, false sharing optimization, synchronization overhead optimization, and thread affinity optimization) are used to improve the implementation's performance on multi-core systems. The performance of the parallel SEBVHOS algorithm is analyzed on a dual-core system. The experimental results show that the parallel SEBVHOS algorithm achieves an average speedup of 1.86x. More importantly, the method introduces no additional aliasing artifacts compared with the original SEBVHOS algorithm.
Keywords: multi-core parallel algorithm Performance optimization 3D reconstruction
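To put the reported 1.86x average speedup on a dual-core system in context, the short sketch below computes the corresponding parallel efficiency and, under the assumption that Amdahl's law applies (an assumption not made in the abstract), the implied parallel fraction of the program. The function names are hypothetical.

```python
def parallel_efficiency(speedup: float, cores: int) -> float:
    """Efficiency E = S / n: the fraction of ideal linear speedup achieved."""
    return speedup / cores


def amdahl_parallel_fraction(speedup: float, cores: int) -> float:
    """Invert Amdahl's law S = 1 / ((1 - p) + p / n) to solve for p.

    Assumes the speedup is limited only by a fixed serial fraction,
    which ignores effects such as synchronization overhead.
    """
    if cores < 2:
        raise ValueError("need at least two cores to infer a parallel fraction")
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / cores)


if __name__ == "__main__":
    print(parallel_efficiency(1.86, 2))       # efficiency on two cores
    print(amdahl_parallel_fraction(1.86, 2))  # implied parallel fraction
```

With these inputs the efficiency is about 0.93 and the implied parallel fraction is about 0.92.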
LOW-COST HIGH PERFORMANCE CLUSTER OF WORK-STATIONS BASED ON DYNAMIC LOAD BALANCING FOR PARALLEL DEPTH-FIRST SEARCH(DFS)
18
Authors: Mohammed A. M. Ibrahim (加力), LU Xin-da (陆鑫达). Journal of Shanghai Jiaotong University (Science) (EI), 2002, No. 2, pp. 223-226 (4 pages)
This paper presented an idea for replacing traditionally expensive parallel machines with a heterogeneous cluster of workstations. To emphasise the usability of the cluster-of-workstations platform for parallel and distributed computing, the paper also presented a status report on the effort and experience of implementing dynamic load balancing for parallel tree computation with depth-first search (DFS) in the cluster of workstations project. It compared the speedup obtained on this platform with that obtained on a traditional one. The speedup results show that a cluster of workstations can be a serious alternative to expensive parallel machines.
Keywords: heterogeneous clusters of workstation parallel tree computation DFS dynamic load balancing strategy parallel performance
A Parallel Distributed FEM Computing Circumstance Based on CORBA and Java Techniques
19
Authors: 孔祥安, 詹剑峰, 袁峰. Journal of Modern Transportation, 2000, No. 2, pp. 136-144 (9 pages)
Based on CORBA (Common Object Request Broker Architecture) and Java techniques, a concrete solution is proposed for creating a parallel distributed FEM computing circumstance (PDFCC) on heterogeneous networks supporting the TCP/IP protocol. To verify the feasibility of this solution, the basic frame of PDFCC has been implemented and tested on a LAN (Local Area Network).
Keywords: heterogeneous network circumstance parallel distribution FEM CORBA Java
YHFT-QDSP:High-Performance Heterogeneous Multi-Core DSP
20
Authors: 陈书明, 万江华, 鲁建壮, 刘仲, 孙海燕, 孙永节, 刘衡竹, 刘祥远, 李振涛, 徐毅, 陈小文. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2010, No. 2, pp. 214-224 (11 pages)
Multi-core architectures are widely used to enhance microprocessor performance within a limited increase in time-to-market and power consumption of the chips. Toward the application of high-density data signal processing, this paper presents a novel heterogeneous multi-core architecture digital signal processor (DSP), YHFT-QDSP, with one RISC CPU core and 4 VLIW DSP cores. Through three kinds of interconnection, YHFT-QDSP provides highly efficient message communication between the inner-chip RISC core and DSP cores, among inner-chip DSP cores, and among inter-chip DSP cores. A parallel programming platform is specifically developed for the heterogeneous multi-core architecture of YHFT-QDSP. This parallel programming environment provides a parallel support library and a friendly interface between high-level application software and the multi-core DSP. The 130 nm CMOS custom chip design results in a high-speed and moderate-power design. The results of typical benchmarks show that the interconnection structure of YHFT-QDSP is much better than other related structures and achieves better speedup when the interconnection facilities are used in combination. YHFT-QDSP has now been signed off and manufactured. Future applications of the multi-core chip can be found in 3G wireless base stations, high performance radar, industrial applications, and so on.
Keywords: digital signal processor (DSP) multi-core architecture parallel programming custom design