Journal articles: 2,231 results found
The Implementation of Ray Tracing Algorithm with OpenMP Parallelization
1
Authors: Noor Alnasser, Raghad Alabssi, Batool Faran, Latifah Alessa, Naya Nagy. Journal of Computer and Communications, 2024, Issue 1, pp. 120-130 (11 pages)
Ray tracing is a computer graphics method that renders images realistically. As the name suggests, this technique primarily traces the path of light rays interacting with objects in a scene [1], permitting the calculation of lighting and reflection effects [2]. Because ray tracing is time-consuming, parallelization is needed. One downside of parallelization is the existence of race conditions. In this work, we explore and experiment with different well-known solutions for this race condition. After the introduction and background sections, a brief overview of the topic is followed by a detailed account of how race conditions may occur in the ray tracing algorithm. In the methods and results sections, we use OpenMP to parallelize the ray tracing algorithm with the directives and clauses critical, atomic, and firstprivate. We conclude that critical and atomic are not efficient solutions for producing a good-quality picture, whereas firstprivate succeeded in producing a high-quality picture.
Keywords: parallelization Ray Tracing Parallel Computer Architecture OpenMP
Parallelization of intra prediction algorithm based on array processor (cited by: 5)
2
Authors: Zhu Yun, Jiang Lin, Shi Pengfei, Xie Xiaoyan, Shen Xubang. High Technology Letters (EI, CAS), 2019, Issue 1, pp. 74-80 (7 pages)
Based on the characteristics of intra prediction algorithms, the data dependence and parallelism between intra prediction models are first analyzed. This paper proposes a parallelization method based on the dynamic reconfigurable array processor provided by the project team, and uses data-level parallel (DLP) algorithms in multi-core units. The experimental results show that the Y-component peak signal-to-noise ratio (Y-PSNR) improves by about 10 dB and the time is reduced by 63% compared with the high-efficiency video coding (HEVC) test model HM10.0. The method can effectively reduce video codec time and computational complexity.
Keywords: high-efficiency video coding (HEVC) intra prediction parallelization mapping
Space decomposition based parallelization solutions for the combined finite-discrete element method in 2D (cited by: 4)
3
Authors: T. Lukas, G.G. Schiava D'Albano, A. Munjiza. Journal of Rock Mechanics and Geotechnical Engineering (SCIE, CSCD), 2014, Issue 6, pp. 607-615 (9 pages)
The combined finite-discrete element method (FDEM) belongs to a family of methods of computational mechanics of discontinua. The method is suitable for problems of discontinua, where particles are deformable and can fracture or fragment. The applications of FDEM have spread over a number of disciplines including rock mechanics, where problems like mining, mineral processing or rock blasting can be solved by employing FDEM. In this work, a novel approach for the parallelization of two-dimensional (2D) FDEM aiming at clusters and desktop computers is developed. Dynamic domain decomposition based parallelization solvers covering all aspects of FDEM have been developed. These have been implemented into the open source Y2D software package and have been tested on a PC cluster. The overall performance and scalability of the parallel code have been studied using numerical examples. The results obtained confirm the suitability of the parallel implementation for solving large scale problems. © 2014 Institute of Rock and Soil Mechanics, Chinese Academy of Sciences. Production and hosting by Elsevier B.V. All rights reserved.
Keywords: parallelization Load balancing PC cluster Combined finite-discrete element method (FDEM)
A simplified hardware-friendly contour prediction algorithm in 3D-HEVC and parallelization design (cited by: 1)
4
Authors: JIANG Lin, DUAN Xueyao, XIE Xiaoyan. High Technology Letters (EI, CAS), 2022, Issue 4, pp. 392-400 (9 pages)
After the extension of depth modeling mode 4 (DMM-4) in 3D high efficiency video coding (3D-HEVC), the computational complexity increases sharply, which impacts the real-time performance of video coding. To reduce the computational complexity of DMM-4, a simplified hardware-friendly contour prediction algorithm is proposed in this paper. Based on the similarity between texture and depth map, the proposed algorithm directly codes depth blocks to calculate edge regions to reduce the number of reference blocks. Verification with the test sequence on HTM16.1 shows that the coding time of the proposed algorithm is reduced by 9.42% compared with the original algorithm. To avoid the time cost of serial coding on HTM, a parallelization design of the proposed algorithm based on a reconfigurable array processor (DPR-CODEC) is proposed. The parallelization design reduces the storage access time and configuration time and saves storage cost. Verified on the Xilinx Virtex 6 FPGA, experimental results show that the parallelization design is capable of processing HD 1080p at a speed above 30 frames per second. Compared with related work, the scheme reduces the LUTs by 42.3%, the REG by 85.5% and the hardware resources by 66.7%. The data loading speedup ratio of the parallel scheme can reach 3.4539. On average, the serial/parallel speedup ratio of encoding time across templates of different sizes can reach 2.446.
Keywords: depth modeling mode 4 (DMM-4) contour prediction 3D high efficiency video coding (3D-HEVC) parallelization reconfigurable array processor
ANALYSIS OF MULTIGRID PARALLELIZATION ON MESSAGE PASSING COMPUTERS
5
Authors: Zheng-Quan Xu and Neng-Chao Wang (Department of Computer Science; Department of Mathematics, Huazhong University of Science and Technology, 430074 Wuhan, Hubei, People's Republic of China). Wuhan University Journal of Natural Sciences (CAS), 1996, Issue Z1, pp. 686-691 (6 pages)
This paper studies the complexity of multigrid parallelization on message passing computers. Parallelization is by domain decomposition. An optimal strip decomposition is proposed. With natural ordering of the grid points, the strip decomposition leads to good processor utilization. The efficiency could be significantly improved. Better performance could be achieved by making use of Van der Vorst ordering.
Keywords: multigrid method parallelization COMPLEXITY efficiency
Parallelization and I/O Performance Optimization of a Global Nonhydrostatic Dynamical Core Using MPI
6
Authors: Tiejun Wang, Liu Zhuang, Julian M. Kunkel, Shu Xiao, Changming Zhao. Computers, Materials & Continua (SCIE, EI), 2020, Issue 6, pp. 1399-1413 (15 pages)
The Global-Regional Integrated forecast System (GRIST) is the next-generation weather and climate integrated model dynamic framework developed by the Chinese Academy of Meteorological Sciences. In this paper, we present several changes made to the global nonhydrostatic dynamical (GND) core, which is part of the ongoing prototype of GRIST. The changes, leveraging MPI and PnetCDF techniques, were targeted at the parallelization and performance optimization of the original serial GND core. Meanwhile, some sophisticated data structures and interfaces were designed to flexibly adjust the size of boundary and halo domains according to the variable accuracy in a parallel context. In addition, the I/O performance of PnetCDF decreases as the number of MPI processes increases in our experimental environment. In particular, when the number exceeds 6000, it caused system-wide outages (SWO). Thus, a grouping solution was proposed to overcome that issue. Several experiments were carried out on the supercomputing platform based on Intel x86 CPUs in the National Supercomputing Center in Wuxi. The results demonstrated that the parallel GND core based on the grouping solution achieves good strong scalability and improves performance significantly, while avoiding the SWOs.
Keywords: MPI parallelization performance optimization global nonhydrostatic dynamical core
Wedge template optimization and parallelization of depth map in intra-frame prediction algorithms
7
Authors: Xie Xiaoyan, Wang Yu, Shi Pengfei, Zhu Yun, Deng Junyong, Zhao Huan. High Technology Letters (EI, CAS), 2021, Issue 4, pp. 430-439 (10 pages)
To reduce the computational complexity and storage cost caused by the wedge segmentation algorithm, a scheme for simplifying wedge matching is proposed. It takes advantage of the correlation between the wedge separation line of the depth map and the direction of intra-prediction for 3D high-efficiency video coding (3D-HEVC). According to the difference in wedge segmentation between adjacent edges and opposite edges, a set including only 10 4×4 wedgelet templates is given. By expanding the wedge wave of a certain minimum unit, a simple separation line acquisition method for depth blocks of different sizes is put forward. Furthermore, based on the array processor (DPR-CODEC) developed by the project team, an efficient parallel scheme of the improved wedge segmentation mode prediction is introduced. With this scheme, the prediction unit (PU) size can be changed randomly from 4×4 to 8×8, 16×16, and 32×32, which is more in line with the needs of the HEVC standard. Verified with the test sequence in HTM16.1 and on the Xilinx Virtex-6 field programmable gate array (FPGA) respectively, the experimental results show that the proposed methods save 99.2% of the storage space and 63.94% of the encoding time, and the serial/parallel acceleration ratio of each template reaches 1.84 on average. The coding performance, storage and resource consumption are considered for both.
Keywords: 3D high-efficiency video coding (3D-HEVC) wedge segmentation simplified search template parallelization depth model mode (DMM)
A Parallelization Research for FY Satellite Rainfall Estimate Day Knock off Product Algorithm
8
Authors: Weixia Lin, Xiangang Zhao, Cunqun Fan, Manyun Lin, Lizi Xie. Atmospheric and Climate Sciences, 2018, Issue 2, pp. 248-261 (14 pages)
With the development of satellite remote sensing technology, more and more requirements are placed on the timeliness and stability of the satellite weather service system. The FY satellite rainfall estimate day knock off product algorithm has a long run time, about 20 minutes, which affects the timeliness of the estimated rainfall product. Research and development of parallel optimization algorithms based on the needs of satellite meteorological services, and of their effectiveness in practical applications, are necessary to enhance the high-performance and high-availability capabilities of satellite meteorological services. Aiming at this problem, we started the parallel algorithm research based on an analysis of the precipitation estimation algorithm. Firstly, we explained the steps of the precipitation estimate day knock off product algorithm; secondly, we analyzed the computational load of the four main calculation modules of the algorithm; thirdly, a multithreaded parallel algorithm and MPI parallelization were designed. Finally, the multithreaded parallel and MPI parallelization were implemented. Experimental results show that the multithreaded parallel and MPI parallelization algorithms could greatly improve overall computational efficiency, and that the MPI parallelization mode has a higher operating efficiency. The performance of parallel processing is closely related to the architecture of the computer. From the perspective of service scheduling and product algorithms, the MPI parallelization approach is adopted to improve service quality.
Keywords: RAINFALL ESTIMATE parallelization MULTITHREADING MPI
Parallelization and performance tuning of molecular dynamics code with OpenMP (cited by: 3)
9
Authors: Bai Shuren (白树仁), Ran Liping (冉丽萍), Lu Kuilin (鲁奎麟). Journal of Central South University of Technology, 2006, Issue 3, pp. 260-264 (5 pages)
An OpenMP approach was proposed to parallelize sequential molecular dynamics (MD) code on shared memory machines. When a code is converted from the sequential form to the parallel form, data dependence is a main problem. A traditional sequential molecular dynamics code is analyzed to find the data-dependent segments in it, and two different methods, i.e., the recover method and the backward mapping method, were used to eliminate those data dependencies in order to parallelize this sequential MD code. The performance of the parallelized MD code was analyzed using performance analysis tools. The test results show that the computing size of this code increases sharply from 1 million atoms before parallelization to 20 million atoms after parallelization, and the wall clock time during computing is greatly reduced. Some hot-spots in this code are found and optimized by an improved algorithm. The efficiency of parallel computing is 30% higher than before, calculation time is saved, and larger-scale calculation problems are solved.
Keywords: system analysis molecular dynamics parallel computing performance tuning OpenMP
An Approach to Parallelization of SIFT Algorithm on GPUs for Real-Time Applications (cited by: 4)
10
Authors: Raghu Raj Prasanna Kumar, Suresh Muknahallipatna, John McInroy. Journal of Computer and Communications, 2016, Issue 17, pp. 18-50 (33 pages)
The Scale Invariant Feature Transform (SIFT) algorithm is a widely used computer vision algorithm that detects and extracts local feature descriptors from images. SIFT is computationally intensive, making it infeasible for a single-threaded implementation to extract local feature descriptors for high-resolution images in real time. In this paper, an approach to parallelization of the SIFT algorithm is demonstrated using NVIDIA's Graphics Processing Unit (GPU). The parallelization design for SIFT on GPUs is divided into two stages: a) algorithm design, i.e., generic design strategies which focus on data, and b) implementation design, i.e., architecture-specific design strategies which focus on optimally using GPU resources for maximum occupancy. Increasing memory latency hiding, eliminating branches and data blocking achieve a significant decrease in average computational time. Furthermore, it is observed via Paraver tools that our approach to parallelization, while optimizing for maximum occupancy, allows the GPU to execute the memory-bound SIFT algorithm at optimal levels.
Keywords: Scale Invariant Feature Transform (SIFT) Parallel Computing GPU GPU Occupancy Portable Parallel Programming CUDA
Parallelization of a Branch and Bound Algorithm on Multicore Systems (cited by: 1)
11
Authors: Chia-Shin Chung, James Flynn, Janche Sang. Journal of Software Engineering and Applications, 2012, Issue 8, pp. 621-629 (9 pages)
The general m-machine permutation flowshop problem with the total flow-time objective is known to be NP-hard for m ≥ 2. The only practical method for finding optimal solutions has been branch-and-bound algorithms. In this paper, we present an improved sequential algorithm which is based on a strict alternation of Generation and Exploration execution modes as well as Depth-First/Best-First hybrid strategies. The experimental results show that the proposed scheme exhibits improved performance compared with the algorithm in [1]. More importantly, our method can be easily extended and implemented with lightweight threads to speed up execution. Good speedups can be obtained on shared-memory multicore systems.
Keywords: Parallel Branch and BOUND Multithreaded Programming MULTICORE System PERMUTATION FLOWSHOP Software REUSE
Implementation of OpenMP Parallelization of Rate-Dependent Ceramic Peridynamic Model
12
Authors: Haoran Zhang, Yaxun Liu, Lisheng Liu, Xin Lai, Qiwen Liu, Hai Mei. Computer Modeling in Engineering & Sciences (SCIE, EI), 2022, Issue 10, pp. 195-217 (23 pages)
A rate-dependent peridynamic ceramic model, considering the brittle tensile response, compressive plastic softening and strain-rate dependence, can accurately represent the dynamic response and crack propagation of ceramic materials. However, it also considers the strain-rate dependence and damage accumulation caused by compressive plastic softening during the compression stage, requiring more computational resources for the bond force evaluation and damage evolution. Herein, the OpenMP parallel optimization of the rate-dependent peridynamic ceramic model is investigated. The modules that compute the interactions between material points and update the damage index are vectorized and parallelized. Numerical examples are carried out to simulate the dynamic response and fracture of a ceramic plate under normal impact. Furthermore, the speed-up ratio and computational efficiency with multiple threads are evaluated and discussed to demonstrate the reliability of the parallelized programs. The results reveal that the total wall clock time has been significantly reduced after optimization, showing the promise of the parallelization process in terms of accuracy and stability.
Keywords: Ceramic penetration behavior rate-dependent peridynamic model OpenMP parallel computing
Comparative Study of the Parallelization of the Smith-Waterman Algorithm on OpenMP and Cuda C
13
Authors: Amadou Chaibou, Oumarou Sie. Journal of Computer and Communications, 2015, Issue 6, pp. 107-117 (11 pages)
In this paper, we present parallel programming approaches to calculate the values of the cells in the scoring matrix used in the Smith-Waterman algorithm for sequence alignment. This algorithm, well known in bioinformatics for its applications, is unfortunately time-consuming on a serial computer. We use a formulation based on the anti-diagonal structure of the data. This representation focuses on the parallelizable parts of the algorithm without changing its initial formulation. Approaching the data in this way gives us a more flexible formulation. To examine this approach, we implement it in OpenMP and CUDA C. The performance obtained shows the interest of our paper.
Keywords: CUDA GP-GPU OpenMP PARALLEL COMPUTING Smith-Waterman
Parallelization of Diagnostics for Climate Model Development
14
Authors: Jim McEnerney, Sasha Ames, Cameron Christensen, Charles Doutriaux, Tony Hoang, Jeff Painter, Brian Smith, Zeshawn Shaheen, Dean Williams. Journal of Software Engineering and Applications, 2016, Issue 5, pp. 199-207 (9 pages)
The parallelization of the diagnostics for climate research has been an important goal in the performance testing and improvement of the diagnostics for the Department of Energy's (DOE's) Accelerated Climate Modeling for Energy (ACME) project [1]. The primary mission of the ACME project is to build and test the next-generation Earth system model for current and future generations of computing systems operated by the DOE Office of Science computing facilities, including the envisioned exascale systems foreseen in the early part of the next decade. As part of the underpinning workflow environment, a diagnostics, model metrics, and intercomparison Python framework called UVC Metrics was created to aid in testing and production execution of the model. This framework builds on common methods and similar metrics to accommodate and diagnose individual component models, such as atmosphere, land, ocean, sea ice, and land ice. This paper reports on the initial parallelization of UVC Metrics for the atmosphere model component using two popular frameworks: MPI and SPARK. A timing study is presented to assess the performance of each method; significant improvement was achieved for both frameworks despite I/O contention with NFS. The advantages and disadvantages of each framework are also presented.
Keywords: Climate Diagnostics Parallel MPI SPARK
Parallelization of pseudo-particle modeling and its application in simulating gas-solid fluidization (cited by: 2)
15
Authors: Jianxin Lu, Jiayuan Zhang, Xiaowei Wang, Limin Wang, Wei Ge. Particuology (SCIE, EI, CAS, CSCD), 2009, Issue 4, pp. 317-323 (7 pages)
Pseudo-Particle Modeling (PPM) is a particle method proposed by Ge and Li in 1996 [Ge, W., & Li, J. (1996). Pseudo-particle approach to hydrodynamics of particle-fluid systems, in M. Kwauk & J. Li (Eds.), Proceedings of the 5th international conference on circulating fluidized bed (pp. 260-265). Beijing: Science Press] and has been used to explore the microscopic mechanism in complex particle-fluid systems. But as a particle method, high computational cost remains a main obstacle for its large-scale application; therefore, parallel implementation of this method is highly desirable. Parallelization of two-dimensional PPM was carried out by spatial decomposition in this paper. The time costs of the major functions in the program were analyzed and the program was then optimized for higher efficiency by dynamic load balancing and resetting of particle arrays. Finally, simulation of a gas-solid fluidized bed with 102,400 solid particles and 1.8 × 10^7 pseudo-particles was performed successfully with this code, indicating its scalability in future applications.
Keywords: parallelization Pseudo-particle modeling Gas-solid fluidization Dynamic load balancing
A modular parallelization framework for power flow transfer analysis of large-scale power systems (cited by: 3)
16
Authors: Chuntian CHENG, Bin LUO, Jianjian SHEN, Shengli LIAO. Journal of Modern Power Systems and Clean Energy (SCIE, EI), 2018, Issue 4, pp. 679-690 (12 pages)
Power flow transfer (PFT) analysis under various anticipated faults in advance is important for securing power system operations. In China, PSD-BPA software is the most widely used tool for power system analysis, but its input/output interface is not easily adapted for PFT analysis, which is also difficult due to its computational intensity. To solve this issue and achieve fast and accurate PFT analysis, a modular parallelization framework is developed in this paper. Two major contributions are included. One is several integrated PFT analysis modules, including parameter initialization, fault setting, network integrity detection, reasonableness identification and result analysis. The other is a parallelization technique for enhancing computation efficiency using a Fork/Join framework. The proposed framework has been tested and validated on the IEEE 39 bus reference power system. Furthermore, it has been applied to a practical power network with 11052 buses and 12487 branches in the Yunnan Power Grid of China, providing decision support for large-scale power system analysis.
Keywords: POWER FLOW TRANSFER MODULAR parallelization Fork/Join FRAMEWORK PSD-BPA
Adaptive data-driven parallelization of multi-view video coding on multi-core processor (cited by: 2)
17
Authors: PANG Yi, HU WeiDong, SUN LiFeng, YANG ShiQiang. Science in China (Series F), 2009, Issue 2, pp. 195-205 (11 pages)
Multi-view video coding (MVC) comprises rich 3D information and is widely used in new visual media, such as 3DTV and free viewpoint TV (FTV). However, even with mainstream computer manufacturers migrating to multi-core processors, the huge computational requirement of MVC currently prohibits its wide use in consumer markets. In this paper, we demonstrate the design and implementation of the first parallel MVC system on the Cell Broadband Engine™ processor, which is a state-of-the-art multi-core processor. We propose a task-dispatching algorithm for MVC which is adaptive and data-driven at the frame level, and implement a parallel multi-view video decoder with a modified H.264/AVC codec on a real machine. This approach provides scalable speedup (up to 16 times on sixteen cores) through proper local store management, utilization of code locality and SIMD improvement. Decoding speed, speedup and core utilization rate are reported in the experimental results.
Keywords: adaptive data-driven multi-view video coding Cell Broadband Engine™ Processor parallelization
Flow and heat transfer characteristics of regenerative cooling parallel channel
18
Authors: JU Yinchao, LIU Xiaoyong, XU Guoqiang, DONG Bensi. 推进技术 (Journal of Propulsion Technology; PKU Core), 2025, Issue 1, pp. 163-171 (9 pages)
Due to the complex high-temperature characteristics of hydrocarbon fuel, the long-term working process of parallel channel structures under variable working conditions, especially under high heat-mass ratio, has not been systematically studied. In this paper, the heat transfer and flow characteristics of the relevant high-temperature fuels are studied using a typical engine parallel channel structure. Through numerical simulation and systematic experimental verification, the flow and heat transfer characteristics of parallel channels under typical working conditions are obtained, and the effectiveness of the high-precision calculation method is preliminarily established. The stabilization time required for hot start of a regenerative cooling engine is found to be about 50 s, and the flow resistance of the parallel channel structure first increases and then decreases as the equivalence ratio (denoted Φ below) increases, with a flow resistance peak in the range Φ = 0.5 to 0.8. This is mainly caused by the coupling of the high-temperature physical properties, flow rate and pressure of the fuel in the parallel channels. The cooling and heat transfer characteristics of parallel channels under some high heat-mass ratio conditions are also obtained, and the main factors affecting heat transfer in parallel channels, such as increasing surface roughness to enhance heat transfer, are identified. In the experiment, when Φ is less than 0.9, local heat transfer enhancement and deterioration can be clearly observed, and the temperature rise of local structures exceeds 200 ℃, which poses a risk of structural damage. Therefore, the long-term reliability of parallel channel structures under high heat-mass ratio conditions should be fully considered in structural design.
Keywords: Regenerative cooling Heat transfer Flow resistance ENGINE Parallel channel
Parallelization of motion compensation algorithm based on reconfigurable video array processor
19
Authors: Xie Xiaoyan, Lei Xiang, Zhou Jinna, Zhu Yun, Jiang Lin. The Journal of China Universities of Posts and Telecommunications (EI, CSCD), 2019, Issue 6, pp. 83-93 (11 pages)
The new encoding tools of high efficiency video coding (HEVC) make the interpolation operation in motion compensation (MC) more complex for better video compression, but impose higher requirements on the computational efficiency and control logic of the hardware architecture. The reconfigurable array processor can balance computational efficiency and flexible switching of algorithms very well. By mining the data dependency and parallelism among interpolation operations, this paper presents a parallelization method based on the dynamic reconfigurable array processor proposed by the project team. The number of pixels loaded from the external memory is reduced significantly by multiplexing the common data in the previous reference block and the current reference block. Flexible switching of variable block operation is realized by using a dynamic reconfiguration mechanism. A 16×16 processing element (PE) array is used to dynamically process block sizes from 4×4 to 64×64. The experimental results show that the reference block update speed is increased by 39.9%. In the case of an array size of 16 PEs, the number of pixels processed in parallel reaches 16.
Keywords: HEVC MC parallelization RECONFIGURABLE
Parallelization and sustainability of distributed genetic algorithms on many-core processors
20
Authors: Yuji Sato, Mikiko Sato. International Journal of Intelligent Computing and Cybernetics (EI), 2014, Issue 1, pp. 2-23 (22 pages)
Purpose: The purpose of this paper is to propose a fault-tolerant technology for increasing the durability of application programs when evolutionary computation is performed by fast parallel processing on many-core processors such as graphics processing units (GPUs) and multi-core processors (MCPs). Design/methodology/approach: For distributed genetic algorithm (GA) models, the paper proposes a method where an island's ID number is added to the header of data transferred by this island for use in fault detection. Findings: The paper shows that the processing time of the proposed idea is practically negligible in applications, that an optimal solution can be obtained even with a single stuck-at fault or a transient fault, and that increasing the number of parallel threads makes the system less susceptible to faults. Originality/value: The study described in this paper is a new approach to increasing the sustainability of application programs using distributed GA on GPUs and MCPs.
Keywords: Evolutionary computation Genetic algorithms Fault identification Many-core processors parallelization