56 articles found
Research on Multi-Core Processor Analysis for WCET Estimation
1
Authors: LUO Haoran, HU Shuisong, WANG Wenyong, TANG Yuke, ZHOU Junwei. Source: ZTE Communications, 2024, No. 1, pp. 87-94 (8 pages)
Real-time system timing analysis is crucial for estimating the worst-case execution time (WCET) of a program. To achieve this, static or dynamic analysis methods are used, along with targeted modeling of the actual hardware system. This literature review focuses on calculating WCET for multi-core processors, providing a survey of traditional methods used for static and dynamic analysis and highlighting the major challenges that arise from different program execution scenarios on multi-core platforms. This paper outlines the strengths and weaknesses of current methodologies and offers insights into prospective areas of research on multi-core analysis. By presenting a comprehensive analysis of the current state of research on multi-core processor analysis for WCET estimation, this review aims to serve as a valuable resource for researchers and practitioners in the field.
Keywords: real-time system; worst-case execution time (WCET); multi-core analysis
Shared Cache Based on Content Addressable Memory in a Multi-Core Architecture
2
Authors: Allam Abumwais, Mahmoud Obaid. Source: Computers, Materials & Continua (SCIE, EI), 2023, No. 3, pp. 4951-4963 (13 pages)
Modern shared-memory multi-core processors typically have shared Level 2 (L2) or Level 3 (L3) caches. Cache bottlenecks and replacement strategies are the main problems of such architectures, where multiple cores try to access the shared cache simultaneously. The main problem in improving memory performance is the shared cache architecture and cache replacement. This paper documents the implementation of a Dual-Port Content Addressable Memory (DPCAM) and a modified Near-Far Access Replacement Algorithm (NFRA), which was previously proposed as a shared L2 cache layer in a multi-core processor. Standard Performance Evaluation Corporation (SPEC) Central Processing Unit (CPU) 2006 benchmark workloads are used to evaluate the benefit of the shared L2 cache layer. Results show improved performance of the multi-core processor's DPCAM and NFRA algorithms, corresponding to a higher number of concurrent accesses to shared memory. The new architecture significantly increases system throughput and records performance improvements of up to 8.7% on various types of SPEC 2006 benchmarks. The miss rate is also improved by about 13%, with some exceptions in the sphinx3 and bzip2 benchmarks. These results could open a new window for solving the long-standing problems with shared cache in multi-core processors.
Keywords: multi-core processor; shared cache; content addressable memory; dual port CAM; replacement algorithm; benchmark program
Hybridization of Metaheuristics Based Energy Efficient Scheduling Algorithm for Multi-Core Systems
3
Authors: J. Jean Justus, U. Sakthi, K. Priyadarshini, B. Thiyaneswaran, Masoud Alajmi, Marwa Obayya, Manar Ahmed Hamza. Source: Computer Systems Science & Engineering (SCIE, EI), 2023, No. 1, pp. 205-219 (15 pages)
The development of multi-core systems (MCS) has considerably improved the existing technologies in the field of computer architecture. The MCS comprises several processors that are heterogeneous in resource capacities, working environments, topologies, and so on. The existing multi-core technology unlocks additional research opportunities for energy minimization through effective task scheduling. At the same time, the task scheduling process is yet to be explored in multi-core systems. This paper presents a new hybrid genetic algorithm (GA) with a krill herd (KH) based energy-efficient scheduling technique for multi-core systems (GAKH-SMCS). The goal of the GAKH-SMCS technique is to schedule tasks in such a way as to achieve faster completion time and minimum energy dissipation. The GAKH-SMCS model involves a multi-objective fitness function using four parameters, namely makespan, processor utilization, speedup, and energy consumption, to schedule tasks proficiently. The performance of the GAKH-SMCS model has been validated against two datasets, namely a random dataset and a benchmark dataset. The experimental outcome confirmed the effectiveness of the GAKH-SMCS model in terms of makespan, processor utilization, speedup, and energy consumption. The overall simulation results showed that the presented GAKH-SMCS model achieves energy efficiency through an optimal task scheduling process in MCS.
Keywords: task scheduling; energy efficiency; multi-core systems; fitness function; makespan
An MPI parallel DEM-IMB-LBM framework for simulating fluid-solid interaction problems
4
Authors: Ming Xia, Liuhong Deng, Fengqiang Gong, Tongming Qu, Y.T. Feng, Jin Yu. Source: Journal of Rock Mechanics and Geotechnical Engineering (SCIE, CSCD), 2024, No. 6, pp. 2219-2231 (13 pages)
The high-resolution DEM-IMB-LBM model can accurately describe pore-scale fluid-solid interactions, but its potential for use in geotechnical engineering analysis has not been fully unleashed due to its prohibitive computational costs. To overcome this limitation, a message passing interface (MPI) parallel DEM-IMB-LBM framework is proposed, aimed at enhancing computation efficiency. This framework utilises a static domain decomposition scheme, with the entire computation domain being decomposed into multiple subdomains according to predefined processors. A detailed parallel strategy is employed for both contact detection and hydrodynamic force calculation. In particular, a particle ID re-numbering scheme is proposed to handle particle transitions across sub-domain interfaces. Two benchmarks are conducted to validate the accuracy and overall performance of the proposed framework. Subsequently, the framework is applied to simulate scenarios involving multi-particle sedimentation and submarine landslides. The numerical examples effectively demonstrate the robustness and applicability of the MPI parallel DEM-IMB-LBM framework.
Keywords: discrete element method (DEM); lattice Boltzmann method (LBM); immersed moving boundary (IMB); multi-cores parallelization; message passing interface (MPI); CPU; submarine landslides
Performance Enhancement of XML Parsing Using Regression and Parallelism
5
Authors: Muhammad Ali, Minhaj Ahmad Khan. Source: Computer Systems Science & Engineering, 2024, No. 2, pp. 287-303 (17 pages)
Extensible Markup Language (XML) files, widely used for storing and exchanging information on the web, require efficient parsing mechanisms to improve the performance of applications. With the existing Document Object Model (DOM) based parsing, performance degrades due to sequential processing and large memory requirements, so an efficient XML parser is required to mitigate these issues. In this paper, we propose a Parallel XML Tree Generator (PXTG) algorithm for accelerating the parsing of XML files and a Regression-based XML Parsing Framework (RXPF) that analyzes and predicts performance through profiling, regression, and code generation for efficient parsing. The PXTG algorithm is based on dividing the XML file into n parts and producing n trees in parallel. The profiling phase of the RXPF framework produces a dataset by measuring the performance of various parsing models, including StAX, SAX, DOM, JDOM, and PXTG, on different cores using multiple file sizes. The regression phase produces the prediction model, based on which the final code for efficient parsing of XML files is produced in the code generation phase. The RXPF framework has shown a significant performance improvement, varying from 9.54% to 32.34%, over other existing models used for parsing XML files.
Keywords: regression; parallel parsing; multi-cores; XML
Enrichment of Fetal Nucleated Red Blood Cells by Multi-core Magnetic Composite Particles for Non-invasive Prenatal Diagnosis (cited by: 1)
6
Authors: PAN Ying, WANG Qing, HUANG Wen-jun, QIAO Feng-li, LIU Yu-ping, ZHANG Yu-cheng, HAI De-yang, DU Ying-ting, WANG Wen-yue, ZHANG Ai-chen. Source: Chemical Research in Chinese Universities (SCIE, CAS, CSCD), 2012, No. 3, pp. 443-448 (6 pages)
A novel kind of multi-core magnetic composite particle, the surfaces of which were respectively modified with goat-anti-mouse IgG and antitransferrin receptor (anti-CD71), was prepared. The fetal nucleated red blood cells (FNRBCs) in the peripheral blood of a gravida were rapidly and effectively enriched and separated by the modified multi-core magnetic composite particles in an external magnetic field. The obtained FNRBCs were used to identify the fetal sex by means of the fluorescence in situ hybridization (FISH) technique. The results demonstrate that the multi-core magnetic composite particles meet the requirements for the enrichment and separation of FNRBCs at low concentration, and the accuracy of detection for the diagnosis of fetal sex reached 95%. Moreover, the obtained FNRBCs were applied to the non-invasive diagnosis of Down syndrome, and chromosome 3p21 was detected. These facts indicate that the novel method based on multi-core magnetic composite particles is simple, reliable, and cost-effective, and has opened up vast vistas for potential application in clinical non-invasive prenatal diagnosis.
Keywords: fetal nucleated red blood cell (FNRBC); prenatal diagnosis; non-invasive; multi-core magnetic composite particle
Variation-Aware Task Mapping on Homogeneous Fault-Tolerant Multi-Core Network-on-Chips
7
Authors: Chengbo Xue, Yougen Xu, Yue Hao, Wei Gao. Source: Journal of Beijing Institute of Technology (EI, CAS), 2019, No. 3, pp. 497-509 (13 pages)
A variation-aware task mapping approach is proposed for multi-core network-on-chips with redundant cores, which includes both design-time mapping and run-time scheduling algorithms. Firstly, a genetic task mapping algorithm is applied during the design stage to generate multiple task mapping solutions that cover a maximum range of chips. Then, at run time, one optimal task mapping solution is selected. Additionally, logical cores are mapped to physically available cores. Both core asymmetry and topological changes are considered in the proposed approach. Experimental results show that the performance yield of the proposed approach is 96% on average, and the communication cost, power consumption, and peak temperature are all optimized without loss of performance yield.
Keywords: process variation; task mapping; fault-tolerant; network-on-chips; multi-core
Theoretical Analysis on Inter-Core Crosstalk Suppression Model for Multi-Core Fiber
8
Authors: Jiajing Tu, Xueqin Xie, Keping Long. Source: China Communications (SCIE, CSCD), 2016, No. 8, pp. 192-197 (6 pages)
Decreasing the mode coupling coefficient (κ) is an effective approach to suppressing inter-core crosstalk. Therefore, we deploy a low-index rod or a rectangular trench in the middle of two neighboring cores to reduce κ, so that the overlap of the electric field distributions can be suppressed. We also propose an approximate analytical solution (AAS) for κ of two crosstalk suppression models: two cores with one low-index rod deployed in the middle, and two cores with one low-index rectangular trench deployed in the middle. We then modify the results obtained by the AAS, and the modified results are shown to agree well with those obtained by the finite element method (FEM). Therefore, the modified AAS can be used to obtain the inter-core crosstalk for the two above-mentioned models quickly.
Keywords: multi-core fiber; crosstalk; mode coupling coefficient
Modeling of Few-Mode Multi-Core Optical Fiber Channel Based on Non-Uniform Mode Field Distribution
9
Authors: Hang Zhou, Bo Liu, Fu Wang, Dandan Song, Li Li, Xiangjun Xin, Qinghua Tian, Qi Zhang, Feng Tian. Source: China Communications (SCIE, CSCD), 2016, No. 8, pp. 184-191 (8 pages)
In this paper, the factors that affect the few-mode multi-core optical fiber channel are analyzed comprehensively. Theoretical modeling and computer simulation of the information channel are carried out, and a modeling scheme for the few-mode multi-core optical fiber channel based on non-uniform mode field distribution is put forward. The simulation results show that the proposed modeling scheme not only increases system capacity exponentially through the few-mode multi-core optical fiber channel but also achieves better transmission performance than a uniform channel of the same type.
Keywords: few-mode multi-core optical fiber channel; non-uniform channel; channel modeling
RF-TSV DESIGN, MODELING AND APPLICATION FOR 3D MULTI-CORE COMPUTER SYSTEMS
10
Authors: Yu Le, Yang Haigang, Xie Yuanlu. Source: Journal of Electronics (China), 2012, No. 5, pp. 431-444 (14 pages)
State-of-the-art multi-core computer systems are based on Very Large Scale three-Dimensional (3D) Integrated circuits (VLSI). To provide high-speed vertical data transmission in such 3D systems, efficient Through-Silicon Via (TSV) technology is critically important. In this paper, various Radio Frequency (RF) TSV designs and models are proposed. Specifically, the Cu-plug TSV with surrounding ground TSVs is used as the baseline structure. For further improvement, the dielectric coaxial and novel air-gap coaxial TSVs are introduced. Using the empirical parameters of these coaxial TSVs, simulation results are obtained demonstrating that these coaxial RF-TSVs can provide cut-off frequencies two orders of magnitude higher than those of the Cu-plug TSVs. Based on these new RF-TSV technologies, we propose a novel 3D multi-core computer system as well as new architectures for managing the interfaces between RF and baseband circuits. Taking into consideration the scaling down of IC manufacturing technologies, predictions are made for the performance of future generations of circuits. With simulation results indicating that energy per bit and area per bit are reduced by 7% and 11% respectively, we conclude that the proposed method is a worthwhile guideline for the design of future multi-core computer ICs.
Keywords: Three Dimensional (3D); Very Large Scale Integrated circuits (VLSI); Radio Frequency (RF); Through-Silicon Vias (TSVs); multi-core computer technology
The Channel Maps and the Position-Velocity Diagrams of Multi-core Structure of Cepheus C
11
Authors: Yu Zhi-yao (1, 2), Jiang Dong-rong (1, 2). Affiliations: (1) Shanghai Astronomical Observatory, The Chinese Academy of Sciences, Shanghai 200030, China; (2) National Astronomical Observatories, The Chinese Academy of Sciences, China. E-mail: zyyu@center.shao.ac. Source: 《天文研究与技术》 (CSCD), 1999, No. S1, pp. 218-221 (4 pages)
The first important problem in the star-forming process is the formation of protostar cores in star-forming regions of molecular clouds. The multi-core structure in star-forming regions is related to the formation of protostar cores. The molecular radiation of C18O (J = 1-0) in Cepheus C has been observed. The C18O (J = 1-0) observations form the basis for a study of the cloud cores and the star formation activity in the cores of Cepheus C. To study the multi-core structure of C18O (J = 1-0) in Cepheus C, the channel maps and the position-velocity diagrams of C18O (J = 1-0) are shown. The maps show that the contour level and distribution size of the three cores in Cepheus C depend strongly on the channel velocity. The C18O (J = 1-0) molecules in core b are distributed across all velocity channels, which differs markedly from core a and core c. The C18O (J = 1-0) molecules in core a and core c are mostly distributed in channels blue-shifted relative to the peak velocity; on the red-shifted side they appear only in the -10.0 to -9.5 km/s channel, where both the contour level and the distribution size in the channel map are small. According to the position-velocity diagrams, the asymmetry in the distribution of the blue-shifted and red-shifted components should reflect the asymmetry of the profile. The diagrams also show that the contour level and the distribution size of the three cores differ from each other. The results from the maps and the diagrams are consistent with each other.
Keywords: maps; core; The Channel Maps and the Position-Velocity Diagrams of Multi-core Structure of Cepheus C
Parallel Processing Design for LTE PUSCH Demodulation and Decoding Based on Multi-Core Processor
12
Authors: Zhang Ziran, Li Jun, Li Changxiao (ZTE Corporation, Shenzhen 518057, P.R. China). Source: ZTE Communications, 2009, No. 1, pp. 54-58 (5 pages)
The Long Term Evolution (LTE) system imposes strict requirements on dispatching delay. Moreover, the very high air interface rate of LTE demands strong processing capability from the devices that process the baseband signals. Consequently, a single-core processor cannot meet the requirements of the LTE system. This paper analyzes how to use multi-core processors to achieve parallel processing of uplink demodulation and decoding in LTE systems and designs an approach to parallel processing. The test results show that this approach works well.
Keywords: core; LTE; design; Parallel Processing Design for LTE PUSCH Demodulation and Decoding Based on Multi-Core Processor
Performance Analysis for EDMA Based on TI C6678 Multi-core DSP
13
Source: 《信息工程期刊(中英文版)》, 2015, No. 3, pp. 73-77 (5 pages)
Frequent data exchange among all kinds of memories has become inevitable in modern embedded software design. To improve the data throughput and computation capability of embedded systems, most embedded devices introduce Enhanced Direct Memory Access (EDMA) data transfer technology. The TMS320C6678 is a multi-core DSP produced by Texas Instruments (TI). The chip contains ten EDMA transmission controllers for configuration, and data transmissions can be performed between any two pieces of storage at the same time. This paper expounds the working mechanism of EDMA on the multi-core DSP TMS320C6678. In addition, multiple data sets are provided, and the bottleneck limiting data throughput is analyzed and solved.
Keywords: EDMA; multi-core DSP; high-speed; data throughput
A Scalable Interconnection Scheme in Many-Core Systems
14
Authors: Allam Abumwais, Mujahed Eleyat. Source: Computers, Materials & Continua (SCIE, EI), 2023, No. 10, pp. 615-632 (18 pages)
Recent multi-core architectures may have a relatively large number of cores, typically ranging from tens to hundreds, and are therefore called many-core systems. Such systems require an efficient interconnection network that addresses two major problems. First, the overhead of power and area cost and its effect on scalability. Second, the high access latency caused by multiple cores' simultaneous accesses to the same shared module. This paper presents an interconnection scheme called N-conjugate Shuffle Clusters (NCSC), based on a multi-core multi-cluster architecture, to reduce the overhead of these problems. NCSC eliminates the need for router devices and their complexity and hence reduces the power and area costs. It also redesigns and distributes the shared caches across the interconnection network to increase the capacity for simultaneous access and hence reduce the access latency. For intra-cluster communication, Multi-port Content Addressable Memory (MPCAM) is used. The experimental results using four clusters of four cores each indicated that the average access latency for a write process is 1.14785±0.04532 ns, which is nearly equal to the latency of a write operation in MPCAM. Moreover, it was demonstrated that the average read latency is 1.26226±0.090591 ns within a cluster and around 1.92738±0.139588 ns for read access between cores from different clusters.
Keywords: many-core; multi-core; N-conjugate shuffle; multi-port content addressable memory; interconnection network
Comparative Evaluation of Data Mining Algorithms in Breast Cancer
15
Authors: Fuad A.M. Al-Yarimi. Source: Computers, Materials & Continua (SCIE, EI), 2023, No. 10, pp. 633-645 (13 pages)
Unchecked breast cell growth is the cause of breast cancer, one of the leading causes of death in women globally. The only method to avoid breast cancer-related deaths is through early detection and treatment. The proper classification of malignancies is one of the most significant challenges in the medical industry. Due to their high precision and accuracy, machine learning techniques are extensively employed for identifying and classifying various forms of cancer. The author of this review studied and implemented several data mining algorithms and compared them to present the parameters and accuracy of the various algorithms for breast cancer diagnosis, so that clinicians might use them to accurately detect cancer cells early on. This article introduces several techniques, including support vector machine (SVM), K star (K*) classifier, Additive Regression (AR), Back Propagation Neural Network (BP), and Bagging. These algorithms are trained using a set of data that contains tumor parameters from breast cancer patients. Comparing the results, the author found that Support Vector Machine and Bagging had the highest precision and accuracy, respectively. The review also assesses the number of studies that provide machine learning techniques for breast cancer detection.
Keywords: many-core; multi-core; N-conjugate shuffle; multi-port content addressable memory; interconnection network
Discrete Event Simulation-Based Evaluation of a Single-Lane Synchronized Dual-Traffic Light Intersections
16
Authors: Chimezie Calistus, Ogharandukun Martin, Abdullahi Monday, Essien Joe. Source: Journal of Computer and Communications, 2023, No. 10, pp. 82-100 (19 pages)
This research involved an exploratory evaluation of the dynamics of vehicular traffic on a road network across two traffic light-controlled junctions. The study uses the case of a one-kilometer road system modelled in AnyLogic version 8.8.4. AnyLogic is a multi-paradigm simulation tool that supports three main simulation methodologies: discrete event simulation, agent-based modeling, and system dynamics modeling. The system is used to evaluate the implications of stochastic time-based vehicle variables for the general efficiency of road use. Road use efficiency in this model is defined as the percentage of entering vehicles that exit the model within a one-hour simulation period. The study deduced that, for the model under review, an increase in entry point time delay has a dominant influence on the efficiency of road use, far beyond any other consideration. This study therefore presents a novel approach that leverages Discrete Event Simulation to facilitate efficient road management with a focus on optimum road use efficiency. The study also determined that including appropriate random parameters to reflect road use activities at critical event points in a simulation can help represent authentic traffic models effectively. The AnyLogic simulation software leverages the Classic DEVS and Parallel DEVS formalisms to achieve these objectives.
Keywords: multi-core processing; distributed computing; event-driven modelling; discrete event simulation; data analysis and visualization
Large-Eddy Simulation of Airflow over a Steep, Three-Dimensional Isolated Hill with Multi-GPUs Computing
17
Authors: Takanori Uchida. Source: Open Journal of Fluid Dynamics, 2018, No. 4, pp. 416-434 (19 pages)
The present research attempted a Large-Eddy Simulation (LES) of airflow over a steep, three-dimensional isolated hill by using the latest multi-cores multi-CPUs systems. As a result, it was found that 1) turbulence simulations using approximately 50 million grid points are feasible and 2) the use of this system achieved a high computation speed, which exceeded the speed of parallel computation attained by a single CPU on one of the latest supercomputers. Furthermore, LES was conducted by using multi-GPUs systems. These simulations revealed the following findings: 1) the multi-GPUs environment using the NVIDIA® Tesla M2090 or the M2075 could simulate turbulence in a model with as many as approximately 50 million grid points; 2) the computation speed achieved by the multi-GPUs environments exceeded that of parallel computation using four to six CPUs of one of the latest supercomputers.
Keywords: LES; isolated hill; multi-cores multi-CPUs computing; multi-GPUs computing
Implementing Delay Multiply and Sum Beamformer on a Hybrid CPU-GPU Platform for Medical Ultrasound Imaging Using OpenMP and CUDA (cited by: 2)
18
Authors: Ke Song, Paul Liu, Dongquan Liu. Source: Computer Modeling in Engineering & Sciences (SCIE, EI), 2021, No. 9, pp. 1133-1150 (18 pages)
A novel beamforming algorithm named Delay Multiply and Sum (DMAS), which excels at enhancing the resolution and contrast of ultrasonic images, has recently been proposed. However, there are nested loops in this algorithm, so its computational complexity is higher than that of the Delay and Sum (DAS) beamformer, which is widely used in industry. Thus, we propose a simple vector-based method to lower its complexity. The key point is to transform the nested loops into several vector operations, which can be efficiently implemented on many parallel platforms, such as Graphics Processing Units (GPUs) and multi-core Central Processing Units (CPUs). Consequently, we implement this algorithm on such a platform. To maximize the use of computing power, we use the GPU and the multi-core CPU together. The platform used in our test is a low-cost Personal Computer (PC) in which a GPU and a multi-core CPU are installed. The results show that the hybrid use of a CPU and a GPU yields a significant performance improvement compared with using a GPU or a multi-core CPU alone. The performance of the hybrid system is about 47%-63% higher than that of a single GPU. When 32 elements are used in receiving, the frame rate can reach roughly 30 fps. In the best case, the frame rate can be increased to 40 fps.
Keywords: beamforming; delay multiply and sum; graphics processing unit; multi-core central processing unit
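As a rough illustration of how the nested DMAS loops can collapse into a few vector-style reductions, the sketch below uses a standard algebraic identity. It is not taken from the paper and may differ from the authors' method. It assumes the usual DMAS definition, in which each pair of already-delayed channel samples s_i, s_j (i < j) contributes sign(s_i·s_j)·sqrt(|s_i·s_j|); the function names are hypothetical.

```python
import math


def signed_sqrt(x):
    """Return sign(x) * sqrt(|x|)."""
    return math.copysign(math.sqrt(abs(x)), x)


def dmas_nested(samples):
    """Reference DMAS output for one pixel: O(N^2) loop over channel pairs."""
    total = 0.0
    n = len(samples)
    for i in range(n - 1):
        for j in range(i + 1, n):
            product = samples[i] * samples[j]
            total += math.copysign(math.sqrt(abs(product)), product)
    return total


def dmas_vectorized(samples):
    """Same value in O(N), using sum_{i<j} r_i*r_j = ((sum r)^2 - sum r^2) / 2,
    where r_i = sign(s_i) * sqrt(|s_i|)."""
    roots = [signed_sqrt(s) for s in samples]
    sum_roots = sum(roots)
    sum_squares = sum(r * r for r in roots)
    return 0.5 * (sum_roots * sum_roots - sum_squares)


# Assumed input: delayed samples from four receive channels for one pixel.
delayed = [1.0, -4.0, 9.0, 0.0]
print(dmas_nested(delayed), dmas_vectorized(delayed))
```

Both functions agree on the sample input. The second form consists only of element-wise operations and sums, which is the kind of computation that maps naturally onto GPU kernels or SIMD-friendly CPU loops.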
PARALLEL IMPLEMENTATION AND OPTIMIZATION OF THE SEBVHOS ALGORITHM (cited by: 2)
19
Authors: Li Wen, Guo Li, Yuan Hongxing, Wei Yifang, Guan Hua. Source: Journal of Electronics (China), 2011, No. 3, pp. 277-283 (7 pages)
In this paper, a parallel Surface Extraction from Binary Volumes with Higher-Order Smoothness (SEBVHOS) algorithm is proposed to accelerate SEBVHOS execution. The original SEBVHOS algorithm is parallelized first, and then several performance optimization techniques, namely loop optimization, cache optimization, false sharing optimization, synchronization overhead optimization, and thread affinity optimization, are used to improve the implementation's performance on multi-core systems. The performance of the parallel SEBVHOS algorithm is analyzed on a dual-core system. The experimental results show that the parallel SEBVHOS algorithm achieves an average speedup of 1.86x. More importantly, our method does not introduce additional aliasing artifacts compared with the original SEBVHOS algorithm.
Keywords: multi-core; parallel algorithm; performance optimization; 3D reconstruction
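To put the reported 1.86x average speedup on a dual-core system in context, the following back-of-the-envelope sketch computes the parallel efficiency and, assuming Amdahl's law applies (an assumption not made in the abstract), the parallelizable fraction that such a speedup would imply. The function names are illustrative only.

```python
def parallel_efficiency(speedup: float, cores: int) -> float:
    """Fraction of ideal linear scaling that was achieved."""
    return speedup / cores


def amdahl_parallel_fraction(speedup: float, cores: int) -> float:
    """Solve Amdahl's law S = 1 / ((1 - p) + p / n) for p (requires n > 1)."""
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / cores)


# Figures from the abstract: 1.86x average speedup on two cores.
efficiency = parallel_efficiency(1.86, 2)
fraction = amdahl_parallel_fraction(1.86, 2)
print(f"efficiency = {efficiency:.2f}")
print(f"implied parallel fraction = {fraction:.3f}")
```

Under these assumptions the efficiency is 0.93 and the implied parallel fraction is about 0.925. Amdahl's law ignores cache and synchronization effects, which are exactly what the listed optimizations target, so the second figure is only indicative.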
Numerical analysis of photonic crystal fiber with chalcogenide core tellurite cladding composite microstructure (cited by: 1)
20
Authors: Liu Shuo (刘硕), Li Shuguang (李曙光). Source: Chinese Physics B (SCIE, EI, CAS, CSCD), 2013, No. 7, pp. 218-222 (5 pages)
Several kinds of photonic crystal fibers with a chalcogenide-core, tellurite-cladding composite microstructure are proposed. The multi-core photonic crystal fiber can reach a higher nonlinearity coefficient and a larger effective mode area. The small single-core photonic crystal fiber has a very high nonlinearity coefficient: at the wavelength λ = 0.8 μm the nonlinearity coefficient reaches 31.37053 W^-1·m^-1, and at λ = 1.55 μm it is 11.19686 W^-1·m^-1.
Keywords: multi-core photonic crystal fiber; effective mode area; nonlinearity coefficient; dispersion