Frequent data exchange among all kinds of memories has become an inevitable phenomenon in the process of modern embeddedsoftware design. In order to improve the ability of the embedded system data's throughput and co...Frequent data exchange among all kinds of memories has become an inevitable phenomenon in the process of modern embeddedsoftware design. In order to improve the ability of the embedded system data's throughput and computation, most embeddeddevices introduce Enhanced Direct Memory Access (EDMA) data transfer technology. TMS320C6678 is a multi-core DSPproduced by Texas Instruments (TI). There are ten EDMA transmission controllers in the chip for configuration and datatransmissions are allowed to be performed between any two pieces of storage at the same time. This paper expounds the workingmechanism of EDMA based on multi-core DSP TMS320C6678. At the same time, multiple data sets are provided and thebottleneck of limiting data throughout is analyzed and solved.展开更多
Real-time system timing analysis is crucial for estimating the worst-case execution time(WCET)of a program.To achieve this,static or dynamic analysis methods are used,along with targeted modeling of the actual hardwar...Real-time system timing analysis is crucial for estimating the worst-case execution time(WCET)of a program.To achieve this,static or dynamic analysis methods are used,along with targeted modeling of the actual hardware system.This literature review focuses on calculating WCET for multi-core processors,providing a survey of traditional methods used for static and dynamic analysis and highlighting the major challenges that arise from different program execution scenarios on multi-core platforms.This paper outlines the strengths and weaknesses of current methodologies and offers insights into prospective areas of research on multi-core analysis.By presenting a comprehensive analysis of the current state of research on multi-core processor analysis for WCET estimation,this review aims to serve as a valuable resource for researchers and practitioners in the field.展开更多
微处理器芯片的生态建设是高端装备与智能微系统自主、可控的关键,尽管国产数字信号处理(digital signal processing, DSP)器件及其相关开发应用技术近年来得到了一定的发展,但与需求仍存在较大差距。在主动噪声控制领域,前馈型多通道...微处理器芯片的生态建设是高端装备与智能微系统自主、可控的关键,尽管国产数字信号处理(digital signal processing, DSP)器件及其相关开发应用技术近年来得到了一定的发展,但与需求仍存在较大差距。在主动噪声控制领域,前馈型多通道控制方案比单通道有较大的控制范围和较好的性能,但对系统的运算能力有较高的要求。文章以多通道FxLMS算法为基础,对多通道降噪系统的运算量进行了分析,依据国产DSP开发板的电路结构,设计了控制系统方案,并进行了实验研究。实验表明,所设计的噪声控制系统运算效率较ARM作为运算器提高了80%,对100~1 000 Hz内的周期性噪声信号衰减达到15~20 dB,证明了该方案的正确性。展开更多
The developments of multi-core systems(MCS)have considerably improved the existing technologies in thefield of computer architecture.The MCS comprises several processors that are heterogeneous for resource capacities,...The developments of multi-core systems(MCS)have considerably improved the existing technologies in thefield of computer architecture.The MCS comprises several processors that are heterogeneous for resource capacities,working environments,topologies,and so on.The existing multi-core technology unlocks additional research opportunities for energy minimization by the use of effective task scheduling.At the same time,the task scheduling process is yet to be explored in the multi-core systems.This paper presents a new hybrid genetic algorithm(GA)with a krill herd(KH)based energy-efficient scheduling techni-que for multi-core systems(GAKH-SMCS).The goal of the GAKH-SMCS tech-nique is to derive scheduling tasks in such a way to achieve faster completion time and minimum energy dissipation.The GAKH-SMCS model involves a multi-objectivefitness function using four parameters such as makespan,processor utilization,speedup,and energy consumption to schedule tasks proficiently.The performance of the GAKH-SMCS model has been validated against two datasets namely random dataset and benchmark dataset.The experimental outcome ensured the effectiveness of the GAKH-SMCS model interms of makespan,pro-cessor utilization,speedup,and energy consumption.The overall simulation results depicted that the presented GAKH-SMCS model achieves energy effi-ciency by optimal task scheduling process in MCS.展开更多
Modern shared-memory multi-core processors typically have shared Level 2(L2)or Level 3(L3)caches.Cache bottlenecks and replacement strategies are the main problems of such architectures,where multiple cores try to acc...Modern shared-memory multi-core processors typically have shared Level 2(L2)or Level 3(L3)caches.Cache bottlenecks and replacement strategies are the main problems of such architectures,where multiple cores try to access the shared cache simultaneously.The main problem in improving memory performance is the shared cache architecture and cache replacement.This paper documents the implementation of a Dual-Port Content Addressable Memory(DPCAM)and a modified Near-Far Access Replacement Algorithm(NFRA),which was previously proposed as a shared L2 cache layer in a multi-core processor.Standard Performance Evaluation Corporation(SPEC)Central Processing Unit(CPU)2006 benchmark workloads are used to evaluate the benefit of the shared L2 cache layer.Results show improved performance of the multicore processor’s DPCAM and NFRA algorithms,corresponding to a higher number of concurrent accesses to shared memory.The new architecture significantly increases system throughput and records performance improvements of up to 8.7%on various types of SPEC 2006 benchmarks.The miss rate is also improved by about 13%,with some exceptions in the sphinx3 and bzip2 benchmarks.These results could open a new window for solving the long-standing problems with shared cache in multi-core processors.展开更多
文摘Frequent data exchange among all kinds of memories has become an inevitable phenomenon in the process of modern embeddedsoftware design. In order to improve the ability of the embedded system data's throughput and computation, most embeddeddevices introduce Enhanced Direct Memory Access (EDMA) data transfer technology. TMS320C6678 is a multi-core DSPproduced by Texas Instruments (TI). There are ten EDMA transmission controllers in the chip for configuration and datatransmissions are allowed to be performed between any two pieces of storage at the same time. This paper expounds the workingmechanism of EDMA based on multi-core DSP TMS320C6678. At the same time, multiple data sets are provided and thebottleneck of limiting data throughout is analyzed and solved.
基金supported by ZTE Industry-University-Institute Cooperation Funds under Grant No.2022ZTE09.
文摘Real-time system timing analysis is crucial for estimating the worst-case execution time(WCET)of a program.To achieve this,static or dynamic analysis methods are used,along with targeted modeling of the actual hardware system.This literature review focuses on calculating WCET for multi-core processors,providing a survey of traditional methods used for static and dynamic analysis and highlighting the major challenges that arise from different program execution scenarios on multi-core platforms.This paper outlines the strengths and weaknesses of current methodologies and offers insights into prospective areas of research on multi-core analysis.By presenting a comprehensive analysis of the current state of research on multi-core processor analysis for WCET estimation,this review aims to serve as a valuable resource for researchers and practitioners in the field.
基金supported by Taif University Researchers Supporting Program(Project Number:TURSP-2020/195)Taif University,Saudi Arabia.Princess Nourah bint Abdulrahman University Researchers Supporting Project number(PNURSP2022R203)Princess Nourah bint Abdulrahman University,Riyadh,Saudi Arabia.
文摘The developments of multi-core systems(MCS)have considerably improved the existing technologies in thefield of computer architecture.The MCS comprises several processors that are heterogeneous for resource capacities,working environments,topologies,and so on.The existing multi-core technology unlocks additional research opportunities for energy minimization by the use of effective task scheduling.At the same time,the task scheduling process is yet to be explored in the multi-core systems.This paper presents a new hybrid genetic algorithm(GA)with a krill herd(KH)based energy-efficient scheduling techni-que for multi-core systems(GAKH-SMCS).The goal of the GAKH-SMCS tech-nique is to derive scheduling tasks in such a way to achieve faster completion time and minimum energy dissipation.The GAKH-SMCS model involves a multi-objectivefitness function using four parameters such as makespan,processor utilization,speedup,and energy consumption to schedule tasks proficiently.The performance of the GAKH-SMCS model has been validated against two datasets namely random dataset and benchmark dataset.The experimental outcome ensured the effectiveness of the GAKH-SMCS model interms of makespan,pro-cessor utilization,speedup,and energy consumption.The overall simulation results depicted that the presented GAKH-SMCS model achieves energy effi-ciency by optimal task scheduling process in MCS.
文摘Modern shared-memory multi-core processors typically have shared Level 2(L2)or Level 3(L3)caches.Cache bottlenecks and replacement strategies are the main problems of such architectures,where multiple cores try to access the shared cache simultaneously.The main problem in improving memory performance is the shared cache architecture and cache replacement.This paper documents the implementation of a Dual-Port Content Addressable Memory(DPCAM)and a modified Near-Far Access Replacement Algorithm(NFRA),which was previously proposed as a shared L2 cache layer in a multi-core processor.Standard Performance Evaluation Corporation(SPEC)Central Processing Unit(CPU)2006 benchmark workloads are used to evaluate the benefit of the shared L2 cache layer.Results show improved performance of the multicore processor’s DPCAM and NFRA algorithms,corresponding to a higher number of concurrent accesses to shared memory.The new architecture significantly increases system throughput and records performance improvements of up to 8.7%on various types of SPEC 2006 benchmarks.The miss rate is also improved by about 13%,with some exceptions in the sphinx3 and bzip2 benchmarks.These results could open a new window for solving the long-standing problems with shared cache in multi-core processors.