期刊文献+
共找到156篇文章
< 1 2 8 >
每页显示 20 50 100
Optimization Task Scheduling Using Cooperation Search Algorithm for Heterogeneous Cloud Computing Systems 被引量:1
1
作者 Ahmed Y.Hamed M.Kh.Elnahary +1 位作者 Faisal S.Alsubaei Hamdy H.El-Sayed 《Computers, Materials & Continua》 SCIE EI 2023年第1期2133-2148,共16页
Cloud computing has taken over the high-performance distributed computing area,and it currently provides on-demand services and resource polling over the web.As a result of constantly changing user service demand,the ... Cloud computing has taken over the high-performance distributed computing area,and it currently provides on-demand services and resource polling over the web.As a result of constantly changing user service demand,the task scheduling problem has emerged as a critical analytical topic in cloud computing.The primary goal of scheduling tasks is to distribute tasks to available processors to construct the shortest possible schedule without breaching precedence restrictions.Assignments and schedules of tasks substantially influence system operation in a heterogeneous multiprocessor system.The diverse processes inside the heuristic-based task scheduling method will result in varying makespan in the heterogeneous computing system.As a result,an intelligent scheduling algorithm should efficiently determine the priority of every subtask based on the resources necessary to lower the makespan.This research introduced a novel efficient scheduling task method in cloud computing systems based on the cooperation search algorithm to tackle an essential task and schedule a heterogeneous cloud computing problem.The basic idea of thismethod is to use the advantages of meta-heuristic algorithms to get the optimal solution.We assess our algorithm’s performance by running it through three scenarios with varying numbers of tasks.The findings demonstrate that the suggested technique beats existingmethods NewGenetic Algorithm(NGA),Genetic Algorithm(GA),Whale Optimization Algorithm(WOA),Gravitational Search Algorithm(GSA),and Hybrid Heuristic and Genetic(HHG)by 7.9%,2.1%,8.8%,7.7%,3.4%respectively according to makespan. 展开更多
关键词 heterogeneous processors cooperation search algorithm task scheduling cloud computing
下载PDF
Shared Cache Based on Content Addressable Memory in a Multi-Core Architecture
2
作者 Allam Abumwais Mahmoud Obaid 《Computers, Materials & Continua》 SCIE EI 2023年第3期4951-4963,共13页
Modern shared-memory multi-core processors typically have shared Level 2(L2)or Level 3(L3)caches.Cache bottlenecks and replacement strategies are the main problems of such architectures,where multiple cores try to acc... Modern shared-memory multi-core processors typically have shared Level 2(L2)or Level 3(L3)caches.Cache bottlenecks and replacement strategies are the main problems of such architectures,where multiple cores try to access the shared cache simultaneously.The main problem in improving memory performance is the shared cache architecture and cache replacement.This paper documents the implementation of a Dual-Port Content Addressable Memory(DPCAM)and a modified Near-Far Access Replacement Algorithm(NFRA),which was previously proposed as a shared L2 cache layer in a multi-core processor.Standard Performance Evaluation Corporation(SPEC)Central Processing Unit(CPU)2006 benchmark workloads are used to evaluate the benefit of the shared L2 cache layer.Results show improved performance of the multicore processor’s DPCAM and NFRA algorithms,corresponding to a higher number of concurrent accesses to shared memory.The new architecture significantly increases system throughput and records performance improvements of up to 8.7%on various types of SPEC 2006 benchmarks.The miss rate is also improved by about 13%,with some exceptions in the sphinx3 and bzip2 benchmarks.These results could open a new window for solving the long-standing problems with shared cache in multi-core processors. 展开更多
关键词 multi-core processor shared cache content addressable memory dual port CAM replacement algorithm benchmark program
下载PDF
Parallel Processing Design for LTE PUSCH Demodulation and Decoding Based on Multi-Core Processor
3
作者 Zhang Ziran,Li Jun,Li Changxiao(ZTE Corporation,Shenzhen 518057,P.R.China) 《ZTE Communications》 2009年第1期54-58,共5页
The Long Term Evolution (LTE) system imposes high requirements for dispatching delay.Moreover,very large air interface rate of LTE requires good processing capability for the devices processing the baseband signals.Co... The Long Term Evolution (LTE) system imposes high requirements for dispatching delay.Moreover,very large air interface rate of LTE requires good processing capability for the devices processing the baseband signals.Consequently,the single-core processor cannot meet the requirements of LTE system.This paper analyzes how to use multi-core processors to achieve parallel processing of uplink demodulation and decoding in LTE systems and designs an approach to parallel processing.The test results prove that this approach works quite well. 展开更多
关键词 CORE LTE Parallel Processing Design for LTE PUSCH Demodulation and Decoding Based on multi-core processor Design
下载PDF
基于Amdahl定律的异构多核密码处理器能效模型研究
4
作者 李伟 郎俊豪 +1 位作者 陈韬 南龙梅 《电子学报》 EI CAS CSCD 北大核心 2024年第3期849-862,共14页
边缘计算安全的资源受限特征及各种新型密码技术的应用,对多核密码处理器的高能效、异构性提出需求,但当前尚缺乏相关的异构多核能效模型研究.本文基于扩展Amdahl定律,引入密码串并特征、异构多核结构、数据准备时间、动态电压频率调节... 边缘计算安全的资源受限特征及各种新型密码技术的应用,对多核密码处理器的高能效、异构性提出需求,但当前尚缺乏相关的异构多核能效模型研究.本文基于扩展Amdahl定律,引入密码串并特征、异构多核结构、数据准备时间、动态电压频率调节等因素,将核划分空闲、活跃状态,建立异构多核密码处理器的能效模型.MATLAB仿真结果表明,数据准备时间占比小于10%时,对能效的负面影响大幅下降;固定电压,频率缩放会影响能效值大小;处理器核空闲/活跃能耗比例越小,能效值越大.架构上,固定异构核,同构核数量与密码任务最大并行度相等时能效值最大,最佳异构核数可由模型变化参数仿真得到;多任务调度执行上,流水与并发执行有利于能效值的进一步提升.多核密码处理器芯片板级测试结果表明,仿真结果与实测数据相关系数接近1,芯片实测的数据准备时间、电压频率缩放等因素的影响与仿真分析基本一致,验证了所提能效模型的有效性.该文重点从影响能效变化趋势因素上,为多核密码处理器异构、高能效设计提供一定的理论分析基础与建议. 展开更多
关键词 密码处理器 多核处理器 异构 AMDAHL定律 能效模型
下载PDF
一种基于异构处理器的可动态布署设计与实现
5
作者 钱宏文 陈光威 《电子技术应用》 2024年第1期93-100,共8页
针对卫星支持的多种生活服务需求实时切换、资源灵活智能调用需求,基于无线广域信号服务异构处理器,设计了一种即时高效、动态切换部署处理器功能的方案。通过对大资源FPGA及多片8核DSP多种功能定制结合动态部署设计,实现实时动态可重... 针对卫星支持的多种生活服务需求实时切换、资源灵活智能调用需求,基于无线广域信号服务异构处理器,设计了一种即时高效、动态切换部署处理器功能的方案。通过对大资源FPGA及多片8核DSP多种功能定制结合动态部署设计,实现实时动态可重构处理器系统功能,将5种FPGA应用结合2种DSP应用程序动态组合,配合各功能任务架构需求重建控制、数据链路,完成多任务智能切换。 展开更多
关键词 异构处理器 动态部署 可重构 FPGA DSP
下载PDF
面向申威众核处理器的规则处理优化技术
6
作者 张振东 王彤 刘鹏 《计算机研究与发展》 EI CSCD 北大核心 2024年第1期66-85,共20页
高性能口令恢复系统是申威众核处理器的重要应用场景之一,规则处理是主流口令恢复工具中被广泛应用的一种口令生成方式.现有相关研究工作缺少对规则处理算法的优化,导致申威处理器上基于规则的口令生成速度成为口令恢复系统的性能瓶颈.... 高性能口令恢复系统是申威众核处理器的重要应用场景之一,规则处理是主流口令恢复工具中被广泛应用的一种口令生成方式.现有相关研究工作缺少对规则处理算法的优化,导致申威处理器上基于规则的口令生成速度成为口令恢复系统的性能瓶颈.通过分析规则处理算法的多层次可并行性,提出了面向申威众核处理器的线程级、数据级优化方案.在线程级优化方案中,探索了规则处理算法的最优任务映射方式,设计了主从核任务分配机制、从核缓冲区配比优化机制、负载均衡机制、变长规则存储机制等技术以提高并行效率;在数据级优化方案中,分析了规则处理算法中规则函数的计算模式,并通过申威SIMD指令集对规则函数进行向量优化以提高执行效率.在SW26010处理器上的实验结果表明,上述优化方案有效解除了规则处理的性能瓶颈,使规则模式下的口令恢复速度提升了30~101倍. 展开更多
关键词 申威众核处理器 口令恢复 规则处理 异构计算 单指令多数据流
下载PDF
适用于S-NUCA异构处理器的任务调度与热管理系统
7
作者 周义涛 李阳 +3 位作者 韩超 赵玉来 汪玲 李建华 《计算机工程》 CAS CSCD 北大核心 2024年第2期196-205,共10页
异构多核处理器凭借其高性能、低功耗和广泛的应用场景而成为当前计算机平台的主流方案,且大容量的非均匀缓存架构(S-NUCA)具有较低的平均访问时间。然而,不断上升的晶体管规模给异构多核处理器的资源调度和功耗控制带来挑战,传统的调... 异构多核处理器凭借其高性能、低功耗和广泛的应用场景而成为当前计算机平台的主流方案,且大容量的非均匀缓存架构(S-NUCA)具有较低的平均访问时间。然而,不断上升的晶体管规模给异构多核处理器的资源调度和功耗控制带来挑战,传统的调度算法在面对基于S-NUCA的多核处理器时忽略了核心之间的缓存访问延迟,且传统热管理方案只提供芯片级功率约束,容易使得系统因核心使用率降低而造成性能下降。为此,提出一种适用于S-NUCA异构多核系统、满足热安全约束的动态线程调度机制TSCDM。利用基于动态每周期指令(IPC)值的阶段检测技术,并基于人工神经网络预测线程的IPC值,以获取线程与核心类型的最佳绑定关系,依据S-NUCA缓存特性获得最优映射和基于任务分类的任务迁移策略。在此基础上,TSCDM基于片上热模型为每个核心实时分配功率预算。在HotSniper上运行SPLASH-2性能测试套件进行实验,结果表明,相较于传统调度方案与基于机器学习的调度方案,TSCDM在加速比和资源利用率上均表现出优势,TSCDM中使用的基于瞬态温度的安全功率算法相比传统热安全功率算法能够降低核心热余量,同时处理器的全频段均有更高的能效比。 展开更多
关键词 异构多核处理器 人工神经网络 线程调度 阶段检测 热安全功率
下载PDF
Schedule refinement for homogeneous multi-core processors in the presence of manufacturing-caused heterogeneity
8
作者 Zhi-xiang CHEN Zhao-lin LI +2 位作者 Shan CAO Fang WANG Jie ZHOU 《Frontiers of Information Technology & Electronic Engineering》 SCIE EI CSCD 2015年第12期1018-1033,共16页
Multi-core homogeneous processors have been widely used to deal with computation-intensive embedded applications. However, with the continuous down scaling of CMOS technology, within-die variations in the manufacturin... Multi-core homogeneous processors have been widely used to deal with computation-intensive embedded applications. However, with the continuous down scaling of CMOS technology, within-die variations in the manufacturing process lead to a significant spread in the operating speeds of cores within homogeneous multi-core processors. Task scheduling approaches, which do not consider such heterogeneity caused by within-die variations,can lead to an overly pessimistic result in terms of performance. To realize an optimal performance according to the actual maximum clock frequencies at which cores can run, we present a heterogeneity-aware schedule refining(HASR) scheme by fully exploiting the heterogeneities of homogeneous multi-core processors in embedded domains.We analyze and show how the actual maximum frequencies of cores are used to guide the scheduling. In the scheme,representative chip operating points are selected and the corresponding optimal schedules are generated as candidate schedules. During the booting of each chip, according to the actual maximum clock frequencies of cores, one of the candidate schedules is bound to the chip to maximize the performance. A set of applications are designed to evaluate the proposed scheme. Experimental results show that the proposed scheme can improve the performance by an average value of 22.2%, compared with the baseline schedule based on the worst case timing analysis. Compared with the conventional task scheduling approach based on the actual maximum clock frequencies, the proposed scheme also improves the performance by up to 12%. 展开更多
关键词 Schedule refining multi-core processor heterogenEITY Representative chip operating point
原文传递
YHFT-QDSP:High-Performance Heterogeneous Multi-Core DSP
9
作者 陈书明 万江华 +8 位作者 鲁建壮 刘仲 孙海燕 孙永节 刘衡竹 刘祥远 李振涛 徐毅 陈小文 《Journal of Computer Science & Technology》 SCIE EI CSCD 2010年第2期214-224,共11页
Multi-core architectures are widely used to in time-to-market and power consumption of the chips enhance the microprocessor performance within a limited increase Toward the application of high-density data signal pro... Multi-core architectures are widely used to in time-to-market and power consumption of the chips enhance the microprocessor performance within a limited increase Toward the application of high-density data signal processing, this paper presents a novel heterogeneous multi-core architecture digital signal processor (DSP), YHFT-QDSP, with one RISC CPU core and 4 VLIW DSP cores. By three kinds of interconnection, YHFT-QDSP provides high efficiency message communication for inner-chip RISC core and DSP cores, inner-chip and inter-chip DSP cores. A parallel programming platform is specifically developed for the heterogeneous nmlti-core architecture of YHFT-QDSP. This parallel programming environment provides a parallel support library and a friendly interface between high level application softwares and multi- core DSP. The 130 nm CMOS custom chip design results benchmarks show that the interconnection structure of in a high speed and moderate power design. The results of typical YHFT-QDSP is much better than other related structures and achieves better speedup when using the interconnection facilities in combing methods. YHFT-QDSP has been signed off and manufactured presently. The future applications of the multi-core chip could be found in 3G wireless base station, high performance radar, industrial applications, and so on. 展开更多
关键词 digital signal processor (DSP) multi-core ARCHITECTURE parallel programming custom design
原文传递
基于麻雀搜索算法的异构多核处理器任务调度
10
作者 程小辉 童辉辉 康燕萍 《计算机应用与软件》 北大核心 2023年第4期211-216,共6页
为满足应用程序的多样性需求,提高异构多核环境下的任务调度效率,基于麻雀搜索算法(Sparrow Search Algorithm,SSA),提出一种新的异构多核处理器任务调度算法。该问题是以执行任务完成的时间最短为目标,并使用SSA对其优化。根据任务优... 为满足应用程序的多样性需求,提高异构多核环境下的任务调度效率,基于麻雀搜索算法(Sparrow Search Algorithm,SSA),提出一种新的异构多核处理器任务调度算法。该问题是以执行任务完成的时间最短为目标,并使用SSA对其优化。根据任务优先权规则,设计任务分配编码方案,将麻雀搜索空间映射到离散空间,使麻雀搜索算法更能适用于离散的异构多核任务调度问题研究上。实验表明,SSA寻优能力强、收敛速度快、性能好。与目前应用广泛的GA和IPSO相比较,其执行时间分别缩短21.48%和17.52%。在异构多核处理器任务调度领域中具有良好的研究意义,应用前景十分广泛。 展开更多
关键词 异构多核处理器 任务调度 麻雀搜索算法
下载PDF
面向申威架构的KNN并行算法实现与优化 被引量:2
11
作者 王其涵 庞建民 +3 位作者 岳峰 祝迪 沈莉 肖谦 《计算机工程》 CAS CSCD 北大核心 2023年第5期286-294,共9页
K近邻(KNN)是人工智能中最常用的分类算法,其性能提升对于海量数据的整理分析、大数据分类等任务具有重要意义。目前新一代神威超级计算机正处于应用发展的初始阶段,结合新一代申威异构众核处理器的结构特性,充分利用庞大的计算资源实... K近邻(KNN)是人工智能中最常用的分类算法,其性能提升对于海量数据的整理分析、大数据分类等任务具有重要意义。目前新一代神威超级计算机正处于应用发展的初始阶段,结合新一代申威异构众核处理器的结构特性,充分利用庞大的计算资源实现高效的KNN算法是海量数据分析整理的现实需求。根据SW26010pro处理器的结构特性,采用主从加速编程模型实现一种基础版本的KNN并行算法,其将计算核心传输到从核上,实现了线程级并行。分析影响基础并行算法性能的关键因素并提出优化算法SWKNN,不同于基础并行KNN算法的任务划分方式,SWKNN采用任务重划分策略,以避免冗余计算开销。通过数据流水优化、从核间通信优化、二次负载均衡优化等步骤减少不必要的通信开销,从而有效缓解访存压力并进一步提升算法性能。实验结果表明,与串行KNN算法相比,面向申威架构的基础并行KNN算法在SW26010pro处理器的单核组上可以获得最高48倍的加速效果,在同等数据规模下,SWKNN算法较基础并行KNN算法又可以获得最高399倍的加速效果。 展开更多
关键词 异构众核处理器 K近邻算法 并行计算 算法优化 分类性能
下载PDF
面向高可靠汽车电子系统的低延时异构多核并行差错检测方法
12
作者 吕浙帆 王天成 李华伟 《计算机辅助设计与图形学学报》 EI CSCD 北大核心 2023年第11期1789-1801,共13页
与业界常用的双核锁步方法相比,异构并行差错检测技术以较小的面积开销实现接近的差错覆盖率,但是会增加差错检测延时并影响主核的性能.针对差错检测不及时带来的潜在安全风险,提出一种低延时的异构并行差错检测方法.首先通过复制寄存... 与业界常用的双核锁步方法相比,异构并行差错检测技术以较小的面积开销实现接近的差错覆盖率,但是会增加差错检测延时并影响主核的性能.针对差错检测不及时带来的潜在安全风险,提出一种低延时的异构并行差错检测方法.首先通过复制寄存器时暂停物理寄存器释放的策略降低复制寄存器对主核性能的影响;然后利用主核控制流指导检查核取指,并基于预测检查核运行时间来划分程序段,以提升差错检测的性能,使得最大差错检测延时可控.使用1个开源香山处理器核作为主核,16个开源Rocket处理器作为检查核进行了方法实现,采用基准程序评估的实验结果表明,所提方法能够以50%的逻辑开销和22%的存储开销实现差错检测,小于双核锁步接近100%的面积开销.同时,在主核上的平均性能开销小于1%,且能将差错检测延迟控制在2000个时钟周期以内.此外,与原有分支预测策略相比,检查核的平均性能提升了14.9%. 展开更多
关键词 差错检测 容错 可靠性 异构处理器 锁步
下载PDF
Cooperative Computing Techniques for a Deeply Fused and Heterogeneous Many-Core Processor Architecture 被引量:13
13
作者 郑方 李宏亮 +3 位作者 吕晖 过锋 许晓红 谢向辉 《Journal of Computer Science & Technology》 SCIE EI CSCD 2015年第1期145-162,共18页
Due to advances in semiconductor techniques, many-core processors have been widely used in high performance computing. However, many applications still cannot be carried out efficiently due to the memory wall, which h... Due to advances in semiconductor techniques, many-core processors have been widely used in high performance computing. However, many applications still cannot be carried out efficiently due to the memory wall, which has become a bottleneck in many-core processors. In this paper, we present a novel heterogeneous many-core processor architecture named deeply fused many-core (DFMC) for high performance computing systems. DFMC integrates management processing ele- ments (MPEs) and computing processing elements (CPEs), which are heterogeneous processor cores for different application features with a unified ISA (instruction set architecture), a unified execution model, and share-memory that supports cache coherence. The DFMC processor can alleviate the memory wall problem by combining a series of cooperative computing techniques of CPEs, such as multi-pattern data stream transfer, efficient register-level communication mechanism, and fast hardware synchronization technique. These techniques are able to improve on-chip data reuse and optimize memory access performance. This paper illustrates an implementation of a full system prototype based on FPGA with four MPEs and 256 CPEs. Our experimental results show that the effect of the cooperative computing techniques of CPEs is significant, with DGEMM (double-precision matrix multiplication) achieving an efficiency of 94%, FFT (fast Fourier transform) obtaining a performance of 207 GFLOPS and FDTD (finite-difference time-domain) obtaining a performance of 27 GFLOPS. 展开更多
关键词 heterogeneous many-core processor data stream transfer register-level communication mechanism hardwaresynchronization technique processor prototype
原文传递
申威处理器上数据流运行时系统的设计与实现
14
作者 张鹏飞 陈俊仕 +3 位作者 郑重 沈沛祺 安虹 许乐 《计算机工程》 CAS CSCD 北大核心 2023年第12期46-54,共9页
我国自主研发的新一代神威异构众核计算平台主要采用athread异构编程方法,athread异构编程属于大同步并行模型,难以充分挖掘程序中的细粒度并行性,其采用的同步方式难以实现众核上的任务负载均衡。数据流并行编程模型因其天然并行性、... 我国自主研发的新一代神威异构众核计算平台主要采用athread异构编程方法,athread异构编程属于大同步并行模型,难以充分挖掘程序中的细粒度并行性,其采用的同步方式难以实现众核上的任务负载均衡。数据流并行编程模型因其天然并行性、点对点同步的特点能够很好地解决上述问题。基于Codelet程序执行模型和申威主从核架构特点,设计并实现面向申威处理器的数据流运行时系统swTasklet,通过对Codelet功能的进一步细化和对Codelet机器模型到主从核的映射,避免从核阵列上的同步操作,减少同步开销;由主核完成从核计算任务的调度分配,将计算和同步操作分离,保证运行时系统可以和从核计算库的共用。实验以NPB LU程序和向量-向量加作为测试用例,采用相同的优化方法分别对swTasklet和athread实现进行并行化。实验结果表明:在规模较大情况下,LU程序的swTasklet实现版本比athread版本快16%,向量-向量加swTasklet实现版本比athread版本快1倍;使用swTasklet实现的LU并行版本较主核本取得了平均8倍以上的加速,而向量-向量加swTasklet版本较主核版本取得30倍左右的加速。 展开更多
关键词 申威异构处理器 数据流运行时系统 Codelet程序执行模型 并行编程模型 众核加速
下载PDF
嵌入式异构智能计算系统并行多流水线设计
15
作者 赵二虎 吴济文 +2 位作者 肖思莹 晋振杰 徐勇军 《电子学报》 EI CAS CSCD 北大核心 2023年第11期3354-3364,共11页
嵌入式智能计算系统因其功耗受限和多传感器实时智能处理需要,对硬件平台的智能算力能效比和智能计算业务并行度提出了严峻挑战.传统嵌入式计算系统常采用的DSP+FPGA数字信号处理架构,无法适用于多个神经网络模型加速场景.本文基于ARM+D... 嵌入式智能计算系统因其功耗受限和多传感器实时智能处理需要,对硬件平台的智能算力能效比和智能计算业务并行度提出了严峻挑战.传统嵌入式计算系统常采用的DSP+FPGA数字信号处理架构,无法适用于多个神经网络模型加速场景.本文基于ARM+DLP+SRIO嵌入式异构智能计算架构,利用智能处理器多片多核多内存通道特性,提出了并行多流水线设计方法.该方法充分考虑智能计算业务中数据传输、拷贝、推理、结果反馈等环节时间开销,为不同的神经网络模型合理分配智能算力资源,以达到最大的端到端智能计算业务吞吐率.实验结果表明,采用并行多流水线设计方法的深度学习处理器利用率较单流水线平均提高约25.2%,较无流水线平均提高约30.7%,满足可见光、红外、SAR等多模图像实时智能处理需求,具有实际应用价值. 展开更多
关键词 嵌入式智能计算系统 异构计算架构 神经网络模型 并行多流水线 深度学习处理器
下载PDF
失速告警系统应用异构双核处理器的安全性分析研究
16
作者 宣晓刚 魏璐达 +2 位作者 贾少龙 杨飞 张美仙 《计算机测量与控制》 2023年第6期137-142,共6页
飞机失速会影响飞机的飞行安全,失速告警计算机作为失速告警系统的核心控制部件,在失速发生前通过灯光告警、语音告警、振杆器抖动等方式为飞行员提供告警,提醒驾驶员进行操作,避免飞机进入失速状态;按照SAE ARP4754A中研制保证等级的分... 飞机失速会影响飞机的飞行安全,失速告警计算机作为失速告警系统的核心控制部件,在失速发生前通过灯光告警、语音告警、振杆器抖动等方式为飞行员提供告警,提醒驾驶员进行操作,避免飞机进入失速状态;按照SAE ARP4754A中研制保证等级的分类,将失速告警计算机某些功能确定为灾难级,确定其研制保证等级定级为A类;采用异构双核处理器进行失速告警计算机的设计,由于ARP4761中的分析方法对相似性设计有着复杂性和难以模拟仿真的问题,故文章设计参照了IEC 61508参考标准,对采用异构双核处理器的失速告警计算机的安全性能进行了梳理和分析;分析研究结果表明,相比于采用传统的单核处理器或同构双核处理器设计的失速告警计算机,选择异构双核处理器进行失速告警计算机的设计有其独有的优势,其优势在于异构双核处理器所具备的“1oo2D”结构,通过计算分析满足失速告警计算机对于高安全性、高可靠性的要求;依照IEC 61508相关标准,结合失速告警计算机的高性能要求,选择正确的分析设计路径,可以确保失速告警计算机的功能安全完整性等级有效达成,为其他航空产品的设计开发提供参考。 展开更多
关键词 飞机失速 失速告警系统 失速告警计算机 异构双核处理器 IEC 61508 ARP 4761 安全性分析
下载PDF
A perceptual and predictive batch-processing memory scheduling strategy for a CPU-GPU heterogeneous system
17
作者 Juan FANG Sheng LIN +2 位作者 Huijing YANG Yixiang XU Xing SU 《Frontiers of Information Technology & Electronic Engineering》 SCIE EI CSCD 2023年第7期994-1006,共13页
When multiple central processing unit(CPU)cores and integrated graphics processing units(GPUs)share off-chip main memory,CPU and GPU applications compete for the critical memory resource.This causes serious resource c... When multiple central processing unit(CPU)cores and integrated graphics processing units(GPUs)share off-chip main memory,CPU and GPU applications compete for the critical memory resource.This causes serious resource competition and has a negative impact on the overall performance of the system.We describe the competition for shared-memory resources in a CPU-GPU heterogeneous multi-core architecture,and a sharedmemory request scheduling strategy based on perceptual and predictive batch-processing is proposed.By sensing the CPU and GPU memory request conditions in the request buffer,the proposed scheduling strategy estimates the GPU latency tolerance and reduces mutual interference between CPU and GPU by processing CPU or GPU memory requests in batches.According to the simulation results,the scheduling strategy improves CPU performance by8.53%and reduces mutual interference by 10.38%with low hardware complexity. 展开更多
关键词 CPU-GPU heterogeneous multi-core Unified memory Access scheduling
原文传递
System Architecture of Godson-3 Multi-Core Processors 被引量:7
18
作者 高翔 陈云霁 +2 位作者 王焕东 唐丹 胡伟武 《Journal of Computer Science & Technology》 SCIE EI CSCD 2010年第2期181-191,共11页
Godson-3 is the latest generation of Godson microprocessor family. It takes a scalable multi-core architecture with hardware support for accelerating applications including X86 emulation and signal processing. This pa... Godson-3 is the latest generation of Godson microprocessor family. It takes a scalable multi-core architecture with hardware support for accelerating applications including X86 emulation and signal processing. This paper introduces the system architecture of Godson-3 from various aspects including system scalability, organization of memory hierarchy, network-on-chip, inter-chip connection and I/O subsystem. 展开更多
关键词 multi-core processor scalable interconnection cache coherent non-uniform memory access/non-uniform cache access (CC-NUMA/NUCA) MESH CROSSBAR cache coherence reliability availability and serviceability (RAS)
原文传递
Parallel computing of discrete element method on multi-core processors 被引量:6
19
作者 Yusuke Shigeto Mikio Sakai 《Particuology》 SCIE EI CAS CSCD 2011年第4期398-405,共8页
This paper describes parallel simulation techniques for the discrete element method (DEM) on multi-core processors. Recently, multi-core CPU and GPU processors have attracted much attention in accelerating computer ... This paper describes parallel simulation techniques for the discrete element method (DEM) on multi-core processors. Recently, multi-core CPU and GPU processors have attracted much attention in accelerating computer simulations in various fields. We propose a new algorithm for multi-thread parallel computation of DEM, which makes effective use of the available memory and accelerates the computation. This study shows that memory usage is drastically reduced by using this algorithm. To show the practical use of DEM in industry, a large-scale powder system is simulated with a complicated drive unit. We compared the performance of the simulation between the latest GPU and CPU processors with optimized programs for each processor. The results show that the difference in performance is not substantial when using either GPUs or CPUs with a multi-thread parallel algorithm. In addition, DEM algorithm is shown to have high scalabilitv in a multi-thread parallel computation on a CPU. 展开更多
关键词 Discrete element method Parallel computing multi-core processor GPGPU
原文传递
基于异构处理器的新能源汽车车载终端的研究与设计
20
作者 刘佳 徐洋 戈学凯 《汽车零部件》 2023年第10期79-83,共5页
基于异构处理器STM32MP157设计新能源汽车车载终端,并搭载4G无线网络模块,实现数据的远程无线传输。文中异构处理器核心按功能定义可划分为实时核心和应用核心。其中,实时核心的主要功能是采集控制器局域网总线(controller area network... 基于异构处理器STM32MP157设计新能源汽车车载终端,并搭载4G无线网络模块,实现数据的远程无线传输。文中异构处理器核心按功能定义可划分为实时核心和应用核心。其中,实时核心的主要功能是采集控制器局域网总线(controller area network,CAN)网络上的原始CAN报文数据,应用核心主要负责CAN报文的解析和上传工作,实时核心和应用核心之间通过处理器内部核间通信控制器、硬件信号量及共享内存等硬件外设实现不同核心之间的数据同步和交换。利用设计的新能源汽车车载终端,可实现对新能源汽车整车生命周期的管理和监控。 展开更多
关键词 车载终端 异构处理器 CAN总线 4G网络
下载PDF
上一页 1 2 8 下一页 到第
使用帮助 返回顶部