期刊文献+
共找到58篇文章
< 1 2 3 >
每页显示 20 50 100
Timed Petri Net Models of Shared-Memory Bus-Based Multiprocessors 被引量:1
1
作者 Wlodek M. Zuberek 《Journal of Computer and Communications》 2018年第10期1-14,共14页
In shared-memory bus-based multiprocessors, when the number of processors grows, the processors spend an increasing amount of time waiting for access to the bus (and shared memory). This contention reduces the perform... In shared-memory bus-based multiprocessors, when the number of processors grows, the processors spend an increasing amount of time waiting for access to the bus (and shared memory). This contention reduces the performance of processors and imposes a limitation of the number of processors that can be used efficiently in bus-based systems. Since the multi-processor’s performance depends upon many parameters which affect the performance in different ways, timed Petri nets are used to model shared-memory bus-based multiprocessors at the instruction execution level, and the developed models are used to study how the performance of processors changes with the number of processors in the system. The results illustrate very well the restriction on the number of processors imposed by the shared bus. All performance characteristics presented in this paper are obtained by discrete-event simulation of Petri net models. 展开更多
关键词 SHARED-MEMORY multiprocessors BUS-BASED multiprocessors TIMED PETRI NETS Discrete-Event Simulation
下载PDF
Fork-Join program response time on multiprocessors with exchangeable join 被引量:1
2
作者 WANG Yong-cai ZHAO Qian-chuan ZHENG Da-zhong 《Journal of Zhejiang University-Science A(Applied Physics & Engineering)》 SCIE EI CAS CSCD 2006年第6期927-936,共10页
The Fork-Join program consisting of K parallel tasks is a useful model for a large number of computing applications. When the parallel processor has multi-channels, later tasks may finish execution earlier than their ... The Fork-Join program consisting of K parallel tasks is a useful model for a large number of computing applications. When the parallel processor has multi-channels, later tasks may finish execution earlier than their earlier tasks and may join with tasks from other programs. This phenomenon is called exchangeable join (EJ), which introduces correlation to the task’s service time. In this work, we investigate the response time of multiprocessor systems with EJ with a new approach. We analyze two aspects of this kind of systems: exchangeable join (EJ) and the capacity constraint (CC). We prove that the system response time can be effectively reduced by EJ, while the reduced amount is constrained by the capacity of the multiprocessor. An upper bound model is constructed based on this analysis and a quick estimation algorithm is proposed. The approximation formula is verified by extensive simulation results, which show that the relative error of approximation is less than 5%. 展开更多
关键词 Exchangeable join First come first served (FCFS) Fork-Join MULTIPROCESSOR Response time
下载PDF
Design of efficient parallel algorithms on shared memory multiprocessors
3
作者 Qiao Xiangzhen (Institute of Computing Technology, Chinese Academg of Science Beijing 100080, P. R. China) 《Wuhan University Journal of Natural Sciences》 CAS 1996年第Z1期344-349,共6页
The design of parallel algorithms is studied in this paper. These algorithms are applicable to shared memory MIMD machines In this paper, the emphasis is put on the methods for design of the efficient parallel algori... The design of parallel algorithms is studied in this paper. These algorithms are applicable to shared memory MIMD machines In this paper, the emphasis is put on the methods for design of the efficient parallel algorithms. The design of efficient parallel algorithms should be based on the following considerationst algorithm parallelism and the hardware-parallelism; granularity of the parallel algorithm, algorithm optimization according to the underling parallel machine. In this paper , these principles are applied to solve a model problem of the PDE. The speedup of the new method is high. The results were tested and evaluated on a shared memory MIMD machine. The practical results were agree with the predicted performance. 展开更多
关键词 parallel algorithm shared memory multiprocessor parallel granularity optimization.
下载PDF
A Resource-Efficient Communication Architecture for Chip Multiprocessors on FPGAs 被引量:1
4
作者 Maggie Swetha Thota 《Journal of Computer Science & Technology》 SCIE EI CSCD 2011年第3期434-447,共14页
Significant advances in field-programmable gate arrays (FPGAs) have made it viable to explore innovative multiprocessor solutions on a single FPGA chip. For multiprocessors, an efficient communication network that m... Significant advances in field-programmable gate arrays (FPGAs) have made it viable to explore innovative multiprocessor solutions on a single FPGA chip. For multiprocessors, an efficient communication network that matches the needs of the target application is always critical to the overall performance. Wormhole packet-switching network-on-chip (NoC) solutions are replacing conventional shared buses to deal with scalability and complexity challenges coming along with the increasing number of processing elements (PEs). However, the quest for high performance networks has led to very complex and resource-expensive NoC designs, leaving little room for the real computing force, i.e., PEs. Moreover, many techniques offer very small performance gains or none at all when network traffic is light while increasing the resource usage of routers. We argue that computation is still the primary task of multiprocessors and sufficient resources should be reserved for PEs. This paper presents our novel design and implementation of a resource-efficient communication network for multiprocessors on FPGAs. We reduce not only the required number of routers for a given number of PEs by introducing a new PE-router topology, but also the resource requirement of each router. Our communication network relies on the NEWS channels to transfer packets in a pipelined fashion following the path determined by the routing network, The implementation results on various Xilinx FPGAs show good performance in the typical range of network load for multiprocessor applications. 展开更多
关键词 chip multiprocessors FPGA network on chip mesh topology resource efficient
原文传递
Efficient Handling of Lock Hand-off in DSM Multiprocessors with Buffering Coherence Controllers 被引量:1
5
作者 Benjamín Sahelices Agustín de Dios +2 位作者 Pablo Ibáez Víctor Vials-Yúfera José María Llabería 《Journal of Computer Science & Technology》 SCIE EI CSCD 2012年第1期75-91,共17页
Synchronization in parallel programs is a major performance bottleneck in multiprocessor systems. Shared data is protected by locks and a lot of time is spent on the competition arising at the lock hand-off. In order ... Synchronization in parallel programs is a major performance bottleneck in multiprocessor systems. Shared data is protected by locks and a lot of time is spent on the competition arising at the lock hand-off. In order to be serialized, requests to the same cache line can either be bounced (NACKed) or buffered in the coherence controller. In this paper, we focus mainly on systems whose coherence controllers buffer requests. In a lock hand-off, a burst of requests to the same line arrive at the coherence controller. During lock hand-off only the requests from the winning processor contribute to progress of the computation, since the winning processor is the only one that will advance the work. This key observation leads us to propose a hardware mechanism we call request bypassing, which allows requests from the winning processor to bypass the requests buffered in the coherence controller keeping the lock line. We present an inexpensive implementation of request bypassing that reduces the time spent on all the execution phases of a critical section (acquiring the lock, accessing shared data, and releasing the lock) and which, as a consequence, speeds up the whole parallel computation. This mechanism requires neither compiler or programmer support nor ISA or coherence protocol changes. By simulating a 32-processor system, we show that using request bypassing does not degrade but rather improves performance in three applications with low synchronization rates, while in those having a large amount of synchronization activity (the remaining four), we see reductions in execution time and in lock stall time ranging from 14% to 39% and from 52% to 7170, respectively. We compare request bypassing with a previously proposed technique called read combining and with a system that bounces requests, observing a significantly lower execution time with the bypassing scheme. Finally, we analyze the sensitivity of our results to some key hardware and software parameters. 展开更多
关键词 distributed shared memory multiprocessors synchronization buffer coherence controller request bypass
原文传递
Granularity Analysis for Exploiting Adaptive Parallelism of Declarative Programs on Multiprocessors
6
作者 田新民 王鼎兴 +2 位作者 沈美明 郑纬民 温冬婵 《Journal of Computer Science & Technology》 SCIE EI CSCD 1994年第2期144-152,共9页
Declarative Programming Languages (DPLs) apply a process model of Horn claun es such as PARLOG[8] or a reduction model of A-calculus such as SML[7] and are) in principle, well suited to multiprocessor implemelltation.... Declarative Programming Languages (DPLs) apply a process model of Horn claun es such as PARLOG[8] or a reduction model of A-calculus such as SML[7] and are) in principle, well suited to multiprocessor implemelltation. However, the performance of a parallel declarative program can be impaired by a mismatch between the parallelism available in an application and the parallelism available in the architecture. A particularly attractive solution is to automatically match the parallelism of the program to the parallelism of the target hardware as a compilation step. In this paper) we present an optimizillg compilation technique called granularity analysis which identi fies and removes excess parallelism that would degrade performance. The main steps are: an analysis of the flow of data to form an attributed call graph between function (or predicate) arguments; and an asymptotic estimation of granularity of a function (or predicate) to generate approximate grain size. Compiled procedure calls can be annotated with grain size and a task scheduler can make scheduling decisions with the classilication scheme of grains to control parallelism at runtime. The resulting granularity analysis scheme is suitable for exploiting adaptive parallelism of declarative programming languages on multiprocessors. 展开更多
关键词 Granularity analysis adaptive parallelism declarative languages grain coalescing grain classification fine-grained tasks coarse-grained tasks multiprocessors
原文传递
Compiling CIL Rewriting Language for Multiprocessors
7
作者 田新民 王鼎兴 +2 位作者 郑纬民 沈美明 李程 《Journal of Computer Science & Technology》 SCIE EI CSCD 1994年第4期302-310,共9页
The highlevel Compiler Intermediate Language CIL is a generalpurpose descripotion language of parallel graph rewriting computational model intended for parallelimplementatioll of declarative languages on multiprocesso... The highlevel Compiler Intermediate Language CIL is a generalpurpose descripotion language of parallel graph rewriting computational model intended for parallelimplementatioll of declarative languages on multiprocessor systems. In this paper, wefirst outline a new Hybrid Execution Model(HEM) and corresponding parallel abstract machine PAM/TGR based on Extended parallel Graph Rewriting ComputationalModel EGRCM for implementing CIL language on distributed memory multiprocessorsystems. Then we focus on the compiling CIL language with various optindzing techniques such as pattern matching, rule indexing, node ordering and compile-time partialscheduling. The experimental results on a 16-node Thansputer Array demonstrates the effectiveness of our model and strategies. 展开更多
关键词 Graph rewriting rewriting language optimizing compiler pattern matching rule indexing rule-group compiling multiprocessors
原文传递
Optimized Parallel Execution of Declarative Programs on Distributed Memory Multiprocessors
8
作者 沈美明 田新民 +2 位作者 王鼎兴 郑纬民 温冬婵 《Journal of Computer Science & Technology》 SCIE EI CSCD 1993年第3期233-242,共10页
In this paper,we focus on the compiling implementation of parallel logic language PARLOG and functional language ML on distributed memory multiprocessors.Under the graph rewriting framework, a Heterogeneous Parallel G... In this paper,we focus on the compiling implementation of parallel logic language PARLOG and functional language ML on distributed memory multiprocessors.Under the graph rewriting framework, a Heterogeneous Parallel Graph Rewriting Execution Model(HPGREM)is presented firstly.Then based on HPGREM,a parallel abstract machine PAM/TGR is described.Furthermore,several optimizing compilation schemes for executing declarative programs on transputer array are proposed. The performance statistics on a transputer array demonstrate the effectiveness of our model,parallel ab- stract machine,optimizing compilation strategies and compiler. 展开更多
关键词 Declarative language parallel graph rewriting execution model optimized parallel compiler distributed memory multiprocessors parallel abstract machine
原文传递
A Comparative Analysis of Different Arbitration Protocols forMultiple-Bus Multiprocessors
9
作者 庄旗铭 蒋定安 杨庆 《Journal of Computer Science & Technology》 SCIE EI CSCD 1996年第3期313-325,共13页
Earlier performance studies of multiple-bus multiprocessor systetns assume a ran-dom selection of competing requests for bus assignment and ignore the effects of realistic bus arbitration schemes on the performance of... Earlier performance studies of multiple-bus multiprocessor systetns assume a ran-dom selection of competing requests for bus assignment and ignore the effects of realistic bus arbitration schemes on the performance of such systetns. In this paper, we present performance analysis of the multiple-bus systems with different arbitration protocols.The priority protocols considered are random selection, fixed priority, rotating priority, roundrobin and FIFO. Analytica1 models are developed for each of these five dmerent priority protocoIs. Each of our analyses modeIs exactly the behavior of the correspond-ing priority protocol with little computation cost. The analytical models are validated through extensive simu1ations and are then used to carry out performance analysis and comparison of different priority protocols. Numerical results obtained from our models show that the round-robin protocol performs the best among the five protocols in the system with a few buses. 展开更多
关键词 MIP PP A Comparative Analysis of Different Arbitration Protocols forMultiple-Bus multiprocessors
原文传递
A Hierarchical Reconfiguration Strategy for Bus-Based Multiprocessors
10
作者 Ashish Pancholy Fidel Muradali Vinod K.Agarwal 《Journal of Computer Science & Technology》 SCIE EI CSCD 1990年第2期175-186,共12页
A method of providing redundancy to a class of bus-based multiprocessor arrays is discussed.The reconfiguration is hierarchical,providing global spare replacement at the array level and local reconfiguration within th... A method of providing redundancy to a class of bus-based multiprocessor arrays is discussed.The reconfiguration is hierarchical,providing global spare replacement at the array level and local reconfiguration within the spare block.Results of yield analysis performed on a 32 processor array are pres- ented. 展开更多
关键词 A Hierarchical Reconfiguration Strategy for Bus-Based multiprocessors PE
原文传递
Global Register Allocation for SIMD Multiprocessors
11
作者 BenjaminHAO DavidPEARSON 《Journal of Computer Science & Technology》 SCIE EI CSCD 1996年第3期222-236,共15页
It is relatively clear how to map regular, repetitive or grid oriented computations onto SIMD architectures. It is not so clear, however, how to do this for irregular computations even though there may be significant ... It is relatively clear how to map regular, repetitive or grid oriented computations onto SIMD architectures. It is not so clear, however, how to do this for irregular computations even though there may be significant amounts of intrinsic parallelism in branch free code. We study compilation techniques for this type of code when targeted to SIMD computers and illustrate their use on a simple model architecture.In this paper, we present one of the compilation techniques, global mpister allocation,we have developed for SIMD computers, and demonstrate that it can effectively allocate registers for parallelizing irregular computations in branch free code. This technique is an extension and a modification of the register allocation via graph coloring approach used by sequential compilers. Our performance results validate our method. 展开更多
关键词 Global Register Allocation for SIMD multiprocessors
原文传递
CCNoC:Cache-Coherent Network on Chip for Chip Multiprocessors 被引量:1
12
作者 王惊雷 薛一波 +4 位作者 Member, CCF, IEEE 王海霞 李崇民 汪东升 Senior Member,CCF 《Journal of Computer Science & Technology》 SCIE EI CSCD 2010年第2期257-266,共10页
As the number of cores in chip multiprocessors (CMPs) increases, cache coherence protocol has become a key issue in integration of chip multiprocessors. Supporting cache coherence protocol in large chip multiprocess... As the number of cores in chip multiprocessors (CMPs) increases, cache coherence protocol has become a key issue in integration of chip multiprocessors. Supporting cache coherence protocol in large chip multiprocessors still faces three hurdles: design complexity, performance and scalability. This paper proposes Cache Coherent Network on Chip (CCNoC), a scheme that decouples cache coherency maintenance from processors and shared L2 caches and implements it completely in network on chip to free up processors and shared L2 caches from the chore of maintaining coherency, thereby reduces design complexity of CMPs. In this way, CCNoC also improves the performance of cache coherence protocol through reducing directory access latency and enhances scalability by avoiding massive directories overhead in shared L2 caches. In CCNoC, coherence state caches and active directory caches are implemented in the network interface components of network on chip to maintain cache coherence states for blocks in L1 caches and manage directory information for recently accessed blocks in L2 caches respectively. CCNoC provides a scalable CMP framework to tackle cache coherency which is the foundation of CMP. This paper evaluates the performance of CCNoC. Experimental results show that for a 16-core system, CCNoC improves performance by 3% on average over the conventional chip multiprocessor and by 10% at best, while reduces storage overhead by 1.8% and saves directory storage by 88%, showing good scalability. 展开更多
关键词 chip multiprocessor network on chip cache coherence protocol
原文传递
Energy-Efficient Scheduling Based on Task Migration Policy Using DPM for Homogeneous MPSoCs
13
作者 Hamayun Khan Irfan Ud din +1 位作者 Arshad Ali Sami Alshmrany 《Computers, Materials & Continua》 SCIE EI 2023年第1期965-981,共17页
Increasing the life span and efficiency of Multiprocessor System on Chip(MPSoC)by reducing power and energy utilization has become a critical chip design challenge for multiprocessor systems.With the advancement of te... Increasing the life span and efficiency of Multiprocessor System on Chip(MPSoC)by reducing power and energy utilization has become a critical chip design challenge for multiprocessor systems.With the advancement of technology,the performance management of central processing unit(CPU)is changing.Power densities and thermal effects are quickly increasing in multi-core embedded technologies due to shrinking of chip size.When energy consumption reaches a threshold that creates a delay in complementary metal oxide semiconductor(CMOS)circuits and reduces the speed by 10%–15%because excessive on-chip temperature shortens the chip’s life cycle.In this paper,we address the scheduling&energy utilization problem by introducing and evaluating an optimal energy-aware earliest deadline first scheduling(EA-EDF)based technique formultiprocessor environments with task migration that enhances the performance and efficiency in multiprocessor systemon-chip while lowering energy and power consumption.The selection of core andmigration of tasks prevents the system from reaching itsmaximumenergy utilization while effectively using the dynamic power management(DPM)policy.Increase in the execution of tasks the temperature and utilization factor(u_(i))on-chip increases that dissipate more power.The proposed approach migrates such tasks to the core that produces less heat and consumes less power by distributing the load on other cores to lower the temperature and optimizes the duration of idle and sleep times across multiple CPUs.The performance of the EA-EDF algorithm was evaluated by an extensive set of experiments,where excellent results were reported when compared to other current techniques,the efficacy of the proposed methodology reduces the power and energy consumption by 4.3%–4.7%on a utilization of 6%,36%&46%at 520&624 MHz operating frequency when particularly in comparison to other energy-aware methods for MPSoCs.Tasks are running and accurately scheduled to make an energy-efficient processor by controlling and managing the thermal effects on-chip and optimizing the energy consumption of MPSoCs. 展开更多
关键词 Dynamic power management dynamic voltage&frequency scaling dynamic thermal management multiprocessor system on chip complementary metal oxide semiconductor reliability
下载PDF
Design of ANN Based Non-Linear Network Using Interconnection of Parallel Processor
14
作者 Anjani Kumar Singha Swaleha Zubair +3 位作者 Areej Malibari Nitish Pathak Shabana Urooj Neelam Sharma 《Computer Systems Science & Engineering》 SCIE EI 2023年第9期3491-3508,共18页
Suspicious mass traffic constantly evolves,making network behaviour tracing and structure more complex.Neural networks yield promising results by considering a sufficient number of processing elements with strong inte... Suspicious mass traffic constantly evolves,making network behaviour tracing and structure more complex.Neural networks yield promising results by considering a sufficient number of processing elements with strong interconnections between them.They offer efficient computational Hopfield neural networks models and optimization constraints used by undergoing a good amount of parallelism to yield optimal results.Artificial neural network(ANN)offers optimal solutions in classifying and clustering the various reels of data,and the results obtained purely depend on identifying a problem.In this research work,the design of optimized applications is presented in an organized manner.In addition,this research work examines theoretical approaches to achieving optimized results using ANN.It mainly focuses on designing rules.The optimizing design approach of neural networks analyzes the internal process of the neural networks.Practices in developing the network are based on the interconnections among the hidden nodes and their learning parameters.The methodology is proven best for nonlinear resource allocation problems with a suitable design and complex issues.The ANN proposed here considers more or less 46k nodes hidden inside 49 million connections employed on full-fledged parallel processors.The proposed ANN offered optimal results in real-world application problems,and the results were obtained using MATLAB. 展开更多
关键词 Artificial neural network(ANN) MULTIPROCESSOR hidden node nonlinear optimization parallel processing
下载PDF
An Optimal DPM Based Energy-Aware Task Scheduling for Performance Enhancement in Embedded MPSoC
15
作者 Hamayun Khan Irfan Ud Din +1 位作者 Arshad Ali Mohammad Husain 《Computers, Materials & Continua》 SCIE EI 2023年第1期2097-2113,共17页
Minimizing the energy consumption to increase the life span and performance of multiprocessor system on chip(MPSoC)has become an integral chip design issue for multiprocessor systems.The performance measurement of com... Minimizing the energy consumption to increase the life span and performance of multiprocessor system on chip(MPSoC)has become an integral chip design issue for multiprocessor systems.The performance measurement of computational systems is changing with the advancement in technology.Due to shrinking and smaller chip size power densities onchip are increasing rapidly that increasing chip temperature in multi-core embedded technologies.The operating speed of the device decreases when power consumption reaches a threshold that causes a delay in complementary metal oxide semiconductor(CMOS)circuits because high on-chip temperature adversely affects the life span of the chip.In this paper an energy-aware dynamic power management technique based on energy aware earliest deadline first(EA-EDF)scheduling is proposed for improving the performance and reliability by reducing energy and power consumption in the system on chip(SOC).Dynamic power management(DPM)enables MPSOC to reduce power and energy consumption by adopting a suitable core configuration for task migration.Task migration avoids peak temperature values in the multicore system.High utilization factor(ui)on central processing unit(CPU)core consumes more energy and increases the temperature on-chip.Our technique switches the core bymigrating such task to a core that has less temperature and is in a low power state.The proposed EA-EDF scheduling technique migrates load on different cores to attain stability in temperature among multiple cores of the CPU and optimized the duration of the idle and sleep periods to enable the low-temperature core.The effectiveness of the EA-EDF approach reduces the utilization and energy consumption compared to other existing methods and works.The simulation results show the improvement in performance by optimizing 4.8%on u_(i) 9%,16%,23%and 25%at 520 MHz operating frequency as compared to other energy-aware techniques for MPSoCs when the least number of tasks is in running state and can schedule more tasks to make an energy-efficient processor by controlling and managing the energy consumption of MPSoC. 展开更多
关键词 Dynamic power management dynamic voltage&frequency scaling dynamic thermal management multiprocessor system on chip complementary metal oxide semiconductor reliability
下载PDF
一种分片式多核处理器的用户级模拟器 被引量:6
16
作者 黄琨 马可 +2 位作者 曾洪博 张戈 章隆兵 《软件学报》 EI CSCD 北大核心 2008年第4期1069-1080,共12页
随着片上晶体管资源的增多和互连线延迟的加大,分片式多核微处理器已成为多核处理器设计的新方向.为了对这种新型处理器进行体系结构的深入研究和设计空间的探索,设计并实现了针对分片式多核处理器的用户级多核性能模拟器.该多核模拟器... 随着片上晶体管资源的增多和互连线延迟的加大,分片式多核微处理器已成为多核处理器设计的新方向.为了对这种新型处理器进行体系结构的深入研究和设计空间的探索,设计并实现了针对分片式多核处理器的用户级多核性能模拟器.该多核模拟器在龙芯2号单处理器核的基础上,完整地模拟了基于目录的Cache一致性协议和存储转发式片上互联网络的结构模型,详细地刻画了由于系统乱序处理各种请求应答和请求之间的冲突而造成的时序特性,可以通过运行各种串行或并行的工作负载对多核处理器的各种重要性能指标加以评估,为多核处理器的结构设计提供了快速、灵活、高效的研究平台. 展开更多
关键词 分片式CMP(chip multiprocessor) 模拟器 片上网络 性能分析 龙芯2号微处理器
下载PDF
一种改进的抗干扰性可预测的协作缓存机制
17
作者 苑风凯 季振洲 《智能计算机与应用》 2017年第6期19-24,共6页
私有Last Level Cache(LLC)配置可以提供更低的访问延迟和更好的性能隔离性,但剔除了共享未利用Cache资源的能力。协作缓存机制通过将本地LLC(溢出者)驱逐的Cache行溢出(Spill)到远程LLC(接收者)达到Cache容量共享的目的。后继的协作缓... 私有Last Level Cache(LLC)配置可以提供更低的访问延迟和更好的性能隔离性,但剔除了共享未利用Cache资源的能力。协作缓存机制通过将本地LLC(溢出者)驱逐的Cache行溢出(Spill)到远程LLC(接收者)达到Cache容量共享的目的。后继的协作缓存研究致力于Cache共享的有效性,然而,没有任何研究关注到接收者对于外来溢出行的抗干扰性。本文提出重用性和抗干扰性可预测的协作缓存(RAPCC)机制,进一步增强私有LLC的Cache容量共享能力。RAPCC借助重用位置分布(RPD)预测私有LLC的重用性和抗干扰性决定其接受溢出角色:溢出者、接收者、NEITHER和EITHER(全新中立的角色,既溢出又接收)。试验结果表明,RAPCC的HPKI和IPC性能表现均超越了CC、DSR、CBS、ASCC。 展开更多
关键词 Chip multiprocessors(CMPs) LAST Level Cache(LLC) 私有LLC 协作缓存
下载PDF
AN OBJECT ORIENTED MODEL SCHEDULING FOR MEDIA-SOC
18
作者 Cheng Xingmei Yao Yingbiao +2 位作者 Zhang Yixiong Liu Peng Yao Qingdong 《Journal of Electronics(China)》 2009年第2期244-251,共8页
This paper proposes an object oriented model scheduling for parallel computing in media MultiProcessors System on Chip(MPSoC).Firstly,the Coarse Grain Data Flow Graph(CGDFG) parallel programming model is used in this ... This paper proposes an object oriented model scheduling for parallel computing in media MultiProcessors System on Chip(MPSoC).Firstly,the Coarse Grain Data Flow Graph(CGDFG) parallel programming model is used in this approach.Secondly,this approach has the feature of unified abstraction for software objects implementing in processor and hardware objects implementing in ASICs,easy for mapping CGDFG programming on MPSoC.This approach cuts down the kernel overhead and reduces the code size effectively.The principle of the oriented object model,the method of scheduling,and how to map a parallel programming through CGDFG to the MPSoC are analyzed in this approach.This approach also compares the code size and execution cycles with conventional control flow scheduling,and presents respective management overhead for one application in me-dia-SoC. 展开更多
关键词 Multimedia computing Coarse Grain Data Flow Graph(CGDFG) Parallel program-ming Real Time Operating System(RTOS) multiprocessors System on Chip(MPSoC)
下载PDF
An Efficient Real-Time Fault-Tolerant Scheduling Algorithm Based on Multiprocessor Systems 被引量:6
19
作者 YANG Fumin LUO Wei PANG Liping 《Wuhan University Journal of Natural Sciences》 CAS 2007年第1期113-116,共4页
In the context of real-time fault-tolerant scheduling in multiprocessor systems, Primary-backup scheme plays an important role. A backup copy is always preferred to be executed as passive backup copy whenever possible... In the context of real-time fault-tolerant scheduling in multiprocessor systems, Primary-backup scheme plays an important role. A backup copy is always preferred to be executed as passive backup copy whenever possible because it can take the advantages of backup copy de-allocation technique and overloading technique to improve schedulability. In this paper, we propose a novel efficient fault-tolerant ratemonotonic best-fit algorithm efficient fault-tolerant rate-monotonic best-fit (ERMBF) based on multiprocessors systems to enhance the schedulability. Unlike existing scheduling algorithms that start scheduling tasks with only one processor. ERMBF pre-allocates a certain amount of processors before starting scheduling tasks, which enlarge the searching spaces for tasks. Besides, when a new processor is allocated, we reassign the task copies that have already been assigned to the existing processors in order to find a superior tasks assignment configuration. These two strategies are all aiming at making as many backup copies as possible to be executed as passive status. As a result, ERMBF can use fewer processors to schedule a set of tasks without losing real-time and fault-tolerant capabilities of the system. Simulation results reveal that ERMBF significantly improves the schedulability over existing, comparable algorithms in literature. 展开更多
关键词 real-time periodic tasks FAULT-TOLERANCE primary/backup copy multiprocessor systems
下载PDF
t/k-fault diagnosis algorithm of n-dimensional hypercube network based on the MM*model 被引量:4
20
作者 LIANG Jiarong ZHOU Ning YUN Long 《Journal of Systems Engineering and Electronics》 SCIE EI CSCD 2018年第1期216-222,共7页
Compared with accurate diagnosis, the system’s selfdiagnosing capability can be greatly increased through the t/kdiagnosis strategy at most k vertexes to be mistakenly identified as faulty under the comparison model,... Compared with accurate diagnosis, the system’s selfdiagnosing capability can be greatly increased through the t/kdiagnosis strategy at most k vertexes to be mistakenly identified as faulty under the comparison model, where k is typically a small number. Based on the Preparata, Metze, and Chien(PMC)model, the n-dimensional hypercube network is proved to be t/kdiagnosable. In this paper, based on the Maeng and Malek(MM)*model, a novel t/k-fault diagnosis(1≤k≤4) algorithm of ndimensional hypercube, called t/k-MM*-DIAG, is proposed to isolate all faulty processors within the set of nodes, among which the number of fault-free nodes identified wrongly as faulty is at most k. The time complexity in our algorithm is only O(2~n n~2). 展开更多
关键词 hypercube network t/k-diagnosis algorithm multiprocessor systems the Maeng and Malek(MM)* model Preparata Metze and Chien(PMC)
下载PDF
上一页 1 2 3 下一页 到第
使用帮助 返回顶部