期刊文献+

LSCSIMD配置存储器组织及管理算法研究

Research of CRAM's Architecture and Scheduling Algorithm of LS CSIMD
下载PDF
导出
摘要 通过对LS CSIMD体系结构的深入研究,提出了一种支持扩展配置存储器寻址空间的配置存储器组织结构(配置指令采用立即数寻址),该结构通过增加配置存储器容量,可以有效地降低配置延迟.同时,针对LS CSIMD配置存储器的组织结构,提出了一种基于任务核频率和容量的配置数据管理算法.该算法根据配置任务序列特征,动态产生静态区容量,并根据任务核频率和容量进行调度,有效地减少了重复取入片上的配置数据总量,从而降低了配置延迟.实验结果表明,该算法不但时间复杂度低(最好情况下O(n)),而且当任务核数较多且任务核之间容量相差较小时可得最优解,其他情况下可得到次优解.  Dynamically reconfigurable architectures are emerging as a viable design alternative to implement wide range of computationally intensive applications. The architecture of configuration memory and context (configuration) management are very critical issues in achieving the high performance of dynamically reconfigurable architectures. An architecture of configuration memory (CRAM) that can be extended accessing space (configuration instructions adopt immediate addressing mode) is proposed by deeply researching on the architecture of LS CSIMD in this paper. This architecture of CRAM, which has multiple contexts, can greatly reduce reconfiguration latency by increasing capacity of CRAM. For CRAM' s architecture, a novel scheduling algorithm based on the frequency and capacity of task kernels is proposed in subtopic 3. According to the character of reconfigurable task sequence, this algorithm can dynamically compute the capacity of the static region CRAM, and then the algorithm decides the locations of task kernels in static or dynamic region. So, the amount of configuration data loaded repeatedly into the CRAM is effectively decreased, and reconfiguration latency is reduced. The experimental results (subtopic 4) show that the time complexity of the scheduling algorithm is about O ( n ) in best condition and the time complexity is about O ( n · log2 n ) in other conditions. When there are more task kernels and less difference among kernels capacities, the results are optimal.
出处 《计算机研究与发展》 EI CSCD 北大核心 2007年第6期1080-1087,共8页 Journal of Computer Research and Development
基金 国家"八六三"高技术研究发展计划基金项目(2002AA714022)~~
关键词 LS CSIMD 可重构计算 配置延迟 调度算法 动态可重构 LS CSIMD reconfigurable computing reconfiguration latency scheduling algorithm dynamical reconfiguration
  • 相关文献

参考文献11

  • 1S Brown,J Rose.Architecture of FPGAs and CLPDs:A tutorial[J].IEEE Design and Test of Computer,1996,13 (2):42-57
  • 2E Tau,D Chela,et al.A first generation DPGA implementation[C].Canadian Workshop of Field Programmable Devices(FDP'95),Montreal,1995
  • 3Timothy J Callahan,John R Hauser,John Wawrzynek.The Garp architecture and C compiler[J].IEEE Computer,2000,33(4):40-45
  • 4H Singh,Ming2Hau Lee,et al.MorphoSys:An integrated reconfigurable system for data-parallel and computation-intensive applicatons[J].IEEE Trans on Computers,2000,49(5):465-481
  • 5Zhiyuan Li.Configuration management techniques for reconfigurable computing:[Ph D dissertation][D].Evanston,Illinois:North Western University,2002
  • 6S Hauck.Configuration prefetch for single context reconfigurable coprocessors[C],ACM/SIGDA Int'l Symp on Field-Programmable Gate Arrays,New York,1998
  • 7L A Belady.A study of replacement algorithms for virtual storage computers[J].IBM Systems Journal,1966,5(2):78-101
  • 8Rafael Maestre,et al.A framework for reconfigurable computing:Task scheduling and context management[J].IEEE Trans on Very Large Scale Integration(VLSI)Systems,2001,9(6):1-7
  • 9R Maestre,et al.A framework for scheduling and context allocation in reconfigurable computing[J].Proceedings Int'lSymp on System Synthesis(ISSS'99),1999,2(3):134-140
  • 10冯圣中,谭光明,徐琳,孙凝晖,徐志伟.曙光4000H生物信息处理专用计算机的高性能算法研究[J].计算机研究与发展,2005,42(6):1053-1058. 被引量:3

二级参考文献15

  • 1D. Gusfield. Algorithms on Strings, Trees, and Sequences:Computer Science and Computational Biology. Cambridge:Cambridge University Press, 2002
  • 2R.D. Bjornson, et al. TurboBLAST: A parallel implementation of BLAST built on the TurboHub. The 1st Int'l Workshop on High Performance Computational Biology (HiCOMB2002),Marriott Marina, Fort Laudedale, Florida, 2002
  • 3S.F. Altschul, et al. Basic Local Alignment Search Tool. J.Mol. Biol, 1990, 215(3): 403~410
  • 4M. Zuker. The use of dynamic programming algorithms in RNA secondary structure prediction. In: M. S. Waterman ed.Mathematical Methods for DNA Sequences. Boca Raton, FL:CRC Press, Inc., 1989. 159~184
  • 5A. E. Darling, et al. The design, implementation, and evaluation of mpiBlast. ClusterWorld Conf. & Expo and the 4th Int'l Conf. Linux Clusters: The HPC Revolution, San Jose,California, 2003
  • 6G.M. Tan, Y. Y. Zhou, S. Z. Feng, et al. Reconfigurable computing system and performance analysis model. WSEAS Trans. Systems, 2004, 3(7): 2567~2573
  • 7R. L. Rivest, A. Shamir, L. Adleman. A method for obtaining digital signatures and public key cryptosystems. Communications of the ACM, 1978, 21(2):120-126
  • 8C. W. Chiou. Parallel implementation of the RSA public-key cryptosystem. International Journal of Computer Mathematics,1993, 48(3-4) : 153-155
  • 9P. L. Montgomery. Modular multiplication without trial division.Mathematics of Computation, 1985, 44( 170): 519-521
  • 10C. K. Koc, A. F. Tenca. A word-based algorithm and architecture for Montgomery multiplication. In: C. Paar, C.Koc, eds. Cryptographic Hardware and Embedded Systems,Lecture Notes in Computer Science 1717. Berlin: Springer,1999. 99 - 108

共引文献4

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部