
Building a computing resource sharing environment with low-latency communication and fault tolerance (cited by: 4)
Abstract: This paper presents LF-CRSE (low latency and fault tolerance CRSE), a computing resource sharing environment that supports low-latency communication and fault tolerance. LF-CRSE introduces the notion of node functional roles: every node acts as a client, a task server, a worker service provider, or a worker, and together these roles form a scalable distributed network architecture. Task caching, task pre-fetching, and computation on the task server keep the latency overhead of communication low. At the application level, branch-and-bound task partitioning lets LF-CRSE support a flexible programming model that covers both the master-slave and the divide-and-conquer patterns. Worker-side heartbeat messages and subtask-oriented fault tolerance ensure correctness even when the set of workers changes during program execution, for example because workers fail. LF-CRSE was deployed as an experimental platform, on which the authors report a computation record in solving a distributed travelling salesman problem (TSP) with data dependencies. The experimental results show that the speedup of LF-CRSE increases steadily as workers are added, and that the system also performs well in the low-latency communication and fault tolerance tests.
Authors: Xu Aijun (许爱军), Zhang Yue (张岳)
Source: Computer Engineering and Design (《计算机工程与设计》), indexed in CSCD and the Peking University Core Journals list (北大核心), 2012, No. 4, pp. 1352-1356 (5 pages)
Keywords: distributed computing; computing resource sharing; low latency; fault tolerance; branch and bound
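
The abstract names worker-side heartbeat messages, subtask-oriented fault tolerance, and task pre-fetching, but this record does not show how LF-CRSE implements them. The following is only a minimal sketch of the general idea, not the authors' design: the class `TaskServer`, its methods, and the `timeout` parameter are all hypothetical names chosen for illustration. A server hands out subtasks (several at once to model pre-fetching), records each worker's last heartbeat, and requeues the unfinished subtasks of any worker that has gone silent.

```python
from collections import deque


class TaskServer:
    """Hypothetical toy server: assigns subtasks and requeues those of silent workers."""

    def __init__(self, subtasks, timeout=3.0):
        self.pending = deque(subtasks)  # subtasks not yet assigned
        self.assigned = {}              # worker id -> subtasks currently in flight
        self.last_seen = {}             # worker id -> time of the last heartbeat
        self.results = {}               # subtask -> result
        self.timeout = timeout          # assumed heartbeat expiry, in seconds

    def heartbeat(self, worker, now):
        """Record that `worker` was alive at time `now`."""
        self.last_seen[worker] = now

    def fetch(self, worker, now, count=1):
        """Assign up to `count` subtasks; count > 1 models task pre-fetching."""
        self.heartbeat(worker, now)
        n = min(count, len(self.pending))
        batch = [self.pending.popleft() for _ in range(n)]
        self.assigned.setdefault(worker, set()).update(batch)
        return batch

    def complete(self, worker, subtask, result, now):
        """Store a result; a late duplicate from a slow worker is ignored."""
        self.heartbeat(worker, now)
        self.assigned.get(worker, set()).discard(subtask)
        self.results.setdefault(subtask, result)

    def reap(self, now):
        """Requeue unfinished subtasks of workers whose heartbeat has expired."""
        dead = [w for w, t in self.last_seen.items() if now - t > self.timeout]
        for w in dead:
            for s in sorted(self.assigned.pop(w, set())):
                if s not in self.results:
                    self.pending.append(s)
            del self.last_seen[w]
        return dead
```

Recovery in this sketch works per subtask rather than per job: only the subtasks that a failed worker still held go back into the queue, and results already returned are kept. Time is passed in explicitly as `now` so the behaviour is deterministic and easy to test; a real deployment would read a clock and call `reap` periodically.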