Abstract
Stream programming is a paradigm widely used on multi-core processor systems, where parallel scheduling across cores and main-memory access latency strongly affect program performance. To address this, a task scheduling and cache optimization method for stream programs is proposed that exploits the characteristics of X86 multi-core processors. Task scheduling optimization first applies expanded scheduling in a preprocessing phase to improve the locality and parallel granularity of the target program; it then exploits the data, task, and pipeline parallelism of the stream program to balance load across cores and constructs the corresponding software pipeline schedule. Cache optimization targets the hierarchical cache structure of the target system: eliminating cache false sharing reduces interference among cores running in parallel, and logical threads are mapped to processor cores according to the distribution of inter-thread communication. The compiler takes a COStream dataflow program as input and outputs optimized target code. Typical algorithms from the digital media domain were chosen as test programs. The experimental results show that the optimized test programs achieve nearly linear speedup, which demonstrates the effectiveness of the compilation system.
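The abstract does not say how the compiler eliminates false sharing. As a minimal sketch of one common approach, the example below pads per-thread data so that each thread's slot occupies its own cache line. The names `kCacheLine`, `PaddedCounter`, and `count_in_parallel` are hypothetical, the 64-byte line size is an assumption that holds on typical X86 parts, and C++17 is assumed so that `std::vector` honors the over-alignment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Assumed cache-line size; 64 bytes is typical on x86 but not guaranteed.
constexpr std::size_t kCacheLine = 64;

// Hypothetical per-thread counter, aligned so each instance owns a full line.
struct alignas(kCacheLine) PaddedCounter {
    std::uint64_t value = 0;
};
static_assert(sizeof(PaddedCounter) == kCacheLine, "one counter per cache line");

// Each worker thread increments only its own padded slot, so no two threads
// write to the same cache line. Returns the sum of all slots after joining.
std::uint64_t count_in_parallel(unsigned num_threads, std::uint64_t iterations) {
    assert(num_threads > 0);
    std::vector<PaddedCounter> counters(num_threads);
    std::vector<std::thread> workers;
    workers.reserve(num_threads);
    for (unsigned t = 0; t < num_threads; ++t) {
        workers.emplace_back([&counters, t, iterations] {
            for (std::uint64_t i = 0; i < iterations; ++i) {
                counters[t].value += 1;
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    std::uint64_t total = 0;
    for (const auto& c : counters) {
        total += c.value;
    }
    return total;
}
```

Calling `count_in_parallel(4, 1000)` runs four workers that each touch only their own slot. Removing the `alignas` would leave the result unchanged but would pack several counters into one cache line, which is the interference between cores that the abstract describes.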
Funding
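Supported by the National High Technology Research and Development Program of China (863 Program) (2012AA010902)
and the Specialized Research Fund for the Doctoral Program of Higher Education of China (20120142110089)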