Abstract: Over the past decade, Graphics Processing Units (GPUs) have revolutionized high-performance computing, playing pivotal roles in advancing fields like the Internet of Things (IoT), autonomous vehicles, and exascale computing. Despite these advances, programming GPUs efficiently remains a daunting challenge that often relies on trial-and-error optimization. This paper introduces an optimization technique for CUDA programs based on a novel Data Layout strategy, which restructures the arrangement of data in memory to significantly enhance data access locality. Focusing on the dynamic programming algorithm for chained matrix multiplication, a critical operation in domains including artificial intelligence (AI), high-performance computing (HPC), and IoT, the technique makes memory accesses more localized. We specifically illustrate the importance of efficient matrix multiplication in these areas, underscoring the technique's broader applicability and its potential to address some of the most pressing computational challenges in GPU-accelerated applications. Our findings reveal a remarkable reduction in memory consumption and a substantial 50% decrease in execution time for CUDA programs that use this technique, setting a new benchmark for optimization in GPU computing.
Funding: Supported by ZTE Industry-Academia-Research Cooperation Funds
Abstract: Data layout in a file system is the organization of data stored in external storage. The data layout has a huge impact on the performance of storage systems. We survey three main kinds of data layout in traditional file systems: in-place update file systems, log-structured file systems, and copy-on-write file systems. Each file system has its own strengths and weaknesses under different circumstances. We also include a recent use of persistent layout in a file system that combines both flash memory and byte-addressable non-volatile memory. With this survey, we conclude that persistent data layout in file systems may evolve dramatically in the era of emerging non-volatile memory.
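Abstract: Taking the main urban area of Taiyuan as the study area and using point of interest (POI) data, this study examines the city's commercial spatial pattern with kernel density analysis, the local Getis-Ord Gi* index, and related methods. Building on the existing commercial spatial pattern, it identifies commercial centers with the contour tree method and compares the spatial distribution and agglomeration characteristics of different industry categories. The results show the following. Commercial space in Taiyuan's main urban area has formed a pattern that spreads outward from the Liuxiang, Chaoyang, and Qinxian North Street business districts, concentrated and contiguous east of the Fen River and sparse and scattered west of it. A total of 46 basic commercial centers are identified in the main urban area, of which Liuxiang, Chaoyang-Shuangta, and Tiyu Road-Qinxian North Street are the three core commercial centers. Spatial agglomeration differs by industry: life services, shopping services, and catering services are widely distributed and weakly agglomerated, while healthcare, business services, and financial services are narrowly distributed and highly agglomerated.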