10 articles found
Towards optimized tensor code generation for deep learning on sunway many-core processor
1
Authors: Mingzhen LI, Changxi LIU, Jianjin LIAO, Xuegui ZHENG, Hailong YANG, Rujun SUN, Jun XU, Lin GAN, Guangwen YANG, Zhongzhi LUAN, Depei QIAN. Frontiers of Computer Science (SCIE, EI, CSCD), 2024, No. 2, pp. 1-15 (15 pages)
The flourishing of deep learning frameworks and hardware platforms demands an efficient compiler that can shield the diversity in both software and hardware in order to provide application portability. Among the existing deep learning compilers, TVM is well known for its efficiency in code generation and optimization across diverse hardware devices. Meanwhile, the Sunway many-core processor is a competitive candidate thanks to its attractive computational power in both scientific computing and deep learning workloads. This paper combines the trends in these two directions. Specifically, we propose swTVM, which extends the original TVM to support ahead-of-time compilation for architectures requiring cross-compilation, such as Sunway. In addition, we leverage architecture features during compilation, such as the core group for massive parallelism, DMA for high-bandwidth memory transfer, and local device memory for data locality, in order to generate efficient code for deep learning workloads on Sunway. The experimental results show that the code generated by swTVM achieves a 1.79x improvement in inference latency on average compared to the state-of-the-art deep learning framework on Sunway, across eight representative benchmarks. This work is the first attempt from the compiler perspective to bridge the gap between deep learning and the Sunway processor, particularly with productivity and efficiency in mind. We believe this work will encourage more people to embrace the power of deep learning and the Sunway many-core processor.
Keywords: Sunway processor, deep learning compiler, code generation, performance optimization
Multi-threaded code generation from Signal program to OpenMP (cited by: 2)
2
Authors: Kai HU, Teng ZHANG, Zhibin YANG. Frontiers of Computer Science (SCIE, EI, CSCD), 2013, No. 5, pp. 617-626 (10 pages)
The use of multi-core processors will become a trend in safety critical systems. For safe execution of multi-threaded code, automatic code generation from formal specification is a desirable method. Signal, a synchronous language dedicated to the functional description of safety critical systems, provides soundness semantics for deterministic concurrency. Although sequential code generation for Signal has been implemented in the Polychrony compiler, the deterministic multi-threaded code generation strategy is still far from mature. Moreover, existing code generation methods use a particular multi-thread library, which limits cross-platform execution. OpenMP is an application program interface (API) standard for parallel programming, supported by several mainstream compilers on different platforms. This paper presents a methodology for translating a Signal program to OpenMP-based multi-threaded C code. First, the intermediate representation of the core syntax of Signal using synchronous guarded actions is defined. Then, according to the compositional semantics of Signal equations, the Signal program is synthesized into a dependency graph (DG). After parallel tasks are extracted from the dependency graph, the Signal program can finally be translated into OpenMP-based C code which can be executed on multiple platforms.
Keywords: multi-thread, synchronous language, Signal, code generation, OpenMP
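The abstract above mentions extracting parallel tasks from a dependency graph. The following is only a minimal sketch of one generic way to do that, grouping nodes into levels whose members do not depend on each other. It is not the paper's algorithm; the function name `parallel_levels` and the dictionary-based graph encoding are assumptions made for illustration.

```python
def parallel_levels(deps):
    """Group the nodes of a dependency graph into levels.

    deps maps each node to the set of nodes it depends on (hypothetical
    encoding). Nodes placed in the same level have no dependencies on
    each other, so they are candidates for parallel execution.
    """
    # Collect every node, including ones that only appear as dependencies.
    nodes = set(deps)
    for targets in deps.values():
        nodes |= set(targets)

    # Unresolved dependencies per node.
    remaining = {n: set(deps.get(n, ())) for n in nodes}

    levels = []
    while remaining:
        # A node is ready once all of its dependencies have been scheduled.
        ready = sorted(n for n, d in remaining.items() if not d)
        if not ready:
            raise ValueError("dependency cycle detected")
        levels.append(ready)
        for n in ready:
            del remaining[n]
        for d in remaining.values():
            d.difference_update(ready)
    return levels


if __name__ == "__main__":
    # "c" needs "a" and "b"; "d" needs "c".
    print(parallel_levels({"c": {"a", "b"}, "d": {"c"}}))
```

In an OpenMP setting, each level could plausibly map to a group of tasks or sections separated by a barrier, though how the paper performs that mapping is not stated in the abstract.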
Highly Optimized Code Generation for Stencil Codes with Computation Reuse for GPUs
3
Authors: Wen-Jing Ma, Kan Gao, Guo-Ping Long. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2016, No. 6, pp. 1262-1274 (13 pages)
Computation reuse is known as an effective optimization technique. However, due to the complexity of modern GPU architectures, there is not yet enough understanding of the intriguing implications of the interplay of computation reuse and hardware specifics on application performance. In this paper, we propose an automatic code generator for a class of stencil codes with inherent computation reuse on GPUs. For such applications, the proper reuse of intermediate results, combined with careful register and on-chip local memory usage, has profound implications on performance. The current state of the art does not address this problem in depth, partially due to the lack of a good program representation that can expose all potential computation reuse. In this paper, we leverage the computation overlap graph (COG), a simple representation of data dependence and data reuse with "element view", to expose potential reuse opportunities. Using COG, we propose a portable code generation and tuning framework for GPUs. Compared with current state-of-the-art code generators, our experimental results show up to 56.7% performance improvement on modern GPUs such as NVIDIA C2050.
Keywords: GPGPU, OpenCL, stencil, code generation, computation reuse
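To make "computation reuse" in a stencil concrete, here is a small CPU-side sketch, assuming a 3x3 box-sum stencil chosen purely for illustration. It is not the paper's COG representation or its GPU code generator; the function names are hypothetical. The reuse version computes each vertical three-element sum once and shares it among the three horizontally adjacent outputs that need it.

```python
def box3x3_naive(a):
    """3x3 box sum over interior points: 9 reads and 8 additions per output."""
    h, w = len(a), len(a[0])
    return [
        [
            sum(a[i + di][j + dj] for di in (-1, 0, 1) for dj in (-1, 0, 1))
            for j in range(1, w - 1)
        ]
        for i in range(1, h - 1)
    ]


def box3x3_reuse(a):
    """Same result, but each column sum is computed once per row and reused."""
    h, w = len(a), len(a[0])
    out = []
    for i in range(1, h - 1):
        # Intermediate results: vertical sums for every column of this row.
        col = [a[i - 1][j] + a[i][j] + a[i + 1][j] for j in range(w)]
        # Each output combines three neighbouring column sums.
        out.append([col[j - 1] + col[j] + col[j + 1] for j in range(1, w - 1)])
    return out


if __name__ == "__main__":
    grid = [[r * 5 + c for c in range(5)] for r in range(4)]
    assert box3x3_naive(grid) == box3x3_reuse(grid)
    print(box3x3_reuse(grid))
```

On a GPU, the intermediate column sums would typically be held in registers or on-chip local memory, which is the kind of trade-off the abstract refers to.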
Code Generation Framework for Grid Development
4
Authors: JIANG Ling-yun, WANG Ru-chuan, WANG Hai-yan. The Journal of China Universities of Posts and Telecommunications (EI, CSCD), 2006, No. 2, pp. 39-42 (4 pages)
While grid computing receives more and more attention, it is not widely used, partly because it requires sophisticated development. This paper discusses a code generation framework for grid computing. We first introduce GBuilder as a rapid development tool for building grid computing applications, then present the details of the code generation framework. We then discuss a case study to show the advantages of the whole code generation framework, which include saved development time and a lighter burden of intricacy on grid application developers.
Keywords: code generation, grid, Globus Toolkit
Developing Projects with Low-code Combined with ChatGPT
5
Authors: Wei Xiong, Bing Li, Zhao Wu, Bo Hang. 计算机教育 (Computer Education), 2023, No. 12, pp. 204-213 (10 pages)
This paper aims to explore a simpler and more user-friendly way of generating software based on model-driven development. Previous studies have attempted to generate code from domain models, hoping to reduce coding time by increasing modeling time. However, as code tools become more advanced, it is challenging to improve efficiency because models are abstract while implementations are concrete. This paper proposes a novel approach that integrates ChatGPT as a plug-in into the whole R&D process and combines it with our code generation tool to enhance R&D efficiency. We have developed some demos to demonstrate the effectiveness of our approach. According to our evaluation, our approach can save more than 90% of the work in implementing the code generation tool, leaving only about 10% of the work for code review, code improvement, and unit testing.
Keywords: ChatGPT, low-code, code generation, software engineering, project development
Complementary Functions and Potential of Generative AI in Algorithm Education
6
Authors: Guoqiang Li, Yuxin Su. 计算机教育 (Computer Education), 2023, No. 12, pp. 221-231 (11 pages)
In today’s digital era, algorithms have become an indispensable part of our daily lives and work. Algorithm education plays a crucial role in computer science and software engineering, aiming to cultivate students’ problem-solving skills and computational thinking. However, traditional algorithm education often requires significant time and effort from teachers, lacks interactivity, and provides limited examples. The rapid advancement of AI technology, particularly generative models and large language models (LLMs), has the potential to revolutionize computer education. Models like OpenAI’s GPT-4 and ChatGPT have conversational capabilities and contribute to various aspects of computer education. GPT-3.5, as an assistant in algorithm education, assists teachers in automatically generating explanations and algorithmic examples to enhance students’ understanding of algorithms. While existing research has certain limitations, such as focusing on specific scenarios and lacking comprehensive benchmark testing, this paper explores the role of ChatGPT (GPT-3.5) in algorithm education. By refining prompts and evaluating generative capabilities, the study demonstrates that GPT-3.5 holds significant potential as a teaching aid. With an average accuracy of 0.81, GPT-3.5 can generate explanations, code examples, and visualizations of the corresponding algorithms. Other tests, including algorithm problem-solving and example generation, also demonstrate the practicability of GPT-3.5 in algorithm education.
Keywords: ChatGPT, large language models, code generation, AI education, computing education, algorithm education
Universal Tracing Interface for Multicore Processors
7
Authors: Janne Vatjus-Anttila, Mika Hoppari, Lance Fono, Kari Kolehmainen, Subayal Khan. Journal of Computer and Communications, 2016, No. 1, pp. 1-11 (11 pages)
Application developers today need to produce code that is error-free and whose performance is optimized for a plethora of devices. The performance of application code is studied, for example, by analyzing performance data obtained by executing the application with a tracing tool. Developers typically have favorite tools that they prefer to use, but target devices are based on different computing platforms with different performance probes, which makes it difficult to use the same tool across different multicore platforms. The Universal Tracing Interface for Multicore Processors (UTIMP) aims to provide an unchanging tracing interface that enables developers to perform the required tracing tasks with UTIMP, using their favorite tool when possible, on different multicore platforms.
Keywords: tracing, performance, platform, probe, toolchain, code generation
Resolution Characteristics of GaAs/GaAlAs Transmission Photocathode
8
Authors: YAN Jin-liang, ZHAO Yin-nu, ZHU Chang-chun (School of Electron. & Inform. Eng., Xi’an Jiaotong University, Xi’an 710049, CHN). Semiconductor Photonics and Technology (CAS), 1999, No. 2, pp. 96-100 (5 pages)
The resolution characteristic of the GaAs/GaAlAs transmission photocathode is an important parameter in third generation intensifiers. The modulation transfer function of the GaAs/GaAlAs transmission photocathode is derived from a simple two-dimensional diffusion equation. The theoretical resolution characteristic of a 2 μm thick GaAs/GaAlAs transmission photocathode is calculated. The relationship between resolution and the parameters of the GaAs/GaAlAs transmission photocathode is discussed. The conclusion is that one can design the GaAs/GaAlAs transmission photocathode for maximum quantum efficiency, since the sacrifice in resolution doesn't limit system performance.
Keywords: GaAs/GaAlAs photocathode, quantum yield, resolution, third generation intensifier. CLC number: TN383.4. Document code: A
Advanced ECU Software Development Method for Fuel Cell Systems (cited by: 3)
9
Authors: 田硕, 刘原, 夏文川, 李建秋, 欧阳明高. Tsinghua Science and Technology (SCIE, EI, CAS), 2005, No. 5, pp. 610-617 (8 pages)
The electronic control unit (ECU) in electrically powered hybrid and fuel cell vehicles is exceedingly complex. Rapid prototyping control is used to reduce development time and eliminate errors during software development. This paper describes a high-efficiency development method and a flexible tool chain suitable for various applications in automotive engineering. The control algorithm can be deployed directly from a Matlab/Simulink/Stateflow environment into the ECU hardware together with an OSEK real-time operating system (RTOS). The system has been successfully used to develop a 20-kW fuel cell system ECU based on a Motorola PowerPC 555 (MPC555) microcontroller. The total software development time is greatly reduced and the code quality and reliability are greatly enhanced.
Keywords: automotive engineering, fuel cell, electronic controller unit (ECU), embedded software development, rapid prototyping, automatic code generation, simulation, OSEK
Improving on Linear Scan Register Allocation (cited by: 1)
10
Authors: Shahrzad Kananizadeh, Kirill Kononenko. International Journal of Automation and Computing (EI, CSCD), 2018, No. 2, pp. 228-238 (11 pages)
Register allocation is a major step for all compilers. Various register allocation algorithms have been developed over the decades. This work describes a new class of rapid register allocation algorithms and presents experimental data on their behavior. Our research encourages the avoidance of graphing and graph-coloring, based on the fact that precise graph-coloring is nondeterministic polynomial time-complete (NP-complete), which is not suitable for real-time tasks. In addition, practical graph-coloring algorithms tend to use polynomial-time heuristics. In dynamic compilation environments, their superlinear complexity makes them unsuitable for register allocation and code generation. Existing tools for code generation and register allocation do not completely fulfill the requirements of fast compilation. Existing approaches either do not allow the optimization of register allocation to be achieved comprehensively with a sufficient degree of performance, or they require an unjustifiable amount of time and/or resources. Therefore, we propose a new class of register allocation and code generation algorithms that can be performed in linear time. These algorithms are based on the mathematical foundations of abstract interpretation and the computation of the level of abstraction. They have been implemented in a specialized library for just-in-time compilation. The specialization of this library involves the execution of common intermediate language (CIL) and low level virtual machine (LLVM) with a focus on embedded systems.
Keywords: register allocation, just-in-time compilation, code generation, static analysis, dynamic analysis
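For readers unfamiliar with the baseline named in the title, the sketch below shows a simplified version of classic linear-scan register allocation over live intervals. It is background only, not the algorithm class proposed in the paper, which the abstract describes as based on abstract interpretation. The function name, the interval encoding, and the `"spill"` marker are assumptions for illustration, and re-sorting the active list keeps the code short at the cost of strict linearity.

```python
def linear_scan(intervals, num_regs):
    """Simplified classic linear-scan allocation.

    intervals: dict mapping a variable name to (start, end), end inclusive.
    Returns a dict mapping each name to a register index or "spill".
    """
    assignment = {}
    free = list(range(num_regs))[::-1]  # stack of free register indices
    active = []  # (end, name) pairs currently holding a register, sorted by end

    for name, (start, end) in sorted(intervals.items(), key=lambda kv: kv[1][0]):
        # Release registers of intervals that ended before this one starts.
        while active and active[0][0] < start:
            _, expired = active.pop(0)
            free.append(assignment[expired])

        if free:
            assignment[name] = free.pop()
            active.append((end, name))
            active.sort()
        elif active and active[-1][0] > end:
            # No register free: spill the active interval that ends last
            # and hand its register to the current, shorter interval.
            _, victim = active[-1]
            assignment[name] = assignment[victim]
            assignment[victim] = "spill"
            active[-1] = (end, name)
            active.sort()
        else:
            assignment[name] = "spill"
    return assignment


if __name__ == "__main__":
    print(linear_scan({"a": (0, 4), "b": (1, 2), "c": (3, 6)}, num_regs=2))
```

The single pass over intervals sorted by start point is what makes this family of allocators attractive for just-in-time compilation compared with graph coloring.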