Abstract: Fractional-slot concentrated-coil electric machines are often used in applications that require a number of rotor poles close to the number of stator slots. A major drawback of such machines is the occurrence of large air-gap field harmonics caused by the winding distribution and by slotting effects. Predicting these harmonics analytically with adequate accuracy can significantly speed up subsequent investigations of their effects on the rotor, in particular rotor losses. This paper proposes several analytical formulations for this purpose, covering a generic number of stator phases and differing in how slotting effects are taken into account. The proposed approaches are evaluated by comparing analytical results with finite-element analysis computations on sample machine geometries.
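As a small worked illustration of what "fractional-slot" means, the standard quantity is the number of slots per pole per phase, q = Q / (2p · m), where Q is the slot count, 2p the pole count, and m the phase count. The sketch below is not taken from the paper; the function name and the 12-slot/10-pole example are assumptions chosen only to show a case where the pole count is close to the slot count.

```python
from fractions import Fraction


def slots_per_pole_per_phase(slots, poles, phases=3):
    """Return q = Q / (2p * m) as an exact fraction.

    `slots` is Q, `poles` is the total pole count 2p, `phases` is m.
    The name and default phase count are illustrative assumptions.
    """
    return Fraction(slots, poles * phases)


# Hypothetical example: 12 slots, 10 poles, 3 phases.
q = slots_per_pole_per_phase(12, 10, 3)
print(q)      # exact value of q as a fraction
print(q < 1)  # q below 1 is typical of concentrated-coil layouts
```

A non-integer q is what makes the winding fractional-slot; the harmonic content itself depends on the winding layout and slotting, which this sketch does not model.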
Funding: Supported by the National Basic Research 973 Program of China under Grant No. 2005CB321602, the National Natural Science Foundation of China under Grant No. 60736012, and the National High Technology Research and Development 863 Program of China under Grant Nos. 2007AA01Z110 and 2009AA01Z103.
Abstract: The advent of multi-core/many-core chip technology offers both an extraordinary opportunity and a profound challenge. In particular, computer architects and system software designers have a unique opportunity to introduce new architecture features together with adequate compiler technology; combined, the two may have a profound impact. This paper presents a case study (using the 1-D Jacobi computation) of compiler-amenable performance optimization techniques on the many-core architecture Godson-T. The Godson-T architecture has several unique features that were chosen for this study: 1) chip-level globally addressable memory, in particular the scratchpad memories (SPM) local to the processing cores; 2) fine-grain memory-based synchronization (e.g., full-empty bits). Leveraging state-of-the-art performance optimization methods for 1-D stencil parallelization (e.g., timed tiling and variants), we developed and implemented a number of many-core-based optimizations for Godson-T. Our experimental study shows good performance in both execution-time speedup and scalability, validates the value of the globally accessed SPM and the fine-grain synchronization mechanism (full-empty bits) of Godson-T, and provides some useful guidelines for future compiler technology for many-core chip architectures.
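For readers unfamiliar with the kernel, a 1-D Jacobi computation repeatedly replaces each interior point with a combination of its old neighbors, reading only values from the previous sweep. The sketch below is a plain sequential version and is not the paper's Godson-T code; the three-point averaging weights, the fixed endpoints, and the function name are assumptions, and no tiling, SPM use, or full-empty-bit synchronization is shown.

```python
def jacobi_1d(values, steps):
    """Run `steps` Jacobi sweeps of a three-point averaging stencil.

    Endpoints are held fixed (an assumed boundary condition). Each sweep
    reads only the previous sweep's values, which is what distinguishes
    Jacobi from in-place (Gauss-Seidel style) updates.
    """
    current = list(values)
    for _ in range(steps):
        previous = current
        current = previous[:]  # fresh buffer: never read partially updated data
        for i in range(1, len(previous) - 1):
            current[i] = (previous[i - 1] + previous[i] + previous[i + 1]) / 3.0
    return current


print(jacobi_1d([0.0, 3.0, 0.0], 1))
```

Because every point in sweep t depends only on three points of sweep t-1, the iteration space can be partitioned across cores and across time steps, which is the property that tiling-based parallelization schemes exploit.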
Funding: Supported by the National Natural Science Foundation of China under Grant No. 60633040 and the National High Technology Research and Development 863 Program of China under Grant Nos. 2006AA01A102 and 2007AA01Z115.
Abstract: Simulation is an important method for evaluating future computer systems. Microprocessor architecture has now shifted to parallel designs, but almost all simulators remain sequential and cannot exploit the advantages of multi-core or many-core processors. This paper presents a parallel simulator engine (SimK) for the prevalent SMP/CMP platforms, aimed at large-scale, fine-grained computer system simulation. It introduces the highly efficient synchronization, communication, and buffer management policies used in SimK, and presents a novel lock-free scheduling mechanism that avoids using any atomic instructions. To deal with load fluctuation under light load, a cooperative dynamic task migration scheme is proposed. Based on SimK, we have developed the large-scale parallel simulators HppSim and HppNetSim, which simulate a full supercomputer system and its interconnection network, respectively. Results show that HppSim and HppNetSim both achieve sound speedup with multiple processors, and that the best normalized speedup reaches 14.95X on a two-way quad-core server.