7 articles found
Field-induced Néel vector bi-reorientation of a ferrimagnetic insulator in the vicinity of compensation temperature
1
Authors: 王鹏 赵辉 栾仲智 夏思宇 丰韬 周礼繁. Chinese Physics B (SCIE, EI, CAS, CSCD), 2021, No. 2, pp. 481-486, 6 pages.
The spin Hall magnetoresistance (SMR) effect in Pt/Gd₃Fe₅O₁₂ (GdIG) bilayers was systematically investigated. The sign of the SMR changes twice with increasing magnetic field in the vicinity of the magnetization compensation point (T_M) of GdIG. However, conventional SMR theory predicts an invariant SMR sign in a heterostructure composed of a heavy metal film in contact with a ferromagnetic or antiferromagnetic film. We conclude that this is because a relatively large field significantly enhances the magnetic moment of the Gd sub-lattice while leaving the moment of the Fe sub-lattice unchanged, so that a small net magnetic moment is induced at T_M. As a result, the Néel vector aligns with the field after the spin-flop transition, producing a bi-reorientation of the Néel vector. Theoretical calculations based on Néel's theory and SMR theory also support our conclusions. Our findings indicate that the Néel-vector direction of a ferrimagnet can be tuned across a wide range by a relatively low external field around T_M.
Keywords: spin Hall magnetoresistance; ferrimagnets; magnetic insulators; magnetization switching
Towards optimized tensor code generation for deep learning on sunway many-core processor
2
Authors: Mingzhen LI Changxi LIU Jianjin LIAO Xuegui ZHENG Hailong YANG Rujun SUN Jun XU Lin GAN Guangwen YANG Zhongzhi LUAN Depei QIAN. Frontiers of Computer Science (SCIE, EI, CSCD), 2024, No. 2, pp. 1-15, 15 pages.
The flourishing of deep learning frameworks and hardware platforms demands an efficient compiler that can shield the diversity in both software and hardware in order to provide application portability. Among the existing deep learning compilers, TVM is well known for its efficiency in code generation and optimization across diverse hardware devices. Meanwhile, the Sunway many-core processor is a competitive candidate because of its attractive computational power in both scientific computing and deep learning workloads. This paper combines the trends in these two directions. Specifically, we propose swTVM, which extends the original TVM to support ahead-of-time compilation for architectures requiring cross-compilation, such as Sunway. In addition, we leverage architecture features during compilation, such as the core group for massive parallelism, DMA for high-bandwidth memory transfer, and local device memory for data locality, in order to generate efficient code for deep learning workloads on Sunway. The experimental results show that the code generated by swTVM achieves a 1.79x improvement in inference latency on average compared to the state-of-the-art deep learning framework on Sunway, across eight representative benchmarks. This work is the first attempt from the compiler perspective to bridge the gap between deep learning and the Sunway processor, particularly with productivity and efficiency in mind. We believe this work will encourage more people to embrace the power of deep learning and the Sunway many-core processor.
Keywords: Sunway processor; deep learning compiler; code generation; performance optimization
swSpAMM: optimizing large-scale sparse approximate matrix multiplication on Sunway Taihulight
3
Authors: Xiaoyan LIU Yi LIU Bohong YIN Hailong YANG Zhongzhi LUAN Depei QIAN. Frontiers of Computer Science (SCIE, EI, CSCD), 2023, No. 4, pp. 29-41, 13 pages.
Although matrix multiplication plays an essential role in a wide range of applications, previous works focus only on optimizing dense or sparse matrix multiplications. Sparse Approximate Matrix Multiply (SpAMM) is an algorithm that accelerates the multiplication of decay matrices, whose sparsity lies between that of dense and sparse matrices. In addition, large-scale decay matrix multiplication is performed in scientific applications to solve cutting-edge problems. To optimize large-scale decay matrix multiplication using SpAMM on supercomputers such as Sunway Taihulight, we present swSpAMM, an optimized SpAMM algorithm that adapts the computation characteristics to the architecture features of Sunway Taihulight. Specifically, we propose both intra-node and inter-node optimizations to accelerate swSpAMM for large-scale execution. For intra-node optimizations, we explore algorithm parallelization and a block-major data layout that are tailored to better utilize the architectural advantages of the Sunway processor. For inter-node optimizations, we propose a matrix organization strategy for better distributing sub-matrices across nodes and a dynamic scheduling strategy for improving load balance across nodes. We compare swSpAMM with the existing GEMM library on a single node as well as with large-scale matrix multiplication methods on multiple nodes. The experimental results show that swSpAMM achieves speedups of up to 14.5× and 2.2× when compared to the xMath library on a single node and the 2D GEMM method on multiple nodes, respectively.
Keywords: approximate calculation; Sunway processor; performance optimization
DPHL: A DIA Pan-human Protein Mass Spectrometry Library for Robust Biomarker Discovery (cited by: 5)
4
Authors: Tiansheng Zhu Yi Zhu Yue Xuan Huanhuan Gao Xue Cai Sander R.Piersma Thang V.Pham Tim Schelfhorst Richard R.G.D.Haas Irene V.Bijnsdorp Rui Sun Liang Yue Guan Ruan Qiushi Zhang Mo Hu Yue Zhou Winan J.Van Houdt Tessa Y.S.Le Large Jacqueline Cloos Anna Wojtuszkiewicz Danijela Koppers-Lalic Franziska Bottger Chantal Scheepbouwer Ruud H.Brakenhoff Geert J.L.H.van Leenders Jan N.M.Ijzermans John W.M.Martens Renske D.M.Steenbergen Nicole C.Grieken Sathiyamoorthy Selvarajan Sangeeta Mantoo Sze S.Lee Serene J.Y.Yeow Syed M.F.Alkaff Nan Xiang Yaoting Sun Xiao Yi Shaozheng Dai Wei Liu Tian Lu Zhicheng Wu Xiao Liang Man Wang Yingkuan Shao Xi Zheng Kailun Xu Qin Yang Yifan Meng Cong Lu Jiang Zhu Jin'e Zheng Bo Wang Sai Lou Yibei Dai Chao Xu Chenhuan Yu Huazhong Ying Tony K.Lim Jianmin Wu Xiaofei Gao Zhongzhi Luan Xiaodong Teng Peng Wu Shi'ang Huang Zhihua Tao Narayanan G.Iyer Shuigeng Zhou Wenguang Shao Henry Lam Ding Ma Jiafu Ji Oi L.Kon Shu Zheng Ruedi Aebersold Connie R.Jimenez Tiannan Guo. Genomics, Proteomics & Bioinformatics (SCIE, CAS, CSCD), 2020, No. 2, pp. 104-119, 16 pages.
To address the increasing need for detecting and validating protein biomarkers in clinical specimens, mass spectrometry (MS)-based targeted proteomic techniques, including selected reaction monitoring (SRM), parallel reaction monitoring (PRM), and massively parallel data-independent acquisition (DIA), have been developed. For optimal performance, they require the fragment ion spectra of targeted peptides as prior knowledge. In this report, we describe an MS pipeline and spectral resource to support targeted proteomics studies of human tissue samples. To build the spectral resource, we integrated common open-source MS computational tools to assemble a freely accessible computational workflow based on Docker. We then applied the workflow to generate DPHL, a comprehensive DIA pan-human library, from 1096 data-dependent acquisition (DDA) MS raw files for 16 types of cancer samples. This extensive spectral resource was then applied to a proteomic study of 17 prostate cancer (PCa) patients. Thereafter, PRM validation was applied to a larger study of 57 PCa patients, and the differential expression of three proteins in prostate tumor was validated. As a second application, the DPHL spectral resource was applied to a study consisting of plasma samples from 19 diffuse large B cell lymphoma (DLBCL) patients and 18 healthy control subjects. Differentially expressed proteins between DLBCL patients and healthy control subjects were detected by DIA-MS and confirmed by PRM. These data demonstrate that DPHL supports DIA and PRM MS pipelines for robust protein biomarker discovery. DPHL is freely accessible at https://www.iprox.org/page/project.html?id=IPX0001400000.
Keywords: Data-independent acquisition; Parallel reaction monitoring; Spectral library; Prostate cancer; Diffuse large B cell lymphoma
Coordinating workload balancing and power switching in renewable energy powered data center (cited by: 1)
5
Authors: Xian LI Rui WANG Zhongzhi LUAN Yi LIU Depei QIAN. Frontiers of Computer Science (SCIE, EI, CSCD), 2016, No. 3, pp. 574-587, 14 pages.
There has been growing concern about the energy consumption and environmental impact of datacenters. Some pioneers have begun to power datacenters with renewable energy to offset their carbon footprint. However, it is challenging to integrate intermittent renewable energy into a datacenter power system. Grid-tied systems are widely deployed in renewable-energy-powered datacenters, but the drawbacks of the grid-tie inverter (e.g., harmonic disturbance and high cost) hamper this design. Besides, the mixture of green load and brown load makes power management depend heavily on software measurement and monitoring, which often suffer from inaccuracy. We propose DualPower, a novel power provisioning architecture that enables green datacenters to integrate renewable power supply without grid-tie inverters. To optimize DualPower operation, we propose a specially designed power management framework to coordinate workload balancing with power supply switching. We evaluate three optimization schemes (LM, PS and JO) under different datacenter operation scenarios on our trace-driven simulation platform. The experimental results show that DualPower can be as efficient as a grid-tied system and has good scalability. In contrast to previous works, DualPower integrates renewable power at lower cost and maintains full availability of datacenter servers.
Keywords: renewable energy; green computing; power provisioning; power management
A novel index system describing program runtime characteristics for workload consolidation
6
Authors: Lin WANG Depei QIAN Rui WANG Zhongzhi LUAN Hailong YANG Huaxiang ZHANG. Frontiers of Computer Science (SCIE, EI, CSCD), 2019, No. 3, pp. 489-499, 11 pages.
Workload consolidation is a common method to improve resource utilization in clusters or data centers. In order to achieve efficient workload consolidation, the runtime characteristics of a program should be taken into consideration in scheduling. In this paper, we propose a novel index system for efficiently describing program runtime characteristics. With the help of this index system, programs can be classified by the following runtime characteristics: 1) dependence on multi-dimensional resources including CPU, disk I/O, memory and network I/O; and 2) impact on and vulnerability to resource sharing, embodied by resource usage and resource sensitivity. In order to verify the effectiveness of this index system in workload consolidation, we propose Sche-index, a scheduling strategy that uses the new index system for workload consolidation. Experimental results show that, compared with the traditional least-loaded scheduling strategy, Sche-index can significantly improve both program performance and system resource utilization.
Keywords: index system; runtime characteristics; workload consolidation; cluster scheduling
Accelerating the cryo-EM structure determination in RELION on GPU cluster
7
Authors: Xin YOU Hailong YANG Zhongzhi LUAN Depei QIAN. Frontiers of Computer Science (SCIE, EI, CSCD), 2022, No. 3, pp. 21-39, 19 pages.
Cryo-electron microscopy (cryo-EM) is one of the most powerful technologies available today for structural biology. RELION (Regularized Likelihood Optimization), one of the most widely used software packages in this field, implements a Bayesian algorithm for cryo-EM structure determination. Many researchers have devoted effort to improving the performance of RELION to keep up with the ever-increasing volume of datasets. In this paper, we focus on performance analysis of the most time-consuming computation steps in RELION and identify their performance bottlenecks for specific optimizations. We propose several performance optimization strategies to improve the overall performance of RELION, including optimization of the expectation step, parallelization of the maximization step, acceleration of the symmetry computation, and memory affinity optimization. The experimental results show that our proposed optimizations achieve significant speedups of RELION across representative datasets. In addition, we perform roofline model analysis to understand the effectiveness of our optimizations.
Keywords: cryo-EM structure determination; performance optimization; GPU acceleration; RELION