Funding: Supported by the National Major Science and Technology Projects of China (2021ZD0109902 and 2020AA0105500), the National Natural Science Foundation of China (62275139 and 62088102), and the Tsinghua University Initiative Scientific Research Program.
Abstract: Electroencephalography (EEG) analysis extracts critical information from brain signals, enabling brain disease diagnosis and providing fundamental support for brain–computer interfaces. However, performing an artificial intelligence analysis of EEG signals with high energy efficiency poses significant challenges for electronic processors on edge computing devices, especially with large neural network models. Herein, we propose an EEG opto-processor based on diffractive photonic computing units (DPUs) to process extracranial and intracranial EEG signals effectively and to detect epileptic seizures. The signals of the EEG channels within a second-time window are optically encoded as inputs to the constructed diffractive neural networks for classification, which monitor the brain state to identify symptoms of an epileptic seizure. We developed both free-space and integrated DPUs as edge computing systems and demonstrated their applications for real-time epileptic seizure detection using benchmark datasets, that is, the Children’s Hospital Boston (CHB)–Massachusetts Institute of Technology (MIT) extracranial and Epilepsy-iEEG-Multicenter intracranial EEG datasets, with excellent computing performance results. Along with the channel selection mechanism, both numerical evaluations and experimental results validated the sufficiently high classification accuracies of the proposed opto-processors for supervising clinical diagnosis. Our study opens a new research direction for utilizing photonic computing techniques to process large-scale EEG signals and promote broader applications.
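The windowing step mentioned in this abstract (cutting multichannel EEG into fixed-length time windows before classification) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the window length, sampling rate, or any overlap, so `window_seconds`, `sampling_rate_hz`, and the function name `segment_windows` are hypothetical, and the optical encoding itself is not modelled.

```python
def segment_windows(eeg, sampling_rate_hz, window_seconds=1.0):
    """Split multichannel EEG into non-overlapping time windows.

    eeg is a list of channels, each a list of samples of equal length.
    Returns a list of windows; each window is a list of channels, and each
    channel holds samples_per_window samples. A trailing partial window is
    dropped. The one-second default is an assumption, not a value taken
    from the abstract.
    """
    samples_per_window = int(round(sampling_rate_hz * window_seconds))
    n_samples = len(eeg[0])
    n_windows = n_samples // samples_per_window
    windows = []
    for w in range(n_windows):
        start = w * samples_per_window
        stop = start + samples_per_window
        # One window keeps every channel, restricted to the same time span.
        windows.append([channel[start:stop] for channel in eeg])
    return windows
```

Each returned window would then be the unit handed to a classifier; how the cited work maps a window onto the diffractive network input is not described in the abstract.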
Abstract: A non-photorealistic rendering technique is a method for producing effects that differ from those of realistic image generation. Among these techniques, flow-based image abstraction displays shape and color features well and performs a stylistic visual abstraction. However, real-time rendering is not possible on a CPU because the method applies various filtering and iteration steps. In this paper, we present real-time processing methods for video abstraction using the Open Computing Language (OpenCL), a technique for general-purpose computing on graphics processing units (GPGPU). With GPU acceleration, video abstraction is shown to run at 16 frames per second (FPS) or greater.
Funding: Supported by a Grant-in-Aid for Scientific Research on Innovative Areas "Molecular Robotics" (No. 24104004) of the Ministry of Education, Culture, Sports, Science, and Technology, Japan.
Abstract: A microtubule gliding assay is a biological experiment that observes the dynamics of microtubules driven by motor proteins fixed on a glass surface. When appropriate microtubule interactions are set up in gliding assay experiments, microtubules often organize and create higher-level dynamics such as ring and bundle structures. In order to reproduce such higher-level dynamics on computers, we have been focusing on building a real-time 3D microtubule simulation. This simulation enables us to gain more knowledge of microtubule dynamics and their swarm movements by adjusting simulation parameters in real time. One of the technical challenges in creating a real-time 3D simulation is balancing the 3D rendering against the computing performance. Graphics processing unit (GPU) programming plays an essential role in balancing the millions of tasks and makes this real-time 3D simulation possible. By using general-purpose computing on graphics processing units (GPGPU) programming, we are able to run the simulation in a massively parallel fashion, even when dealing with more complex interactions between microtubules such as overriding and snuggling. Because performance is an important factor, a performance model has also been constructed from the analysis of the microtubule simulation, and it is consistent with the performance measurements on different GPGPU architectures with regard to the number of cores and clock cycles.
Funding: Supported by the decision support project of response to climate change of China, the National Natural Science Foundation of China (Nos. 41674085, 41604009, and 41621091), the Natural Science Foundation of Qinghai Province (No. 2019-ZJ-7034), and the Open Project of State Key Laboratory of Plateau Ecology and Agriculture, Qinghai University (No. 2020-zz-03).
Abstract: A moisture advection scheme is an essential module of a numerical weather/climate model representing the horizontal transport of water vapor. The Piecewise Rational Method (PRM) scalar advection scheme in the Global/Regional Assimilation and Prediction System (GRAPES) solves the moisture flux advection equation based on PRM. Computing the scalar advection involves boundary exchange and has high bandwidth requirements, which makes it complicated and time-consuming in GRAPES. Recently, Graphics Processing Units (GPUs) have been widely used to solve scientific and engineering computing problems owing to advancements in GPU hardware and related programming models such as CUDA/OpenCL and Open Accelerator (OpenACC). Herein, we present an accelerated PRM scalar advection scheme with Message Passing Interface (MPI) and OpenACC to fully exploit GPUs’ power over a cluster with multiple Central Processing Units (CPUs) and GPUs, together with optimizations such as minimizing data transfer, memory coalescing, exposing more parallelism, and overlapping computation with data transfers. Results show that a speedup of about 3.5 times is obtained for the entire model running at medium resolution with double precision when comparing the scheme’s elapsed time on a node with two GPUs (NVIDIA P100) and two 16-core CPUs (Intel Gold 6142). Further, results obtained from experiments with a higher-resolution model on multiple GPUs show excellent scalability.
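The boundary exchange that this abstract identifies as a cost of parallel advection can be illustrated with a small single-process sketch. It deliberately swaps in a first-order upwind scheme on a periodic 1D grid in place of PRM, and plain Python lists in place of MPI ranks and OpenACC kernels; the function names and the Courant number `c` are hypothetical. The point shown is only that each subdomain needs one ghost value from its neighbour before it can update, and that the decomposed update then matches the single-domain one.

```python
def upwind_step(q, ghost, c):
    """Advance q by one upwind step: q_i <- q_i - c * (q_i - q_{i-1}).

    ghost is the value immediately to the left of q[0]; c is the Courant
    number, assumed to lie in (0, 1] for stability.
    """
    left = [ghost] + q[:-1]
    return [qi - c * (qi - li) for qi, li in zip(q, left)]


def step_single_domain(q, c):
    """Reference update on the whole periodic domain."""
    return upwind_step(q, q[-1], c)


def step_decomposed(parts, c):
    """Update each subdomain after a ghost-cell (boundary) exchange.

    parts is a list of subdomains ordered left to right. Subdomain i
    receives the last cell of subdomain i - 1; index -1 wraps around,
    which gives the periodic boundary. In an MPI code this list
    comprehension would be a send/receive between neighbouring ranks.
    """
    ghosts = [parts[i - 1][-1] for i in range(len(parts))]
    return [upwind_step(p, g, c) for p, g in zip(parts, ghosts)]
```

Because the exchange must complete before the cells next to a boundary can be updated, interior cells are the natural work to overlap with the transfer, which is the kind of optimization the abstract lists.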
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 10772040, 50921001, and 50909016). Financial support from the Important National Science & Technology Specific Projects of China (Grant No. 2008ZX05026-02) is also appreciated.
Abstract: The Moving Particle Semi-implicit (MPS) method performs well in simulating violent free-surface flow and has therefore become popular in the area of fluid flow simulation. However, searching for neighbouring particles and solving the large sparse matrix equations (Poisson-type equation) are very time-consuming. In order to utilize the tremendous parallel computing power of Graphics Processing Units (GPUs), this study has developed a GPU-based MPS model employing the Compute Unified Device Architecture (CUDA) on an NVIDIA GTX 280. Efficient neighbouring-particle searching is done through an indirect method, and the Poisson-type pressure equation is solved by the Bi-Conjugate Gradient (BiCG) method. Four different optimization levels for the present general parallel GPU-based MPS model are demonstrated. In addition, the elaborate optimization of the GPU code is discussed. A benchmark problem of dam-breaking flow is simulated using both the present GPU-based MPS code and the original CPU-based MPS code. The comparison shows that the GPU-based MPS model outperforms the traditional CPU model by a factor of 26.
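As a rough illustration of the Bi-Conjugate Gradient iteration named in this abstract, the sketch below implements the textbook unpreconditioned BiCG recurrence in plain Python on a dense matrix. It is not the authors' CUDA implementation: the helper names are hypothetical, the matrix storage is dense where a real pressure solver would use a sparse format, and the 1D Poisson matrix from `poisson_1d` is only a stand-in for an MPS pressure system.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(A, v):
    return [dot(row, v) for row in A]


def bicg(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b with unpreconditioned BiCG, starting from x = 0."""
    At = [list(col) for col in zip(*A)]  # transpose, used by the shadow system
    x = [0.0] * len(b)
    r = list(b)          # residual b - A x for the zero initial guess
    r_hat = list(r)      # shadow residual, conventionally set to r
    p, p_hat = list(r), list(r_hat)
    rho = dot(r_hat, r)
    b_norm = dot(b, b) ** 0.5
    for _ in range(max_iter):
        # Stop on a small relative residual; also avoids dividing by rho = 0.
        if dot(r, r) ** 0.5 <= tol * b_norm:
            break
        Ap = matvec(A, p)
        At_p_hat = matvec(At, p_hat)
        alpha = rho / dot(p_hat, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * v for ri, v in zip(r, Ap)]
        r_hat = [ri - alpha * v for ri, v in zip(r_hat, At_p_hat)]
        rho_new = dot(r_hat, r)
        beta = rho_new / rho
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        p_hat = [ri + beta * pi for ri, pi in zip(r_hat, p_hat)]
        rho = rho_new
    return x


def poisson_1d(n):
    """Dense tridiagonal (-1, 2, -1) matrix, a stand-in Poisson-type system."""
    return [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0
             for j in range(n)] for i in range(n)]
```

Each iteration is dominated by two matrix-vector products and a few dot products, which is the kind of work that maps naturally onto a GPU. For a symmetric matrix such as this stand-in, the recurrence coincides with ordinary conjugate gradients; no breakdown safeguards are included.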