By pushing computation,cache,and network control to the edge,mobile edge computing(MEC)is expected to play a leading role in fifth generation(5G)and future sixth generation(6G).Nevertheless,facing ubiquitous fast-grow...By pushing computation,cache,and network control to the edge,mobile edge computing(MEC)is expected to play a leading role in fifth generation(5G)and future sixth generation(6G).Nevertheless,facing ubiquitous fast-growing computational demands,it is impossible for a single MEC paradigm to effectively support high-quality intelligent services at end user equipments(UEs).To address this issue,we propose an air-ground collaborative MEC(AGCMEC)architecture in this article.The proposed AGCMEC integrates all potentially available MEC servers within air and ground in the envisioned 6G,by a variety of collaborative ways to provide computation services at their best for UEs.Firstly,we introduce the AGC-MEC architecture and elaborate three typical use cases.Then,we discuss four main challenges in the AGC-MEC as well as their potential solutions.Next,we conduct a case study of collaborative service placement for AGC-MEC to validate the effectiveness of the proposed collaborative service placement strategy.Finally,we highlight several potential research directions of the AGC-MEC.展开更多
The conventional computing architecture faces substantial chal-lenges,including high latency and energy consumption between memory and processing units.In response,in-memory computing has emerged as a promising altern...The conventional computing architecture faces substantial chal-lenges,including high latency and energy consumption between memory and processing units.In response,in-memory computing has emerged as a promising alternative architecture,enabling computing operations within memory arrays to overcome these limitations.Memristive devices have gained significant attention as key components for in-memory computing due to their high-density arrays,rapid response times,and ability to emulate biological synapses.Among these devices,two-dimensional(2D)material-based memristor and memtransistor arrays have emerged as particularly promising candidates for next-generation in-memory computing,thanks to their exceptional performance driven by the unique properties of 2D materials,such as layered structures,mechanical flexibility,and the capability to form heterojunctions.This review delves into the state-of-the-art research on 2D material-based memristive arrays,encompassing critical aspects such as material selection,device perfor-mance metrics,array structures,and potential applications.Furthermore,it provides a comprehensive overview of the current challenges and limitations associated with these arrays,along with potential solutions.The primary objective of this review is to serve as a significant milestone in realizing next-generation in-memory computing utilizing 2D materials and bridge the gap from single-device characterization to array-level and system-level implementations of neuromorphic computing,leveraging the potential of 2D material-based memristive devices.展开更多
As a computing paradigm that combines temporal and spatial computations,dynamic reconfigurable computing provides superiorities of flexibility,energy efficiency and area efficiency,attracting interest from both academ...As a computing paradigm that combines temporal and spatial computations,dynamic reconfigurable computing provides superiorities of flexibility,energy efficiency and area efficiency,attracting interest from both academia and industry.However,dynamic reconfigurable computing is not yet mature because of several unsolved problems.This work introduces the concept,architecture,and compilation techniques of dynamic reconfigurable computing.It also discusses the existing major challenges and points out its potential applications.展开更多
The flexibility of traditional image processing system is limited because those system are designed for specific applications. In this paper, a new TMS320C64x-based multi-DSP parallel computing architecture is present...The flexibility of traditional image processing system is limited because those system are designed for specific applications. In this paper, a new TMS320C64x-based multi-DSP parallel computing architecture is presented. It has many promising characteristics such as powerful computing capability, broad I/O bandwidth, topology flexibility, and expansibility. The parallel system performance is evaluated by practical experiment.展开更多
Security is a key problem for the development of Cloud Computing. A common service security architecture is a basic abstract to support security research work. The authorization ability in the service security faces m...Security is a key problem for the development of Cloud Computing. A common service security architecture is a basic abstract to support security research work. The authorization ability in the service security faces more complex and variable users and environment. Based on the multidimensional views, the service security architecture is described on three dimensions of service security requirement integrating security attributes and service layers. An attribute-based dynamic access control model is presented to detail the relationships among subjects, objects, roles, attributes, context and extra factors further. The model uses dynamic control policies to support the multiple roles and flexible authority. At last, access control and policies execution mechanism were studied as the implementation suggestion.展开更多
With the introduction of software defined hardware by DARPA Electronics Resurgence Initiative,software definition will be the basic attribute of information system.Benefiting from boundary certainty and algorithm aggr...With the introduction of software defined hardware by DARPA Electronics Resurgence Initiative,software definition will be the basic attribute of information system.Benefiting from boundary certainty and algorithm aggregation of domain applications,domain-oriented computing architecture has become the technical direction that considers the high flexibility and efficiency of information system.Aiming at the characteristics of data-intensive computing in different scenarios such as Internet of Things(IoT),big data,artificial intelligence(AI),this paper presents a domain-oriented software defined computing architecture,discusses the hierarchical interconnection structure,hybrid granularity computing element and its computational kernel extraction method,finally proves the flexibility and high efficiency of this architecture by experimental comparison.展开更多
Mobile Edge Computing(MEC)assists clouds to handle enormous tasks from mobile devices in close proximity.The edge servers are not allocated efficiently according to the dynamic nature of the network.It leads to process...Mobile Edge Computing(MEC)assists clouds to handle enormous tasks from mobile devices in close proximity.The edge servers are not allocated efficiently according to the dynamic nature of the network.It leads to processing delay,and the tasks are dropped due to time limitations.The researchersfind it difficult and complex to determine the offloading decision because of uncertain load dynamic condition over the edge nodes.The challenge relies on the offload-ing decision on selection of edge nodes for offloading in a centralized manner.This study focuses on minimizing task-processing time while simultaneously increasing the success rate of service provided by edge servers.Initially,a task-offloading problem needs to be formulated based on the communication and pro-cessing.Then offloading decision problem is solved by deep analysis on taskflow in the network and feedback from the devices on edge services.The significance of the model is improved with the modelling of Deep Mobile-X architecture and bi-directional Long Short Term Memory(b-LSTM).The simulation is done in the Edgecloudsim environment,and the outcomes show the significance of the proposed idea.The processing time of the anticipated model is 6.6 s.The following perfor-mance metrics,improved server utilization,the ratio of the dropped task,and number of offloading tasks are evaluated and compared with existing learning approaches.The proposed model shows a better trade-off compared to existing approaches.展开更多
In the last years, architectural practice has been confronted with a paradigm shift towards the application of digital methods in design activities. In this regard, it is a pedagogic challenge to provide a suitable co...In the last years, architectural practice has been confronted with a paradigm shift towards the application of digital methods in design activities. In this regard, it is a pedagogic challenge to provide a suitable computational background for architectural students, to improve their ability to apply algorithmic-parametric logic, as well as fabrication and prototyping resources to design problem solving. This challenge is even stronger when considering less favored social and technological contexts, such as in Brazil, for example. In this scenario, this article presents and discusses the procedures and the results from a didactic experience carried out in a design computing-oriented discipline, inserted in the curriculum of a Brazilian architecture course. Hence, this paper shares some design computing teaching experiences and presents some results on computational methods and creative approaches, with a view to contribute to a better understanding about the relations between logical thinking, mathematics and architectural design processes.展开更多
Cloud Computing has become one of the popular buzzwords in the IT area after Web2.0. This is not a new technology, but the concept that binds different existed technologies altogether including Grid Computing, Utility...Cloud Computing has become one of the popular buzzwords in the IT area after Web2.0. This is not a new technology, but the concept that binds different existed technologies altogether including Grid Computing, Utility Computing, distributed system, virtualization and other mature technique. Business Process Management (BPM) is designed for business management using IT infrastructure to focus on process modeling, monitor and management. BPM is composed of business process, business information and IT resources, which help to build a real-time intelligent system, based on business management and IT technologies. This paper describes theory on Cloud Computing and proposes a BPM implement on Cloud environments.展开更多
Lightweight ubiquitous computing security architecture was presented. Lots of our recent researches have been integrated in this architecture. And the main current researches in the related area have also been absorbe...Lightweight ubiquitous computing security architecture was presented. Lots of our recent researches have been integrated in this architecture. And the main current researches in the related area have also been absorbed. The main attention of this paper was providing a compact and realizable method to apply ubiquitous computing into our daily lives under sufficient secure guarantee. At last,the personal intelligent assistant system was presented to show that this architecture was a suitable and realizable security mechanism in solving the ubiquitous computing problems.展开更多
To solve the lag problem of the traditional storage technology in mass data storage and management,the application platform is designed and built for big data on Hadoop and data warehouse integration platform,which en...To solve the lag problem of the traditional storage technology in mass data storage and management,the application platform is designed and built for big data on Hadoop and data warehouse integration platform,which ensured the convenience for the management and usage of data.In order to break through the master node system bottlenecks,a storage system with better performance is designed through introduction of cloud computing technology,which adopts the design of master-slave distribution patterns by the network access according to the recent principle.Thus the burden of single access the master node is reduced.Also file block update strategy and fault recovery mechanism are provided to solve the management bottleneck problem of traditional storage system on the data update and fault recovery and offer feasible technical solutions to storage management for big data.展开更多
Most of the neural network architectures are based on human experience,which requires a long and tedious trial-and-error process.Neural architecture search(NAS)attempts to detect effective architectures without human ...Most of the neural network architectures are based on human experience,which requires a long and tedious trial-and-error process.Neural architecture search(NAS)attempts to detect effective architectures without human intervention.Evolutionary algorithms(EAs)for NAS can find better solutions than human-designed architectures by exploring a large search space for possible architectures.Using multiobjective EAs for NAS,optimal neural architectures that meet various performance criteria can be explored and discovered efficiently.Furthermore,hardware-accelerated NAS methods can improve the efficiency of the NAS.While existing reviews have mainly focused on different strategies to complete NAS,a few studies have explored the use of EAs for NAS.In this paper,we summarize and explore the use of EAs for NAS,as well as large-scale multiobjective optimization strategies and hardware-accelerated NAS methods.NAS performs well in healthcare applications,such as medical image analysis,classification of disease diagnosis,and health monitoring.EAs for NAS can automate the search process and optimize multiple objectives simultaneously in a given healthcare task.Deep neural network has been successfully used in healthcare,but it lacks interpretability.Medical data is highly sensitive,and privacy leaks are frequently reported in the healthcare industry.To solve these problems,in healthcare,we propose an interpretable neuroevolution framework based on federated learning to address search efficiency and privacy protection.Moreover,we also point out future research directions for evolutionary NAS.Overall,for researchers who want to use EAs to optimize NNs in healthcare,we analyze the advantages and disadvantages of doing so to provide detailed guidance,and propose an interpretable privacy-preserving framework for healthcare applications.展开更多
Neuromorphic computing,inspired by the human brain,uses memristor devices for complex tasks.Recent studies show that self-organizing random nanowires can implement neuromorphic information processing,enabling data ana...Neuromorphic computing,inspired by the human brain,uses memristor devices for complex tasks.Recent studies show that self-organizing random nanowires can implement neuromorphic information processing,enabling data analysis.This paper presents a model based on these nanowire networks,with an improved conductance variation profile.We suggest using these networks for temporal information processing via a reservoir computing scheme and propose an efficient data encoding method using voltage pulses.The nanowire network layer generates dynamic behaviors for pulse voltages,allowing time series prediction analysis.Our experiment uses a double stochastic nanowire network architecture for processing multiple input signals,outperforming traditional reservoir computing in terms of fewer nodes,enriched dynamics and improved prediction accuracy.Experimental results confirm the high accuracy of this architecture on multiple real-time series datasets,making neuromorphic nanowire networks promising for physical implementation of reservoir computing.展开更多
We develop universal quantum computing models that form a family of quantum von Neumann architectures,with modular units of memory,control,CPU,and internet,besides input and output.This family contains three generatio...We develop universal quantum computing models that form a family of quantum von Neumann architectures,with modular units of memory,control,CPU,and internet,besides input and output.This family contains three generations characterized by dynamical quantum resource theory,and it also circumvents no-go theorems on quantum programming and control.Besides universality,such a family satisfies other desirable engineering requirements on system and algorithm design,such as modularity and programmability,hence serves as a unique approach to building universal quantum computers.展开更多
Artificial intelligence(AI)processes data-centric applications with minimal effort.However,it poses new challenges to system design in terms of computational speed and energy efficiency.The traditional von Neumann arc...Artificial intelligence(AI)processes data-centric applications with minimal effort.However,it poses new challenges to system design in terms of computational speed and energy efficiency.The traditional von Neumann architecture cannot meet the requirements of heavily datacentric applications due to the separation of computation and storage.The emergence of computing inmemory(CIM)is significant in circumventing the von Neumann bottleneck.A commercialized memory architecture,static random-access memory(SRAM),is fast and robust,consumes less power,and is compatible with state-of-the-art technology.This study investigates the research progress of SRAM-based CIM technology in three levels:circuit,function,and application.It also outlines the problems,challenges,and prospects of SRAM-based CIM macros.展开更多
The“memory wall”of traditional von Neumann computing systems severely restricts the efficiency of data-intensive task execution,while in-memory computing(IMC)architecture is a promising approach to breaking the bott...The“memory wall”of traditional von Neumann computing systems severely restricts the efficiency of data-intensive task execution,while in-memory computing(IMC)architecture is a promising approach to breaking the bottleneck.Although variations and instability in ultra-scaled memory cells seriously degrade the calculation accuracy in IMC architectures,stochastic computing(SC)can compensate for these shortcomings due to its low sensitivity to cell disturbances.Furthermore,massive parallel computing can be processed to improve the speed and efficiency of the system.In this paper,by designing logic functions in NOR flash arrays,SC in IMC for the image edge detection is realized,demonstrating ultra-low computational complexity and power consumption(25.5 fJ/pixel at 2-bit sequence length).More impressively,the noise immunity is 6 times higher than that of the traditional binary method,showing good tolerances to cell variation and reliability degradation when implementing massive parallel computation in the array.展开更多
Facing the computing demands of Internet of things(IoT)and artificial intelligence(AI),the cost induced by moving the data between the central processing unit(CPU)and memory is the key problem and a chip featured with...Facing the computing demands of Internet of things(IoT)and artificial intelligence(AI),the cost induced by moving the data between the central processing unit(CPU)and memory is the key problem and a chip featured with flexible structural unit,ultra-low power consumption,and huge parallelism will be needed.In-memory computing,a non-von Neumann architecture fusing memory units and computing units,can eliminate the data transfer time and energy consumption while performing massive parallel computations.Prototype in-memory computing schemes modified from different memory technologies have shown orders of magnitude improvement in computing efficiency,making it be regarded as the ultimate computing paradigm.Here we review the state-of-the-art memory device technologies potential for in-memory computing,summarize their versatile applications in neural network,stochastic generation,and hybrid precision digital computing,with promising solutions for unprecedented computing tasks,and also discuss the challenges of stability and integration for general in-memory computing.展开更多
Graphics Processing Units(GPUs)are used to accelerate computing-intensive tasks,such as neural networks,data analysis,high-performance computing,etc.In the past decade or so,researchers have done a lot of work on GPU ...Graphics Processing Units(GPUs)are used to accelerate computing-intensive tasks,such as neural networks,data analysis,high-performance computing,etc.In the past decade or so,researchers have done a lot of work on GPU architecture and proposed a variety of theories and methods to study the microarchitectural characteristics of various GPUs.In this study,the GPU serves as a co-processor and works together with the CPU in an embedded real-time system to handle computationally intensive tasks.It models the architecture of the GPU and further considers it based on some excellent work.The SIMT mechanism and Cache-miss situation provide a more detailed analysis of the GPU architecture.In order to verify the GPU architecture model proposed in this article,10 GPU kernel_task and an Nvidia GPU device were used to perform experiments.The experimental results showed that the minimum error between the kernel task execution time predicted by the GPU architecture model proposed in this article and the actual measured kernel task execution time was 3.80%,and the maximum error was 8.30%.展开更多
Personal desktop platform with teraflops peak performance of thousands of cores is realized at the price of conventional workstations using the programmable graphics processing units(GPUs).A GPU-based parallel Euler/N...Personal desktop platform with teraflops peak performance of thousands of cores is realized at the price of conventional workstations using the programmable graphics processing units(GPUs).A GPU-based parallel Euler/Navier-Stokes solver is developed for 2-D compressible flows by using NVIDIA′s Compute Unified Device Architecture(CUDA)programming model in CUDA Fortran programming language.The techniques of implementation of CUDA kernels,double-layered thread hierarchy and variety memory hierarchy are presented to form the GPU-based algorithm of Euler/Navier-Stokes equations.The resulting parallel solver is validated by a set of typical test flow cases.The numerical results show that dozens of times speedup relative to a serial CPU implementation can be achieved using a single GPU desktop platform,which demonstrates that a GPU desktop can serve as a costeffective parallel computing platform to accelerate computational fluid dynamics(CFD)simulations substantially.展开更多
基金supported in part by the National Natural Science Foundation of China under Grant 62171465,62072303,62272223,U22A2031。
文摘By pushing computation,cache,and network control to the edge,mobile edge computing(MEC)is expected to play a leading role in fifth generation(5G)and future sixth generation(6G).Nevertheless,facing ubiquitous fast-growing computational demands,it is impossible for a single MEC paradigm to effectively support high-quality intelligent services at end user equipments(UEs).To address this issue,we propose an air-ground collaborative MEC(AGCMEC)architecture in this article.The proposed AGCMEC integrates all potentially available MEC servers within air and ground in the envisioned 6G,by a variety of collaborative ways to provide computation services at their best for UEs.Firstly,we introduce the AGC-MEC architecture and elaborate three typical use cases.Then,we discuss four main challenges in the AGC-MEC as well as their potential solutions.Next,we conduct a case study of collaborative service placement for AGC-MEC to validate the effectiveness of the proposed collaborative service placement strategy.Finally,we highlight several potential research directions of the AGC-MEC.
基金This work was supported by the National Research Foundation,Singapore under Award No.NRF-CRP24-2020-0002.
文摘The conventional computing architecture faces substantial chal-lenges,including high latency and energy consumption between memory and processing units.In response,in-memory computing has emerged as a promising alternative architecture,enabling computing operations within memory arrays to overcome these limitations.Memristive devices have gained significant attention as key components for in-memory computing due to their high-density arrays,rapid response times,and ability to emulate biological synapses.Among these devices,two-dimensional(2D)material-based memristor and memtransistor arrays have emerged as particularly promising candidates for next-generation in-memory computing,thanks to their exceptional performance driven by the unique properties of 2D materials,such as layered structures,mechanical flexibility,and the capability to form heterojunctions.This review delves into the state-of-the-art research on 2D material-based memristive arrays,encompassing critical aspects such as material selection,device perfor-mance metrics,array structures,and potential applications.Furthermore,it provides a comprehensive overview of the current challenges and limitations associated with these arrays,along with potential solutions.The primary objective of this review is to serve as a significant milestone in realizing next-generation in-memory computing utilizing 2D materials and bridge the gap from single-device characterization to array-level and system-level implementations of neuromorphic computing,leveraging the potential of 2D material-based memristive devices.
基金supported in part by the National Science and Technology Major Project of the Ministry of Science and Technology of China (Grant No. 2018ZX01028201)in part by the National Natural Science Foundation of China (Grant No. 61672317, No. 61834002)in part by the National Key R&D Program of China (Grant No. 2018YFB2202101)
文摘As a computing paradigm that combines temporal and spatial computations,dynamic reconfigurable computing provides superiorities of flexibility,energy efficiency and area efficiency,attracting interest from both academia and industry.However,dynamic reconfigurable computing is not yet mature because of several unsolved problems.This work introduces the concept,architecture,and compilation techniques of dynamic reconfigurable computing.It also discusses the existing major challenges and points out its potential applications.
基金This project was supported by the National Natural Science Foundation of China (60135020).
文摘The flexibility of traditional image processing system is limited because those system are designed for specific applications. In this paper, a new TMS320C64x-based multi-DSP parallel computing architecture is presented. It has many promising characteristics such as powerful computing capability, broad I/O bandwidth, topology flexibility, and expansibility. The parallel system performance is evaluated by practical experiment.
基金supported by National Information Security Program under Grant No.2009A112
文摘Security is a key problem for the development of Cloud Computing. A common service security architecture is a basic abstract to support security research work. The authorization ability in the service security faces more complex and variable users and environment. Based on the multidimensional views, the service security architecture is described on three dimensions of service security requirement integrating security attributes and service layers. An attribute-based dynamic access control model is presented to detail the relationships among subjects, objects, roles, attributes, context and extra factors further. The model uses dynamic control policies to support the multiple roles and flexible authority. At last, access control and policies execution mechanism were studied as the implementation suggestion.
基金supported by National Science and Technology Major Project granted No.2016ZX01012101
文摘With the introduction of software defined hardware by DARPA Electronics Resurgence Initiative,software definition will be the basic attribute of information system.Benefiting from boundary certainty and algorithm aggregation of domain applications,domain-oriented computing architecture has become the technical direction that considers the high flexibility and efficiency of information system.Aiming at the characteristics of data-intensive computing in different scenarios such as Internet of Things(IoT),big data,artificial intelligence(AI),this paper presents a domain-oriented software defined computing architecture,discusses the hierarchical interconnection structure,hybrid granularity computing element and its computational kernel extraction method,finally proves the flexibility and high efficiency of this architecture by experimental comparison.
文摘Mobile Edge Computing(MEC)assists clouds to handle enormous tasks from mobile devices in close proximity.The edge servers are not allocated efficiently according to the dynamic nature of the network.It leads to processing delay,and the tasks are dropped due to time limitations.The researchersfind it difficult and complex to determine the offloading decision because of uncertain load dynamic condition over the edge nodes.The challenge relies on the offload-ing decision on selection of edge nodes for offloading in a centralized manner.This study focuses on minimizing task-processing time while simultaneously increasing the success rate of service provided by edge servers.Initially,a task-offloading problem needs to be formulated based on the communication and pro-cessing.Then offloading decision problem is solved by deep analysis on taskflow in the network and feedback from the devices on edge services.The significance of the model is improved with the modelling of Deep Mobile-X architecture and bi-directional Long Short Term Memory(b-LSTM).The simulation is done in the Edgecloudsim environment,and the outcomes show the significance of the proposed idea.The processing time of the anticipated model is 6.6 s.The following perfor-mance metrics,improved server utilization,the ratio of the dropped task,and number of offloading tasks are evaluated and compared with existing learning approaches.The proposed model shows a better trade-off compared to existing approaches.
文摘In the last years, architectural practice has been confronted with a paradigm shift towards the application of digital methods in design activities. In this regard, it is a pedagogic challenge to provide a suitable computational background for architectural students, to improve their ability to apply algorithmic-parametric logic, as well as fabrication and prototyping resources to design problem solving. This challenge is even stronger when considering less favored social and technological contexts, such as in Brazil, for example. In this scenario, this article presents and discusses the procedures and the results from a didactic experience carried out in a design computing-oriented discipline, inserted in the curriculum of a Brazilian architecture course. Hence, this paper shares some design computing teaching experiences and presents some results on computational methods and creative approaches, with a view to contribute to a better understanding about the relations between logical thinking, mathematics and architectural design processes.
文摘Cloud Computing has become one of the popular buzzwords in the IT area after Web2.0. This is not a new technology, but the concept that binds different existed technologies altogether including Grid Computing, Utility Computing, distributed system, virtualization and other mature technique. Business Process Management (BPM) is designed for business management using IT infrastructure to focus on process modeling, monitor and management. BPM is composed of business process, business information and IT resources, which help to build a real-time intelligent system, based on business management and IT technologies. This paper describes theory on Cloud Computing and proposes a BPM implement on Cloud environments.
基金Key Project of Chinese Ministry of Education (No.104086)
文摘Lightweight ubiquitous computing security architecture was presented. Lots of our recent researches have been integrated in this architecture. And the main current researches in the related area have also been absorbed. The main attention of this paper was providing a compact and realizable method to apply ubiquitous computing into our daily lives under sufficient secure guarantee. At last,the personal intelligent assistant system was presented to show that this architecture was a suitable and realizable security mechanism in solving the ubiquitous computing problems.
文摘To solve the lag problem of the traditional storage technology in mass data storage and management,the application platform is designed and built for big data on Hadoop and data warehouse integration platform,which ensured the convenience for the management and usage of data.In order to break through the master node system bottlenecks,a storage system with better performance is designed through introduction of cloud computing technology,which adopts the design of master-slave distribution patterns by the network access according to the recent principle.Thus the burden of single access the master node is reduced.Also file block update strategy and fault recovery mechanism are provided to solve the management bottleneck problem of traditional storage system on the data update and fault recovery and offer feasible technical solutions to storage management for big data.
基金supported in part by the National Natural Science Foundation of China (NSFC) under Grant No.61976242in part by the Natural Science Fund of Hebei Province for Distinguished Young Scholars under Grant No.F2021202010+2 种基金in part by the Fundamental Scientific Research Funds for Interdisciplinary Team of Hebei University of Technology under Grant No.JBKYTD2002funded by Science and Technology Project of Hebei Education Department under Grant No.JZX2023007supported by 2022 Interdisciplinary Postgraduate Training Program of Hebei University of Technology under Grant No.HEBUT-YXKJC-2022122.
文摘Most of the neural network architectures are based on human experience,which requires a long and tedious trial-and-error process.Neural architecture search(NAS)attempts to detect effective architectures without human intervention.Evolutionary algorithms(EAs)for NAS can find better solutions than human-designed architectures by exploring a large search space for possible architectures.Using multiobjective EAs for NAS,optimal neural architectures that meet various performance criteria can be explored and discovered efficiently.Furthermore,hardware-accelerated NAS methods can improve the efficiency of the NAS.While existing reviews have mainly focused on different strategies to complete NAS,a few studies have explored the use of EAs for NAS.In this paper,we summarize and explore the use of EAs for NAS,as well as large-scale multiobjective optimization strategies and hardware-accelerated NAS methods.NAS performs well in healthcare applications,such as medical image analysis,classification of disease diagnosis,and health monitoring.EAs for NAS can automate the search process and optimize multiple objectives simultaneously in a given healthcare task.Deep neural network has been successfully used in healthcare,but it lacks interpretability.Medical data is highly sensitive,and privacy leaks are frequently reported in the healthcare industry.To solve these problems,in healthcare,we propose an interpretable neuroevolution framework based on federated learning to address search efficiency and privacy protection.Moreover,we also point out future research directions for evolutionary NAS.Overall,for researchers who want to use EAs to optimize NNs in healthcare,we analyze the advantages and disadvantages of doing so to provide detailed guidance,and propose an interpretable privacy-preserving framework for healthcare applications.
基金Project supported by the National Natural Science Foundation of China (Grant Nos. U20A20227,62076208, and 62076207)Chongqing Talent Plan “Contract System” Project (Grant No. CQYC20210302257)+3 种基金National Key Laboratory of Smart Vehicle Safety Technology Open Fund Project (Grant No. IVSTSKL-202309)the Chongqing Technology Innovation and Application Development Special Major Project (Grant No. CSTB2023TIAD-STX0020)College of Artificial Intelligence, Southwest UniversityState Key Laboratory of Intelligent Vehicle Safety Technology
文摘Neuromorphic computing,inspired by the human brain,uses memristor devices for complex tasks.Recent studies show that self-organizing random nanowires can implement neuromorphic information processing,enabling data analysis.This paper presents a model based on these nanowire networks,with an improved conductance variation profile.We suggest using these networks for temporal information processing via a reservoir computing scheme and propose an efficient data encoding method using voltage pulses.The nanowire network layer generates dynamic behaviors for pulse voltages,allowing time series prediction analysis.Our experiment uses a double stochastic nanowire network architecture for processing multiple input signals,outperforming traditional reservoir computing in terms of fewer nodes,enriched dynamics and improved prediction accuracy.Experimental results confirm the high accuracy of this architecture on multiple real-time series datasets,making neuromorphic nanowire networks promising for physical implementation of reservoir computing.
基金Project supported by the National Natural Science Foundation of China (Grant Nos.12047503 and 12105343)。
文摘We develop universal quantum computing models that form a family of quantum von Neumann architectures,with modular units of memory,control,CPU,and internet,besides input and output.This family contains three generations characterized by dynamical quantum resource theory,and it also circumvents no-go theorems on quantum programming and control.Besides universality,such a family satisfies other desirable engineering requirements on system and algorithm design,such as modularity and programmability,hence serves as a unique approach to building universal quantum computers.
基金the National Key Research and Development Program of China(2018YFB2202602)The State Key Program of the National Natural Science Foundation of China(NO.61934005)+1 种基金The National Natural Science Foundation of China(NO.62074001)Joint Funds of the National Natural Science Foundation of China under Grant U19A2074.
文摘Artificial intelligence(AI)processes data-centric applications with minimal effort.However,it poses new challenges to system design in terms of computational speed and energy efficiency.The traditional von Neumann architecture cannot meet the requirements of heavily datacentric applications due to the separation of computation and storage.The emergence of computing inmemory(CIM)is significant in circumventing the von Neumann bottleneck.A commercialized memory architecture,static random-access memory(SRAM),is fast and robust,consumes less power,and is compatible with state-of-the-art technology.This study investigates the research progress of SRAM-based CIM technology in three levels:circuit,function,and application.It also outlines the problems,challenges,and prospects of SRAM-based CIM macros.
基金supported by the National Natural Science Foundation of China(Nos.62034006,91964105,61874068)the China Key Research and Development Program(No.2016YFA0201802)+1 种基金the Natural Science Foundation of Shandong Province(No.ZR2020JQ28)Program of Qilu Young Scholars of Shandong University。
文摘The“memory wall”of traditional von Neumann computing systems severely restricts the efficiency of data-intensive task execution,while in-memory computing(IMC)architecture is a promising approach to breaking the bottleneck.Although variations and instability in ultra-scaled memory cells seriously degrade the calculation accuracy in IMC architectures,stochastic computing(SC)can compensate for these shortcomings due to its low sensitivity to cell disturbances.Furthermore,massive parallel computing can be processed to improve the speed and efficiency of the system.In this paper,by designing logic functions in NOR flash arrays,SC in IMC for the image edge detection is realized,demonstrating ultra-low computational complexity and power consumption(25.5 fJ/pixel at 2-bit sequence length).More impressively,the noise immunity is 6 times higher than that of the traditional binary method,showing good tolerances to cell variation and reliability degradation when implementing massive parallel computation in the array.
基金Project supported by the National Natural Science Foundation of China(Grant Nos.61925402 and 61851402)Science and Technology Commission of Shanghai Municipality,China(Grant No.19JC1416600)+1 种基金the National Key Research and Development Program of China(Grant No.2017YFB0405600)Shanghai Education Development Foundation and Shanghai Municipal Education Commission Shuguang Program,China(Grant No.18SG01).
文摘Facing the computing demands of Internet of things(IoT)and artificial intelligence(AI),the cost induced by moving the data between the central processing unit(CPU)and memory is the key problem and a chip featured with flexible structural unit,ultra-low power consumption,and huge parallelism will be needed.In-memory computing,a non-von Neumann architecture fusing memory units and computing units,can eliminate the data transfer time and energy consumption while performing massive parallel computations.Prototype in-memory computing schemes modified from different memory technologies have shown orders of magnitude improvement in computing efficiency,making it be regarded as the ultimate computing paradigm.Here we review the state-of-the-art memory device technologies potential for in-memory computing,summarize their versatile applications in neural network,stochastic generation,and hybrid precision digital computing,with promising solutions for unprecedented computing tasks,and also discuss the challenges of stability and integration for general in-memory computing.
文摘Graphics Processing Units(GPUs)are used to accelerate computing-intensive tasks,such as neural networks,data analysis,high-performance computing,etc.In the past decade or so,researchers have done a lot of work on GPU architecture and proposed a variety of theories and methods to study the microarchitectural characteristics of various GPUs.In this study,the GPU serves as a co-processor and works together with the CPU in an embedded real-time system to handle computationally intensive tasks.It models the architecture of the GPU and further considers it based on some excellent work.The SIMT mechanism and Cache-miss situation provide a more detailed analysis of the GPU architecture.In order to verify the GPU architecture model proposed in this article,10 GPU kernel_task and an Nvidia GPU device were used to perform experiments.The experimental results showed that the minimum error between the kernel task execution time predicted by the GPU architecture model proposed in this article and the actual measured kernel task execution time was 3.80%,and the maximum error was 8.30%.
基金supported by the National Natural Science Foundation of China (No.11172134)the Funding of Jiangsu Innovation Program for Graduate Education (No.CXLX13_132)
文摘Personal desktop platform with teraflops peak performance of thousands of cores is realized at the price of conventional workstations using the programmable graphics processing units(GPUs).A GPU-based parallel Euler/Navier-Stokes solver is developed for 2-D compressible flows by using NVIDIA′s Compute Unified Device Architecture(CUDA)programming model in CUDA Fortran programming language.The techniques of implementation of CUDA kernels,double-layered thread hierarchy and variety memory hierarchy are presented to form the GPU-based algorithm of Euler/Navier-Stokes equations.The resulting parallel solver is validated by a set of typical test flow cases.The numerical results show that dozens of times speedup relative to a serial CPU implementation can be achieved using a single GPU desktop platform,which demonstrates that a GPU desktop can serve as a costeffective parallel computing platform to accelerate computational fluid dynamics(CFD)simulations substantially.