Recently, due to the availability of big data and the rapid growth of computing power, artificial intelligence (AI) has regained tremendous attention and investment. Machine learning (ML) approaches have been successfully applied to solve many problems in academia and in industry. Although the explosion of big data applications is driving the development of ML, it also imposes severe challenges of data processing speed and scalability on conventional computer systems. Computing platforms designed specifically for AI applications have been considered, ranging from a complement to von Neumann platforms to a “must-have,” stand-alone technical solution. These platforms, which belong to a larger category named “domain-specific computing,” focus on specific customization for AI. In this article, we focus on summarizing the recent advances in accelerator designs for deep neural networks (DNNs), that is, DNN accelerators. We discuss various architectures that support DNN executions in terms of computing units, dataflow optimization, targeted network topologies, architectures on emerging technologies, and accelerators for emerging applications. We also provide our visions on the future trend of AI chip designs.
How to recognize targets with similar appearances in remote sensing images (RSIs) effectively and efficiently has become a major challenge. Recently, convolutional neural networks (CNNs) have been preferred for target classification because of their powerful feature representation ability and better performance. However, the training and testing of CNNs mainly rely on a single machine, which has natural limitations and bottlenecks in processing RSIs due to limited hardware resources and high time consumption. Besides, overfitting is a challenge for the CNN model due to the imbalance between RSI data and the model structure. When a model is complex or the training data is relatively small, overfitting occurs and leads to poor predictive performance. To address these problems, a distributed CNN architecture for RSI target classification is proposed, which dramatically increases the training speed of the CNN and the system scalability. It improves the storage ability and processing efficiency for RSIs. Furthermore, a Bayesian regularization approach is used to initialize the weights of the CNN extractor, which increases the robustness and flexibility of the CNN model. It helps prevent overfitting and avoid the local optima caused by limited RSI training images or an inappropriate CNN structure. In addition, considering the efficiency of the Naïve Bayes classifier, a distributed Naïve Bayes classifier is designed to reduce the training cost. Compared with other algorithms, the proposed system and method perform the best and increase the recognition accuracy. The results show that the distributed system framework and the proposed algorithms are suitable for RSI target classification tasks.
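The abstract does not describe how the distributed Naïve Bayes classifier is built. One reason Naïve Bayes distributes cheaply is that training reduces to counting, and counts from separate data shards can simply be added. The sketch below illustrates only that idea, for categorical features; `shard_counts`, `merge_counts`, and the toy data are hypothetical and are not the authors' implementation.

```python
from collections import Counter

def shard_counts(rows):
    """Count labels and (label, feature index, value) triples on one shard.

    rows: iterable of (features, label) pairs with categorical features.
    """
    label_counts, feature_counts = Counter(), Counter()
    for features, label in rows:
        label_counts[label] += 1
        for j, value in enumerate(features):
            feature_counts[(label, j, value)] += 1
    return label_counts, feature_counts

def merge_counts(partials):
    """Sum per-shard counts; the result equals counting on the full data."""
    total_labels, total_features = Counter(), Counter()
    for label_counts, feature_counts in partials:
        total_labels.update(label_counts)      # Counter.update adds counts
        total_features.update(feature_counts)
    return total_labels, total_features

# Hypothetical toy data split across two workers.
shard_a = [(("bright", "round"), "tank"), (("dark", "long"), "ship")]
shard_b = [(("dark", "round"), "tank")]
labels, features = merge_counts([shard_counts(shard_a), shard_counts(shard_b)])
```

Because the merged counts match those computed on the unsplit data, each worker only needs one pass over its own shard, and only the small count tables have to be exchanged.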
Plant disease classification based on digital pictures is challenging. Machine learning approaches and plant image categorization technologies such as deep learning have been used to recognize, identify, and diagnose plant diseases over the previous decade. Increasing the yield and quality of rice farming is an important goal for paddy-producing countries. However, some diseases that block improvement in paddy production are considered a serious threat. The convolutional neural network (CNN) has shown remarkable performance in the early detection of paddy leaf diseases from images. Nevertheless, constructing effective CNN architectures depends on neural network expertise and domain knowledge. This approach is time-consuming and requires high computational resources. In this research, we propose a novel method based on the mutant particle swarm optimization (MUT-PSO) algorithm to search for an optimum CNN architecture for paddy leaf disease classification. Experimental results show that the mutant particle swarm optimization convolutional neural network (MUTPSO-CNN) can find an optimum CNN architecture that offers better performance than existing hand-crafted CNN architectures in terms of accuracy, precision/recall, and execution time.
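The abstract gives no details of MUT-PSO, so the following is only a generic particle swarm optimizer with a simple random-reset mutation, shown to make the search loop concrete. The mutation rule, the parameter values, and `toy_validation_error` (a stand-in for training and validating a CNN) are all assumptions made for illustration.

```python
import random

def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 w=0.7, c1=1.5, c2=1.5, mutation_rate=0.1, seed=0):
    """Minimize objective over box bounds with PSO plus random-reset mutation."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            # Assumed mutation: occasionally resample one coordinate at random.
            if rng.random() < mutation_rate:
                d = rng.randrange(dim)
                pos[i][d] = rng.uniform(*bounds[d])
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

def toy_validation_error(p):
    """Hypothetical surrogate: best at 4 layers and 2**5 filters."""
    n_layers, log2_filters = p
    return (n_layers - 4) ** 2 + (log2_filters - 5) ** 2

best, best_val = pso_minimize(toy_validation_error, [(1.0, 10.0), (3.0, 8.0)])
```

In a real architecture search each call to the objective would train a candidate network, which is why the number of particles and iterations dominates the cost.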
Research into automatically searching for an optimal neural network (NN) by optimisation algorithms is a significant research topic in deep learning and artificial intelligence. However, this is still challenging due to two issues: both the hyperparameters and the architecture should be optimised, and the optimisation process is computationally expensive. To tackle these two issues, this paper focusses on solving the hyperparameter and architecture optimisation problem for the NN and proposes a novel light-weight scale-adaptive fitness evaluation-based particle swarm optimisation (SAFE-PSO) approach. Firstly, the SAFE-PSO algorithm considers the hyperparameters and architectures together in the optimisation problem and therefore can find their optimal combination for the globally best NN. Secondly, the computational cost can be reduced by using multi-scale accuracy evaluation methods to evaluate candidates. Thirdly, a stagnation-based switch strategy is proposed to adaptively switch between evaluation methods to better balance the search performance and computational cost. The SAFE-PSO algorithm is tested on two widely used datasets: the 10-category CIFAR10 and the 100-category CIFAR100. The experimental results show that SAFE-PSO is very effective and efficient: it can not only find a promising NN automatically but also find a better NN than the compared algorithms at the same computational cost.
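The abstract names a stagnation-based switch between evaluation methods without giving the rule. One plausible reading, sketched below under that assumption, is to start with the cheapest evaluation scale and move to a more expensive one only when the best fitness has stopped improving. `next_scale`, `patience`, and `tol` are hypothetical names and values.

```python
def next_scale(history, scale, max_scale, patience=3, tol=1e-6):
    """Pick the evaluation scale for the next generation.

    history: best accuracy per generation so far (higher is better).
    scale:   current evaluation scale, 0 = cheapest, max_scale = most costly.
    """
    if len(history) <= patience:
        return scale  # not enough generations to judge stagnation
    recent_gain = history[-1] - history[-1 - patience]
    if recent_gain <= tol and scale < max_scale:
        return scale + 1  # stagnated: switch to a more accurate evaluation
    return scale
```

For example, `next_scale([0.5, 0.5, 0.5, 0.5], scale=0, max_scale=2)` moves to scale 1, while a history that is still improving stays on the cheap scale.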
The dynamic working process of the 52SFZ-140-207B type of hydraulic bumper is analyzed. The modeling method using architecture-based neural networks is introduced. Using this modeling method, the dynamic model of the hydraulic bumper is established. Based on this model, the structural parameters of the hydraulic bumper are optimized with a genetic algorithm. The results show that the performance of the dynamic model is close to that of the hydraulic bumper, and that the dynamic performance of the hydraulic bumper is improved through parameter optimization.
Side channel attacks (SCAs) on neural networks (NNs) are particularly efficient for retrieving secret information from NNs. We differentiate multiple types of threat scenarios regarding what kind of information is available before the attack and its purpose: recovering hyperparameters (the architecture) of the targeted NN, its weights (parameters), or its inputs. In this survey article, we consider the most relevant attacks to extract the architecture of CNNs. We also categorize SCAs depending on access with respect to the victim: physical, local, or remote. Attacks targeting the architecture via local SCAs are most common. As of today, physical access seems necessary to retrieve the weights of an NN. We notably describe cache attacks, which are local SCAs aiming to extract the NN's underlying architecture. Few countermeasures have emerged; these are presented at the end of the survey.
Deep neural networks often outperform classical machine learning algorithms in solving real-world problems. However, designing better networks usually requires domain expertise and consumes significant time and computing resources. Moreover, when the task changes, the original network architecture becomes outdated and requires redesigning. Thus, Neural Architecture Search (NAS) has gained attention as an effective approach to automatically generate optimal network architectures. Most NAS methods mainly focus on achieving high performance while ignoring architectural complexity. Many studies have revealed that network performance and structural complexity are often positively correlated. Nevertheless, complex network structures consume enormous computing resources. To cope with this, we formulate the neural architecture search task as a multi-objective optimization problem, where an optimal architecture is learned by minimizing the classification error rate and the number of network parameters simultaneously. A decomposition-based multi-objective stochastic fractal search method is then proposed to solve it. In view of the discrete nature of the NAS problem, we discretize the stochastic fractal search step size so that the network architecture can be optimized more effectively. Additionally, two distinct update methods are employed in the step size update stage to enhance the global and local search abilities adaptively. Furthermore, an information exchange mechanism between architectures is introduced to accelerate the convergence process and improve the efficiency of the algorithm. Experimental studies show that the proposed algorithm is competitive with many existing manual and automatic deep neural network generation approaches, achieving a low-parameter, high-precision architecture at low cost on each of the six benchmark datasets.
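When error rate and parameter count are minimized together, there is usually no single best architecture, only a set of trade-offs. The sketch below filters candidates by Pareto dominance to make that notion concrete. It is not the decomposition-based stochastic fractal search from the abstract, and the candidate values are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better
    in at least one (all objectives are minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def pareto_front(candidates):
    """Keep the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(o, c) for o in candidates)]

# Hypothetical (error rate, parameters in millions) pairs.
archs = [(0.08, 3.2), (0.06, 5.0), (0.09, 3.5), (0.06, 6.1), (0.12, 1.0)]
front = pareto_front(archs)
```

Here `(0.09, 3.5)` is dropped because `(0.08, 3.2)` is better on both objectives, and `(0.06, 6.1)` is dropped because `(0.06, 5.0)` matches its error with fewer parameters.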
The convolutional neural network (CNN) is an essential model for achieving high accuracy in various machine learning applications, such as image recognition and natural language processing. One of the important issues for CNN acceleration with high energy efficiency and processing performance is efficient data reuse by exploiting the inherent data locality. In this paper, we propose a novel CGRA (Coarse Grained Reconfigurable Array) architecture with time-domain multithreading for exploiting input data locality. The multithreading on each processing element enables input data reuse across multiple computation periods. This paper presents the accelerator design and a performance analysis of the proposed architecture. We examine the structure of memory subsystems, as well as the architecture of the computing array, to supply required data with minimal performance overhead. We explore efficient architecture design alternatives based on the characteristics of modern CNN configurations. The evaluation results show that the available bandwidth of the external memory can be utilized efficiently when the output plane is wider (in earlier layers of many CNNs), while the input data locality can be utilized maximally when the number of output channels is larger (in later layers).
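A back-of-envelope count shows why input reuse grows with the number of output channels. The function below is a simplified estimate for a stride-1, unpadded K×K convolution, not the performance model used in the paper; `conv_input_reuse` is a hypothetical name.

```python
def conv_input_reuse(h_out, w_out, c_in, c_out, k):
    """Average multiply-accumulates (MACs) per input element for a
    stride-1, unpadded k x k convolution layer."""
    macs = h_out * w_out * c_out * c_in * k * k
    h_in, w_in = h_out + k - 1, w_out + k - 1  # input size without padding
    input_elems = h_in * w_in * c_in
    return macs / input_elems
```

With the spatial size and kernel fixed, the ratio scales linearly with `c_out`, so a layer with 512 output channels can reuse each fetched input value eight times more often than one with 64.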
Driven by continuous scaling of nanoscale semiconductor technologies, the past years have witnessed the progressive advancement of machine learning techniques and applications. Recently, dedicated machine learning accelerators, especially for neural networks, have attracted the research interests of computer architects and VLSI designers. State-of-the-art accelerators increase performance by deploying a huge number of processing elements, but still face the issue of degraded resource utilization across hybrid and non-standard algorithmic kernels. In this work, we exploit the properties of important neural network kernels for both perception and control to propose a reconfigurable dataflow processor, which adjusts the patterns of data flow, the functionalities of processing elements, and the on-chip storage according to network kernels. In contrast to state-of-the-art fine-grained data flowing techniques, the proposed coarse-grained dataflow reconfiguration approach enables extensive sharing of computing and storage resources. Three hybrid networks for MobileNet, deep reinforcement learning, and sequence classification are constructed and analyzed with customized instruction sets and a toolchain. A test chip has been designed and fabricated in UMC 65 nm CMOS technology, with a measured power consumption of 7.51 mW at 100 MHz on a die size of 1.8 × 1.8 mm^2.
A graphics processing unit (GPU)-accelerated biological species recognition method using a partially connected neural evolutionary network model is introduced in this paper. The partially connected neural evolutionary network adopted in the paper can overcome the disadvantage of traditional neural networks with small inputs. The whole image is used as the input of the neural network, so the maximal features can be kept for recognition. To speed up the recognition process, a fast implementation of the partially connected neural network was developed on an NVIDIA Tesla C1060 using the NVIDIA compute unified device architecture (CUDA) framework. Image sets of eight biological species were obtained to test the GPU implementation and its serial CPU counterpart. Experimental results showed that the GPU implementation works effectively in both recognition rate and speed, and gained a speedup of 343 over the CPU implementation. Compared with a feature-based recognition method on the same task, the method also achieved an acceptable correct rate of 84.6% when tested on the eight biological species.
This study presents a deep learning model for efficient intracranial hemorrhage (ICH) detection and subtype classification on non-contrast head computed tomography (CT) images. ICH refers to bleeding in the skull, a critical, life-threatening condition that requires rapid and accurate diagnosis. It is classified as intra-axial hemorrhage (intraventricular, intraparenchymal) or extra-axial hemorrhage (subdural, epidural, subarachnoid) based on the bleeding location inside the skull. Many computer-aided diagnosis (CAD)-based schemes have been proposed for ICH detection and classification at both slice and scan levels. However, these approaches perform only binary classification and suffer from a large number of parameters, which increases storage costs. Further, the accuracy of brain hemorrhage detection in existing models is too low for medically critical applications. To overcome these problems, a fast and efficient system for the automatic detection of ICH is needed. We designed a double-branch model based on the Xception architecture that extracts spatial and instant features, concatenates them, and creates the 3D spatial context (common feature vectors) fed to a decision tree classifier for final predictions. The data used for the experiments was gathered during the 2019 Radiological Society of North America (RSNA) brain hemorrhage detection challenge. Our model outperformed benchmark models and achieved better accuracy in the intraventricular (99.49%), subarachnoid (99.49%), intraparenchymal (99.10%), and subdural (98.09%) categories, thereby justifying the performance of the proposed double-branch Xception architecture for ICH detection and classification.
Enterprise Information System management has become an increasingly vital factor for many firms. Several organizations have encountered problems when attempting to evaluate organizational performance. Measurement of performance metrics is a key challenge for a huge number of firms. In order to preserve relevance and adaptability in competitive markets, it has become essential to respond proactively to complex events through informed decision-making that is supported by technology. Therefore, the objective of this study was to apply neural networks to the modeling, simulation, and forecasting of the effects of the performance indicators of Enterprise Information Systems on the achievement of corporate objectives and value creation. A set of quantifiable and sizeable conditionally independent associations were derived using a simplified joint probability distribution technique. Bayesian Neural Networks were utilized to describe the link between random variables (features) and to concisely and easily specify the joint probability distribution. The research demonstrated that Bayesian networks could effectively explore complex logical linkages by employing probability to represent uncertainty and probabilistic rules, and by applying impact models from Bayesian taxonomies to achieve learning and reasoning processes.
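A Bayesian network specifies a joint distribution compactly as a product of each variable's probability given its parents. The minimal sketch below shows this for a two-node network; the variable names and probabilities are invented for illustration and do not come from the study.

```python
# Hypothetical two-node network: SystemQuality -> GoalAchieved.
p_quality = {"high": 0.7, "low": 0.3}
p_goal_given_quality = {
    "high": {"yes": 0.9, "no": 0.1},
    "low": {"yes": 0.4, "no": 0.6},
}

def joint(quality, goal):
    """P(quality, goal) = P(quality) * P(goal | quality)."""
    return p_quality[quality] * p_goal_given_quality[quality][goal]

def marginal_goal(goal):
    """P(goal), obtained by summing the joint over the parent variable."""
    return sum(joint(q, goal) for q in p_quality)
```

Only the two small tables need to be specified; every joint or marginal probability follows from them, which is the compactness the abstract refers to.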
This research work compared expert system design and control of a crude oil distillation column (CODC) using artificial neural network based Monte Carlo (ANNBMC) simulation of random processes and an artificial neural network (ANN) model. Both were validated in MATLAB using experimental data obtained from a functioning crude oil distillation column at the Port-Harcourt Refinery, Nigeria. Ninety percent (90%) of the experimental data sets were used for training, while ten percent (10%) were used for testing the networks. The maximum relative errors between the experimental and calculated data obtained from the output variables of the neural network for CODC design were 1.98% and 0.57% when ANN only and ANNBMC were used, respectively, while the corresponding maximum relative errors were 0.346% and 0.124% when they were used for the controller prediction. A larger number of iteration steps, below 2500 and 5000, was required to achieve convergence of less than 10^-7 for the training error using ANNBMC for the design of the CODC and the controller, respectively, while fewer than 400 and 700 iteration steps were needed to achieve convergence of 10^-4 using ANN only. The linear regression analysis revealed minimum and maximum prediction accuracies of 80.65% and 98.79% for ANN and 98.38% and 99.98% for ANNBMC in the CODC design. Also, the minimum and maximum prediction accuracies were 92.83% and 99.34% for ANN and 98.89% and 99.71% for ANNBMC in the CODC controller, so both methodologies give excellent predictions. Hence, artificial neural network based Monte Carlo simulation is an effective and better tool for the design and control of crude oil distillation columns.
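The two evaluation quantities in this abstract, a 90/10 train/test split and the maximum relative error in percent, can be written down directly. The sketch assumes an in-order split, since the abstract does not say how the samples were divided; the function names are hypothetical.

```python
def split_train_test(samples, train_fraction=0.9):
    """Split in order: the first train_fraction of samples is used for
    training, the remainder for testing (assumed, not from the source)."""
    cut = round(len(samples) * train_fraction)
    return samples[:cut], samples[cut:]

def max_relative_error_percent(measured, predicted):
    """Largest |predicted - measured| / |measured| over all samples, in %.

    Measured values must be non-zero.
    """
    return max(abs(p - m) / abs(m) for m, p in zip(measured, predicted)) * 100.0
```

For instance, predictions of 101 and 199 against measurements of 100 and 200 give relative errors of 1% and 0.5%, so the reported maximum would be 1%.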
One of the main concerns in engineering nowadays is the development of aircraft with low consumption and high performance. For this purpose, airfoils are studied and designed to have a high lift coefficient and a low drag coefficient, thus yielding a highly efficient airfoil. The higher the efficiency value is, the lower the aircraft fuel consumption will be, thus improving its performance. In this sense, this work aims to develop a tool for creating airfoils from desired characteristics, such as the lift and drag coefficients and the maximum efficiency rate, using an algorithm based on an ANN (artificial neural network). In order to do so, a database of aerodynamic characteristics for a total of 300 airfoils was initially collected from the XFoil software. Then, through a routine implemented in MATLAB, network architectures of one, two, three, and four modules were trained using the backpropagation algorithm with momentum. The cross-validation technique was applied to analyze the results, evaluating which network has the lowest RMS (root-mean-square) error. In this case, the best result was obtained from the two-module architecture with two hidden neuron layers. The airfoils developed by this network, in the regions with the lowest RMS error, were compared to the same airfoils imported into XFoil. The contribution of this work, relative to other works applying ANNs to fluid mechanics, is the development of airfoils from their aerodynamic characteristics.
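Choosing among the trained architectures by RMS error amounts to computing the error of each network on held-out data and taking the minimum. The sketch below shows that selection step only; the architecture names and numbers in the usage note are hypothetical, and the cross-validation folds themselves are not reproduced.

```python
import math

def rms_error(targets, outputs):
    """Root-mean-square error between target and predicted values."""
    squared = sum((t - o) ** 2 for t, o in zip(targets, outputs))
    return math.sqrt(squared / len(targets))

def best_architecture(validation_outputs, targets):
    """Return the name of the architecture with the lowest RMS error.

    validation_outputs: dict mapping architecture name -> predicted values.
    """
    return min(validation_outputs,
               key=lambda name: rms_error(targets, validation_outputs[name]))
```

Given targets `[1.0, 2.0]` and outputs `{"one-module": [1.5, 2.5], "two-module": [1.1, 2.0]}`, the function picks `"two-module"` because its predictions are closer to the targets.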
Nowadays, most heterogeneous architectures are made up of various IP modules from different hardware vendors, but this model is less efficient. To solve this problem, AMD, together with other hardware vendors, proposed the heterogeneous system architecture (HSA) specification. On the one hand, HSA can help developers accelerate the design process and programming. On the other hand, it improves system performance and reduces power consumption. In this paper, we present the implementation of a framework for accelerating the training and classification of arbitrary convolutional neural networks (CNNs) on HSA. On the basis of this implementation, we present two acceleration methods: online weight updates and letting the CPU participate in the computation. Experimental results show that the implementation of CNNs on HSA is 4 to 10 times faster than on the CPU.
The brain-inspired spiking neural network (SNN) computing paradigm offers the potential for low-power and scalable computing, suited to many intelligent tasks that conventional computational systems find difficult. On the other hand, NoC (network-on-chip) based very large scale integration (VLSI) systems have been widely used to mimic neurobiological architectures (including SNNs). This paper proposes an evaluation methodology for SNN applications from the aspect of micro-architecture. First, we extract accurate SNN models from existing simulators of neural systems. Second, a cycle-accurate NoC simulator is implemented to execute these SNN applications to obtain timing and energy-consumption information. We believe this method not only benefits the exploration of the NoC design space but also bridges the gap between applications (especially those from the neuroscience community) and neuromorphic hardware. Based on the method, we have evaluated some typical SNNs in terms of timing and energy. The method is valuable for the development of neuromorphic hardware and applications.
The evolutionary neural network (ENN) shows high performance in function optimization and in finding approximately global optima when searching large and complex spaces. It is one of the most efficient and adaptive optimization techniques, widely used to provide candidate solutions that are evaluated by their fitness for the problem. ENN has an extraordinary ability to search globally and learn an approximately optimal solution without gradient information of the error functions. However, ENN requires heavy computation, which calls for parallel processing platforms such as field programmable gate arrays (FPGAs) and graphics processing units (GPUs) to achieve good performance. This work presents several new implementations of ENN that explore and adopt different techniques and opportunities for parallel processing. Different versions of the ENN algorithm have also been implemented and parallelized on an FPGA platform for low latency by exploiting parallelism and pipelining. Real data from a mass spectrometry data (MSD) application was used to examine and verify our implementations. This is a very important and computation-intensive application that needs to search for the optimal features (peaks) in MSD in order to distinguish cancer patients from control patients. The ENN algorithm is also implemented and parallelized on single-core and GPU platforms for comparison purposes. The computation time of our optimized algorithm on FPGA and GPU has been improved by a factor of 6.75 and 6, respectively.
Funding (survey of DNN accelerators): the National Science Foundation (NSF) (1822085, 1725456, 1816833, 1500848, 1719160, and 1725447); the NSF Computing and Communication Foundations (1740352); the Nanoelectronics COmputing REsearch Program in the Semiconductor Research Corporation (NC-2766-A); and the Center for Research in Intelligent Storage and Processing-in-Memory, one of six centers in the Joint University Microelectronics Program, a SRC program sponsored by the Defense Advanced Research Projects Agency.
Funding (distributed CNN for RSI target classification): supported by the National Natural Science Foundation of China (U1435220).
Funding (MUTPSO-CNN for paddy leaf disease classification): the authors received funding for this research under Multi-Disciplinary Research (MDR) Grant Vot H483 from the Research Management Centre (RMC) office, Universiti Tun Hussein Onn Malaysia (UTHM).
Funding (SAFE-PSO): supported in part by the National Key Research and Development Program of China under Grant 2019YFB2102102; in part by the National Natural Science Foundations of China under Grant 62176094 and Grant 61873097; in part by the Key-Area Research and Development of Guangdong Province under Grant 2020B010166002; in part by the Guangdong Natural Science Foundation Research Team under Grant 2018B030312003; and in part by the Guangdong-Hong Kong Joint Innovation Platform under Grant 2018B050502006.
Abstract: The dynamic working process of the 52SFZ-140-207B type of hydraulic bumper is analyzed. A modeling method using architecture-based neural networks is introduced. Using this modeling method, the dynamic model of the hydraulic bumper is established. Based on this model, the structural parameters of the hydraulic bumper are optimized with a genetic algorithm. The results show that the behavior of the dynamic model is close to that of the hydraulic bumper, and that the dynamic performance of the hydraulic bumper is improved through parameter optimization.
Abstract: Side channel attacks (SCAs) on neural networks (NNs) are particularly efficient for retrieving secret information from NNs. We differentiate multiple types of threat scenarios according to what kind of information is available before the attack and the attack's purpose: recovering the hyperparameters (the architecture) of the targeted NN, its weights (parameters), or its inputs. In this survey article, we consider the most relevant attacks for extracting the architecture of CNNs. We also categorize SCAs by the attacker's access to the victim: physical, local, or remote. Attacks targeting the architecture via local SCAs are the most common. As of today, physical access seems necessary to retrieve the weights of an NN. We notably describe cache attacks, which are local SCAs that aim to extract the NN's underlying architecture. Few countermeasures have emerged; these are presented at the end of the survey.
Funding: Supported by the China Postdoctoral Science Foundation Funded Project (Grant Nos. 2017M613054 and 2017M613053); the Shaanxi Postdoctoral Science Foundation Funded Project (Grant No. 2017BSHYDZZ33); and the National Science Foundation of China (Grant No. 62102239).
Abstract: Deep neural networks often outperform classical machine learning algorithms in solving real-world problems. However, designing better networks usually requires domain expertise and consumes significant time and computing resources. Moreover, when the task changes, the original network architecture becomes outdated and must be redesigned. Thus, Neural Architecture Search (NAS) has gained attention as an effective approach to automatically generating optimal network architectures. Most NAS methods focus on achieving high performance while ignoring architectural complexity. Many studies have shown that network performance and structural complexity are often positively correlated. Nevertheless, complex network structures consume enormous computing resources. To cope with this, we formulate the neural architecture search task as a multi-objective optimization problem, in which an optimal architecture is learned by simultaneously minimizing the classification error rate and the number of network parameters. A decomposition-based multi-objective stochastic fractal search method is then proposed to solve it. In view of the discrete nature of the NAS problem, we discretize the stochastic fractal search step size so that the network architecture can be optimized more effectively. Additionally, two distinct update methods are employed in the step-size update stage to adaptively enhance the global and local search abilities. Furthermore, an information exchange mechanism between architectures is introduced to accelerate convergence and improve the efficiency of the algorithm. Experimental studies show that the proposed algorithm is competitive with many existing manual and automatic deep neural network generation approaches, achieving a high-precision architecture with few parameters at low cost on each of the six benchmark datasets.
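To make the decomposition idea in the abstract above concrete, the sketch below scalarizes the two objectives (classification error rate and parameter count) with a weighted Tchebycheff function, so that each weight vector defines one single-objective subproblem. The abstract does not say which decomposition is used, so the Tchebycheff choice, the min-max normalization, the function names, and the candidate values are all assumptions for illustration only.

```python
def normalize(values):
    # Min-max scale one objective to [0, 1] so error and size are comparable.
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in values]


def tchebycheff(objectives, weights, ideal=(0.0, 0.0)):
    # Weighted Tchebycheff scalarization: the worst weighted distance to the ideal point.
    return max(w * abs(f - z) for f, w, z in zip(objectives, weights, ideal))


def best_for_weights(candidates, weights):
    # candidates maps a name to (error rate, parameter count in millions).
    names = list(candidates)
    errors = normalize([candidates[n][0] for n in names])
    sizes = normalize([candidates[n][1] for n in names])
    scores = {n: tchebycheff((e, s), weights)
              for n, e, s in zip(names, errors, sizes)}
    return min(scores, key=scores.get)


# Hypothetical architectures, not taken from the paper.
candidates = {"A": (0.10, 1.0), "B": (0.06, 3.0), "C": (0.05, 9.0)}

if __name__ == "__main__":
    for weights in [(0.5, 0.5), (0.9, 0.1), (0.1, 0.9)]:
        print(weights, best_for_weights(candidates, weights))
```

Sweeping the weight vector trades accuracy against size: a weight that emphasizes error selects the largest candidate, one that emphasizes parameter count selects the smallest, and a balanced weight selects the middle one.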
Abstract: The convolutional neural network (CNN) is an essential model for achieving high accuracy in various machine learning applications, such as image recognition and natural language processing. One of the important issues in accelerating CNNs with high energy efficiency and processing performance is efficient data reuse that exploits the inherent data locality. In this paper, we propose a novel CGRA (Coarse Grained Reconfigurable Array) architecture with time-domain multithreading for exploiting input data locality. Multithreading on each processing element enables input data to be reused across multiple computation periods. This paper presents the accelerator design and a performance analysis of the proposed architecture. We examine the structure of the memory subsystems, as well as the architecture of the computing array, to supply the required data with minimal performance overhead. We explore efficient architecture design alternatives based on the characteristics of modern CNN configurations. The evaluation results show that the available bandwidth of the external memory can be used efficiently when the output plane is wider (in earlier layers of many CNNs), while the input data locality can be exploited maximally when the number of output channels is larger (in later layers).
Funding: Supported by NSFC under Grant Nos. 61702493 and 51707191; the Science and Technology Planning Project of Guangdong Province under Grant No. 2018B030338001; Shenzhen S&T Funding under Grant No. KQJSCX20170731163915914; Basic Research Program Nos. JCYJ20170818164527303 and JCYJ20180507182619669; and the SIAT Innovation Program for Excellent Young Researchers under Grant No. 2017001.
Abstract: Driven by the continuous scaling of nanoscale semiconductor technologies, recent years have witnessed the progressive advancement of machine learning techniques and applications. Recently, dedicated machine learning accelerators, especially for neural networks, have attracted the research interest of computer architects and VLSI designers. State-of-the-art accelerators increase performance by deploying a huge number of processing elements, but they still suffer from degraded resource utilization across hybrid and non-standard algorithmic kernels. In this work, we exploit the properties of important neural network kernels for both perception and control to propose a reconfigurable dataflow processor, which adjusts the patterns of data flow, the functionality of the processing elements, and the on-chip storage according to the network kernels. In contrast to state-of-the-art fine-grained dataflow techniques, the proposed coarse-grained dataflow reconfiguration approach enables extensive sharing of computing and storage resources. Three hybrid networks for MobileNet, deep reinforcement learning, and sequence classification are constructed and analyzed with customized instruction sets and a toolchain. A test chip has been designed and fabricated in UMC 65 nm CMOS technology, with a measured power consumption of 7.51 mW at 100 MHz on a die size of 1.8 × 1.8 mm².
Funding: National Natural Science Foundation of China (No. 60975084); Natural Science Foundation of Fujian Province, China (No. 2011J05159).
Abstract: A graphics processing unit (GPU)-accelerated biological species recognition method using a partially connected neural evolutionary network model is introduced in this paper. The partially connected neural evolutionary network adopted here overcomes the limitation of traditional neural networks to small inputs. The whole image is taken as the input of the neural network, so the maximum number of features is retained for recognition. To speed up the recognition process, a fast implementation of the partially connected neural network was built on an NVIDIA Tesla C1060 using the NVIDIA compute unified device architecture (CUDA) framework. Image sets of eight biological species were collected to test the GPU implementation and its serial CPU counterpart. The experimental results showed that the GPU implementation performs well in both recognition rate and speed, achieving a speedup of 343 over the CPU implementation. Compared with a feature-based recognition method on the same task, the method also achieved an acceptable correct rate of 84.6% when tested on the eight biological species.
Abstract: This study presents a deep learning model for efficient intracranial hemorrhage (ICH) detection and subtype classification on non-contrast head computed tomography (CT) images. ICH refers to bleeding inside the skull, a critical, life-threatening condition that requires rapid and accurate diagnosis. It is classified as intra-axial hemorrhage (intraventricular, intraparenchymal) or extra-axial hemorrhage (subdural, epidural, subarachnoid) based on the bleeding location inside the skull. Many computer-aided diagnosis (CAD)-based schemes have been proposed for ICH detection and classification at both slice and scan levels. However, these approaches perform only binary classification and suffer from a large number of parameters, which increases storage costs. Further, the accuracy of brain hemorrhage detection in existing models is too low for medically critical applications. To overcome these problems, a fast and efficient system for the automatic detection of ICH is needed. We designed a double-branch model based on the Xception architecture that extracts spatial and instant features, concatenates them, and creates the 3D spatial context (common feature vectors) fed to a decision tree classifier for final predictions. The data used for the experiments were gathered during the 2019 Radiological Society of North America (RSNA) brain hemorrhage detection challenge. Our model outperformed benchmark models and achieved better accuracy in the intraventricular (99.49%), subarachnoid (99.49%), intraparenchymal (99.10%), and subdural (98.09%) categories, thereby justifying the performance of the proposed double-branch Xception architecture for ICH detection and classification.
Abstract: Enterprise Information System management has become an increasingly vital factor for many firms. Several organizations have encountered problems when attempting to evaluate organizational performance, and measuring performance metrics is a key challenge for a large number of firms. To preserve relevance and adaptability in competitive markets, it has become essential to respond proactively to complex events through informed decision-making supported by technology. The objective of this study was therefore to apply neural networks to the modeling, simulation, and forecasting of the effects of Enterprise Information System performance indicators on the achievement of corporate objectives and value creation. A set of quantifiable and sizeable conditionally independent associations was derived using a simplified joint probability distribution technique. Bayesian Neural Networks were used to describe the link between random variables (features) and to specify the joint probability distribution concisely. The research demonstrated that Bayesian networks can effectively explore complex logical linkages by employing probability to represent uncertainty and probabilistic rules, and by applying impact models from Bayesian taxonomies to achieve learning and reasoning processes.
Abstract: This work compares expert-system design and control of a crude oil distillation column (CODC) using artificial-neural-network-based Monte Carlo (ANNBMC) simulation of random processes and a plain artificial neural network (ANN) model. Both were validated in MATLAB against experimental data from an operating crude oil distillation column at the Port-Harcourt Refinery, Nigeria. Ninety percent (90%) of the experimental data sets were used for training and ten percent (10%) for testing the networks. For the CODC design, the maximum relative errors between the experimental and calculated output variables were 1.98% with ANN only and 0.57% with ANNBMC; for the controller prediction, the corresponding values were 0.346% and 0.124%. ANNBMC required more iteration steps, below 2500 and 5000 for the CODC design and the controller respectively, to reach a training-error convergence of less than 10^-7, whereas ANN only needed fewer than 400 and 700 iteration steps to reach a convergence of 10^-4. Linear regression analysis showed minimum and maximum prediction accuracies of 80.65% and 98.79% with ANN and 98.38% and 99.98% with ANNBMC for the CODC design. For the CODC controller, the minimum and maximum prediction accuracies were 92.83% and 99.34% with ANN and 98.89% and 99.71% with ANNBMC, so both methodologies predict well. Hence, artificial-neural-network-based Monte Carlo simulation is an effective and better tool for the design and control of crude oil distillation columns.
Abstract: One of the main concerns in engineering today is the development of aircraft with low fuel consumption and high performance. For this purpose, airfoils are studied and designed to have a high lift coefficient and a low drag coefficient, which yields a highly efficient airfoil. The higher the efficiency, the lower the aircraft's fuel consumption and the better its performance. This work aims to develop a tool that creates airfoils from desired characteristics, such as the lift and drag coefficients and the maximum efficiency, using an algorithm based on an ANN (artificial neural network). To do so, a database of aerodynamic characteristics for a total of 300 airfoils was first collected from the XFoil software. Then, through a routine implemented in MATLAB, network architectures of one, two, three, and four modules were trained using the back-propagation algorithm with momentum. Cross-validation was applied to analyze the results and to determine which network has the lowest RMS (root-mean-square) error. The best result was obtained from the two-module architecture with two hidden neuron layers. The airfoils generated by this network, in the regions with the lowest RMS error, were compared with the same airfoils imported into XFoil. Relative to other works that apply ANNs to fluid mechanics, the contribution of this work is the generation of airfoils from their aerodynamic characteristics.
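The two quantities this abstract relies on can be written down in a few lines. The sketch below assumes that "efficiency" means the lift-to-drag ratio C_L / C_D, which is the usual aerodynamic definition but is not stated explicitly in the abstract, and it uses made-up RMS values for the four module counts to show how the lowest-error architecture would be picked. The function and variable names are hypothetical.

```python
import math


def efficiency(lift_coefficient, drag_coefficient):
    # Assumed definition: aerodynamic efficiency as the lift-to-drag ratio.
    return lift_coefficient / drag_coefficient


def rms_error(predicted, target):
    # Root-mean-square error between network outputs and reference values.
    squared = [(p - t) ** 2 for p, t in zip(predicted, target)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical cross-validation RMS errors per number of modules (not from the paper).
rms_by_modules = {1: 0.042, 2: 0.017, 3: 0.025, 4: 0.031}
best_modules = min(rms_by_modules, key=rms_by_modules.get)

if __name__ == "__main__":
    print(efficiency(1.2, 0.02))
    print(rms_error([2.0, 2.0], [0.0, 0.0]))
    print(best_modules)
```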
Abstract: Nowadays, most heterogeneous architectures are assembled from IP modules supplied by different hardware vendors, but this model is inefficient. To solve this problem, AMD, jointly with other hardware vendors, proposed the heterogeneous system architecture (HSA) specification. On the one hand, HSA helps developers accelerate design and programming; on the other hand, it improves system performance and reduces power consumption. In this paper, we present the implementation of a framework for accelerating the training and classification of arbitrary convolutional neural networks (CNNs) on HSA. On the basis of this implementation, we present two acceleration methods: online weight updates and letting the CPU participate in the computation. Experimental results show that the CNN implementation on HSA is 4 to 10 times faster than on the CPU.
Abstract: The brain-inspired spiking neural network (SNN) computing paradigm offers the potential for low-power and scalable computing, suited to many intelligent tasks that conventional computational systems find difficult. Meanwhile, network-on-chip (NoC) based very large scale integration (VLSI) systems have been widely used to mimic neurobiological architectures (including SNNs). This paper proposes an evaluation methodology for SNN applications from the micro-architecture perspective. First, we extract accurate SNN models from existing simulators of neural systems. Second, a cycle-accurate NoC simulator is implemented to execute these SNN applications and obtain timing and energy-consumption information. We believe this method not only benefits the exploration of the NoC design space but also bridges the gap between applications (especially those from the neuroscience community) and neuromorphic hardware. Based on the method, we have evaluated some typical SNNs in terms of timing and energy. The method is valuable for the development of neuromorphic hardware and applications.
Abstract: The evolutionary neural network (ENN) shows high performance in function optimization and in finding approximately global optima when searching large and complex spaces. It is one of the most efficient and adaptive optimization techniques and is widely used to provide candidate solutions that improve the fitness for a given problem. An ENN can search globally and learn an approximately optimal solution without gradient information about the error function. However, an ENN is computationally demanding and needs parallel processing platforms such as field programmable gate arrays (FPGAs) and graphics processing units (GPUs) to achieve good performance. This work presents several new implementations of ENN that explore and adopt different techniques and opportunities for parallel processing. Different versions of the ENN algorithm have been implemented and parallelized on an FPGA platform for low latency by exploiting parallelism and pipelining. Real data from a mass spectrometry data (MSD) application were used to examine and verify our implementations. This is an important and computationally intensive application that must search for the optimal features (peaks) in MSD in order to distinguish cancer patients from control patients. The ENN algorithm is also implemented and parallelized on single-core and GPU platforms for comparison. The computation time of our optimized algorithm on FPGA and GPU is improved by factors of 6.75 and 6, respectively.