Journal Articles
1,120 articles found
Design space exploration of neural network accelerator based on transfer learning
1
Authors: WU Yuzhang, ZHI Tian, SONG Xinkai, LI Xi. High Technology Letters (EI, CAS), 2023, No. 4, pp. 416-426
With the increasing demand for computational power in artificial intelligence (AI) algorithms, dedicated accelerators have become a necessity. However, the complexity of hardware architectures, the vast design search space, and the complex tasks accelerators must serve pose significant challenges: traditional search methods become prohibitively slow as the search space expands. A design space exploration (DSE) method based on transfer learning is proposed, which reduces the time spent on repeated training and uses multi-task models for different tasks on the same processor. The proposed method accurately predicts the latency and energy consumption associated with neural network accelerator design parameters, enabling optimal designs to be identified faster than with traditional methods, and it requires less training time than other DSE methods based on a multilayer perceptron (MLP). Comparative experiments demonstrate that the method improves the efficiency of DSE without compromising the accuracy of the results.
Keywords: design space exploration (DSE); transfer learning; neural network accelerator; multi-task learning
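The multi-task prediction idea in this abstract, a shared network with separate heads for latency and energy, can be sketched minimally. Everything below (layer sizes, the synthetic design points, the training loop) is an illustrative assumption, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

class MultiTaskMLP:
    """Shared trunk + two linear heads predicting latency and energy."""
    def __init__(self, n_in, n_hidden):
        self.W1 = rng.normal(0, 0.1, (n_in, n_hidden))   # shared trunk
        self.w_lat = rng.normal(0, 0.1, n_hidden)        # latency head
        self.w_eng = rng.normal(0, 0.1, n_hidden)        # energy head

    def forward(self, X):
        H = relu(X @ self.W1)
        return H, H @ self.w_lat, H @ self.w_eng

    def step(self, X, y_lat, y_eng, lr=0.01):
        H, p_lat, p_eng = self.forward(X)
        e_lat, e_eng = p_lat - y_lat, p_eng - y_eng
        # gradient descent on the summed MSE losses of both heads
        self.w_lat -= lr * H.T @ e_lat / len(X)
        self.w_eng -= lr * H.T @ e_eng / len(X)
        dH = (np.outer(e_lat, self.w_lat) + np.outer(e_eng, self.w_eng)) * (H > 0)
        self.W1 -= lr * X.T @ dH / len(X)
        return np.mean(e_lat ** 2) + np.mean(e_eng ** 2)

# synthetic "design points": 4 invented parameters per accelerator config
X = rng.normal(size=(256, 4))
y_lat = X @ np.array([1.0, -0.5, 0.2, 0.0]) + 0.1
y_eng = X @ np.array([0.3, 0.8, -0.1, 0.4]) - 0.2

model = MultiTaskMLP(4, 16)
losses = [model.step(X, y_lat, y_eng) for _ in range(200)]
print(losses[-1] < losses[0])  # True: the joint loss decreases
```

A transfer-learning variant in this spirit would reuse the trained trunk `W1` for a new task on the same processor and retrain only the heads.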
Effect of different interventions on orthodontic tooth movement acceleration: A network meta-analysis
2
Authors: CHEN Hui-ying, ZHANG Li, ZHAN Le, WAN Ni, MO Li-wen. Journal of Hainan Medical University (CAS), 2024, No. 2, pp. 41-50
Objective: To explore the effectiveness of various interventions in accelerating tooth movement through a systematic review and network meta-analysis. Methods: MEDLINE, EMBASE, the Wiley Library, EBSCO, Web of Science, and the Cochrane Central Register of Controlled Trials were searched to identify relevant studies. ADDIS 1.16.6 and Stata 16.0 software were used for the network meta-analysis (NMA). Results: 5,542 articles were retrieved. After screening by two independent investigators, 47 randomized controlled trials with 1,390 participants were included. A total of 11 interventions were identified: Piezocision (Piezo), photobiomodulation therapy (PBMT), platelet-rich plasma (PRP), electromagnetic field (EF), low-intensity laser therapy (LLLT), low-intensity pulsed ultrasound (LIPUS), low-frequency vibration (LFV), distraction osteogenesis (DAD), corticotomy (Corti), micro-osteoperforations (MOPs), and traditional orthodontics (OT), classified into three classes: surgical treatment, non-surgical treatment, and traditional orthodontic treatment. By SUCRA probability ranking, after 1 month of orthodontic treatment PBMT (90.6%), Piezo (87.4%), and MOPs (73.6%) were the top three interventions for improving the efficiency of canine tooth movement; after 2 months, Corti (75.7%), Piezo (69.6%), and LFV (58.9%); and after 3 months, Corti (73.3%), LLLT (68.4%), and LFV (60.8%).
Conclusion: PBMT and Piezo significantly improve the efficiency of canine tooth movement after 1 month, while Corti and LFV perform better after 2 and 3 months.
Keywords: orthodontic tooth movement; acceleration; network meta-analysis; randomized controlled trials
Design and implementation of dual-mode configurable memory architecture for CNN accelerator
3
Authors: SHAN Rui, LI Xiaoshuo, GAO Xu, HUO Ziqing. High Technology Letters (EI, CAS), 2024, No. 2, pp. 211-220
With the rapid development of deep learning algorithms, computational complexity and functional diversity are increasing rapidly, and the gap between high computational density and insufficient memory bandwidth under the traditional von Neumann architecture is worsening. Analysis of the algorithmic characteristics of convolutional neural networks (CNNs) shows that the memory access patterns of convolution (CONV) and fully connected (FC) operations are very different. Based on this observation, a dual-mode reconfigurable distributed memory architecture for a CNN accelerator is designed: it can be configured in Bank mode or first-in first-out (FIFO) mode to accommodate the access needs of different operations. A programmable memory control unit is also designed, which controls the dual-mode memory architecture through customized access instructions and reduces data access latency. The proposed architecture is verified by parallel implementations of several CNN algorithms. Experimental results show a peak bandwidth of 13.44 GB/s at an operating frequency of 120 MHz, which is 1.40, 1.12, 2.80, and 4.70 times the peak bandwidth of existing works.
Keywords: distributed memory structure; neural network accelerator; reconfigurable array processor; configurable memory structure
Construction and Simulation of a Receiving-Risk Early-Warning Model Based on an Improved GRU-DNN
4
Authors: CHEN Qingbing, ZHANG Guangdong, XU Kang, XIAO Zhimin. Microcomputer Applications, 2024, No. 5, pp. 132-135
To address the low accuracy of existing receiving-risk early-warning methods, a risk analysis method based on an improved GRU-DNN model is proposed, combining a gated recurrent unit (GRU) with deep neural networks (DNN). Fuzzy synthetic assessment (FSA) is used to select the main receiving-risk evaluation indicators, which are then fed into a GRU-DNN network, improved through adversarial training and prediction similarity, for classification. Simulation results show that the improved GRU-DNN method provides effective receiving-risk early warning, with accuracy, precision, recall, and F1 all above 86%, and that it clearly outperforms traditional early-warning methods based on DNN, convolutional neural network (CNN), and multivariable linear regression (MLR) models in predictive performance and robustness.
Keywords: risk evaluation; fuzzy clustering; GRU model; DNN
A Dataflow Optimization Method for DNNs on Bit-Level Composable Architectures
5
Authors: GAO Hanyuan, GONG Lei, WANG Teng. Computer Engineering and Applications (CSCD), 2024, No. 18, pp. 147-157
Bit-level composable architectures support neural network computation over multiple data bit widths, but their hardware structures have many variants, and each new neural network model requires additional program scheduling design. This process is time-consuming, hinders rapid hardware/software iteration and deployment, and makes results hard to evaluate, while existing dataflow modeling work lacks bit-level computation descriptions and automated methods. To solve these problems, a data scheduling optimization method for adaptive bit-level composable architectures based on dataflow modeling is proposed. Bit-level dataflow modeling is introduced, using loop primitives and tensor-index relation matrices to describe the characteristics of bit-level composable hardware and the data scheduling process of applications. Data access information is extracted from the model and data reuse is counted for fast evaluation. A design space exploration framework is built to adaptively optimize data scheduling under different application and hardware design constraints, using index matching and loop transformations for design sampling, with greedy pruning rules to improve exploration efficiency. Experiments on multiple applications and hardware structure constraints show better performance than state-of-the-art manually designed accelerators and schedules.
Keywords: neural network accelerator; variable bit width; dataflow; design space exploration
A Quality Prediction Method for Electronic Assembly Lines Based on an Improved DNN
6
Authors: XU Zheng, RUAN Xiyue. Aeronautical Computing Technique, 2024, No. 4, pp. 125-129, 134
As airborne electronic assembly modules grow more functionally diverse, finer in dimension, and more complex in components, SMT production lines face new assembly quality challenges. Current SMT production commonly suffers from weakly correlated quality inspection data, lagging quality analysis, and poor predictability, and traditional statistical methods cannot effectively extract knowledge and patterns from massive unordered data. A deep-learning-based assembly quality prediction method is therefore proposed. First, a quality evaluation scheme is constructed to determine the factors influencing assembly quality. Second, principal component analysis (PCA) is used to preprocess the quality data and remove irrelevant features. A DNN-based quality prediction model is then built, with the BFO-PSO optimization algorithm searching for the optimal number of hidden layers and nodes. Finally, simulation tests on actual manufacturing data from an aviation electronic assembly line verify the effectiveness of the proposed method.
Keywords: SMT production line; assembly quality prediction; BFO-PSO optimization algorithm; DNN; intelligent manufacturing
Research and Application of a Multi-DNN 5G Dual-Domain Private Network Mode
7
Author: RAO Liang. Changjiang Information & Communication, 2024, No. 5, pp. 172-174
As 5G ToB networks evolve, dual-domain private networks based on the uplink classifier (ULCL) are increasingly accepted by the market, especially by university customers. Their key advantage is that data never leaves the campus: users can reach both the public network and the intranet without changing terminals or SIM cards, combining intranet data confidentiality with public-network convenience. However, most current ULCL dual-domain solutions use a single DNN (Data Network Name), which brings high latency in roaming scenarios, a poor experience for latency-sensitive services, and complex core network service data configuration. This paper presents a multi-DNN dual-domain private network mode that further improves the user experience, and verifies its feasibility through a real deployment case.
Keywords: DNN; ULCL; dual-domain private network
A Survey of Accelerator Architectures for Deep Neural Networks (cited 6 times)
8
Authors: Yiran Chen, Yuan Xie, Linghao Song, Fan Chen, Tianqi Tang. Engineering (SCIE, EI), 2020, No. 3, pp. 264-274
Recently, due to the availability of big data and the rapid growth of computing power, artificial intelligence (AI) has regained tremendous attention and investment. Machine learning (ML) approaches have been successfully applied to solve many problems in academia and industry. Although the explosion of big data applications is driving the development of ML, it also imposes severe challenges of data processing speed and scalability on conventional computer systems. Computing platforms dedicatedly designed for AI applications have been considered, ranging from complements to von Neumann platforms to "must-have" stand-alone technical solutions. These platforms belong to a larger category named "domain-specific computing" and focus on specific customization for AI. This article summarizes recent advances in accelerator designs for deep neural networks (DNNs), that is, DNN accelerators. It discusses various architectures that support DNN execution in terms of computing units, dataflow optimization, targeted network topologies, architectures on emerging technologies, and accelerators for emerging applications, and offers a vision of future trends in AI chip design.
Keywords: deep neural network; domain-specific architecture; accelerator
FPGA implementation of neural network accelerator for pulse information extraction in high energy physics (cited 2 times)
9
Authors: Jun-Ling Chen, Peng-Cheng Ai, Dong Wang, Hui Wang, Ni Fang, De-Li Xu, Qi Gong, Yuan-Kang Yang. Nuclear Science and Techniques (SCIE, CAS, CSCD), 2020, No. 5, pp. 27-35
Extracting amplitude and time information from shaped pulses is an important step in nuclear physics experiments, and a neural network can be an alternative in off-line data processing. For processing data in real time and reducing the off-line data storage required per trigger event, we designed a customized neural network accelerator on a field-programmable gate array (FPGA) platform to implement specific layers of a convolutional neural network, which is then used in the front-end electronics of the detector. With fully reconfigurable hardware, a tested network structure provides accurate timing of the shaped pulses common in front-end electronics. The design handles up to four channels of pulse signals at once, with a peak performance per channel of 1.665 giga operations per second at a working frequency of 25 MHz.
Keywords: convolutional neural networks; pulse shaping; acceleration; front-end electronics
A survey of neural network accelerator with software development environments
10
Authors: Jin Song, Xuemeng Wang, Zhipeng Zhao, Wei Li, Tian Zhi. Journal of Semiconductors (EI, CAS, CSCD), 2020, No. 2, pp. 20-28
In recent years, deep learning algorithms have been widely deployed from cloud servers to terminal units, and researchers have proposed various neural network accelerators and software development environments. This article reviews representative neural network accelerators. Since the software stack must account for the hardware architecture of a specific accelerator to achieve end-to-end performance, it also summarizes the programming environments of neural network accelerators and the optimizations in their software stacks, and comments on future trends in neural network accelerators and programming environments.
Keywords: neural network accelerator; compiling optimization; programming environments
FPGA-based acceleration for binary neural networks in edge computing (cited 1 time)
11
Authors: Jin-Yu Zhan, An-Tai Yu, Wei Jiang, Yong-Jia Yang, Xiao-Na Xie, Zheng-Wei Chang, Jun-Huan Yang. Journal of Electronic Science and Technology (EI, CAS, CSCD), 2023, No. 2, pp. 65-77
As a core component of intelligent edge computing, deep neural networks (DNNs) will play an increasingly important role in addressing intelligence-related issues in industry, such as smart factories and autonomous driving. Because they require large amounts of storage and computing resources, DNNs are unfavorable for resource-constrained edge devices, especially mobile terminals with scarce energy supply. Binarization of DNNs has become a promising technique for achieving high performance with low resource consumption in edge computing, and field-programmable gate array (FPGA)-based acceleration can further improve computation efficiency several-fold compared with central processing units (CPUs) and graphics processing units (GPUs). This paper gives a brief overview of binary neural networks (BNNs) and the corresponding hardware accelerator designs for edge computing environments, and analyzes several significant studies in detail. We first give the background and typical types of BNNs, then review FPGA implementation technologies for BNNs, conduct detailed comparisons with experimental evaluation of typical BNNs and their FPGA implementations, track the latest binarization and hardware acceleration methods, and finally outline promising directions for future work.
Keywords: accelerator; binarization; field-programmable gate array (FPGA); neural networks; quantization
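The hardware economy of BNNs referred to in this abstract comes from the standard XNOR-popcount trick: with weights and activations constrained to {-1, +1} and packed as bits, a dot product needs no multipliers. A small illustrative sketch (the packing scheme here is our own, not from any surveyed accelerator):

```python
def pack(bits):
    """Pack a list of +1/-1 values into an integer, 1 encoding +1."""
    v = 0
    for b in bits:
        v = (v << 1) | (1 if b == 1 else 0)
    return v

def bin_dot(a, b, n):
    """XNOR-popcount dot product of two n-element {-1,+1} vectors."""
    xnor = ~(pack(a) ^ pack(b)) & ((1 << n) - 1)  # 1 bits mark matches
    matches = bin(xnor).count("1")
    return 2 * matches - n  # each match contributes +1, each mismatch -1

a = [1, -1, 1, 1]
b = [1, 1, -1, 1]
print(bin_dot(a, b, 4))  # 1*1 + (-1)*1 + 1*(-1) + 1*1 = 0
```

On an FPGA the same computation maps to wide XNOR gates followed by a popcount tree, which is what makes BNN accelerators so area- and energy-efficient.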
Design and Optimization of Winograd Convolution on Array Accelerator (cited 1 time)
12
Authors: Ji Lai, Lixin Yang, Dejian Li, Chongfei Shen, Xi Feng, Jizeng Wei, Yu Liu. Journal of Beijing Institute of Technology (EI, CAS), 2023, No. 1, pp. 69-81
With the rapid development and popularization of artificial intelligence technology, convolutional neural networks (CNNs) are applied in many fields, are beginning to replace most traditional algorithms, and are gradually being deployed to terminal devices. However, the huge data movement and computational complexity of CNNs bring severe power consumption and performance challenges to the hardware, which hinders the application of CNNs in embedded devices such as smartphones and smart cars. This paper implements a convolutional neural network accelerator based on the Winograd convolution algorithm on a field-programmable gate array (FPGA). First, a convolution kernel decomposition method for Winograd convolution is proposed: kernels larger than 3×3 are divided into multiple 3×3 kernels, and the resulting unsynchronized long convolution operations are handled. Then, a Winograd convolution array is designed, using configurable multipliers to flexibly support data of different precisions. Experimental results on the VGG16 and AlexNet networks show that the accelerator is highly energy efficient: 101 times that of a CPU and 5.8 times that of a GPU, and higher than other convolutional neural network accelerators.
Keywords: convolutional neural network; Winograd convolution algorithm; accelerator
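The 3×3 Winograd kernels this abstract builds on come from the minimal filtering algorithm. Below is a 1D sketch of F(2,3), which produces two outputs of a 3-tap filter with four multiplications instead of six, using the standard transform matrices; the code is purely illustrative and is not the paper's array design:

```python
import numpy as np

# Standard F(2,3) transform matrices (input, filter, and output transforms).
B_T = np.array([[1, 0, -1, 0],
                [0, 1,  1, 0],
                [0, -1, 1, 0],
                [0, 1,  0, -1]], dtype=float)
G = np.array([[1.0, 0.0, 0.0],
              [0.5, 0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0, 0.0, 1.0]])
A_T = np.array([[1, 1, 1, 0],
                [0, 1, -1, -1]], dtype=float)

def winograd_f23(d, g):
    """d: 4 input samples, g: 3 filter taps -> 2 filter outputs."""
    U = G @ g             # transform the filter (done once per kernel)
    V = B_T @ d           # transform the input tile
    return A_T @ (U * V)  # 4 elementwise multiplies + inverse transform

d = np.array([1.0, 2.0, 3.0, 4.0])
g = np.array([0.5, 1.0, -1.0])
direct = np.convolve(d, g[::-1], mode="valid")  # plain 3-tap correlation
print(np.allclose(winograd_f23(d, g), direct))  # True
```

The 2D 3×3 case nests the same transforms over rows and columns, which is why decomposing larger kernels into 3×3 tiles, as the paper does, keeps the whole array on one multiplier-efficient primitive.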
Identification of Dynamic Characteristic Parameters of Key Joints in a Ball Screw Feed System Based on DNN Whole-Machine Modeling
13
Authors: ZHU Di, ZHANG Wei, HUANG Zhiwen, ZHU Jianmin. Journal of Vibration and Shock (EI, CSCD), 2023, No. 3, pp. 243-254, 279
To address the low identification accuracy of the dynamic characteristic parameters of key joints in ball screw feed systems, a deep neural network (DNN) that captures the mapping between joint dynamic parameters and the natural frequencies of the whole machine is used to build an equivalent dynamic model of the feed system. Combining the DNN-predicted natural frequencies with experimental modal analysis values, the particle swarm optimization (PSO) algorithm identifies the stiffness and damping parameters of the key joints in different directions simultaneously. A self-designed feed system test bench serves as the case study for whole-machine modeling, experiments, and parameter identification; the final identification results reach high accuracy, showing that the method is feasible and effective.
Keywords: ball screw feed system; key joints; dynamic characteristic parameters; deep neural network (DNN); whole-machine modeling
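The PSO identification step can be sketched with a toy objective. Here the cost is a stand-in quadratic around invented "true" parameters rather than the paper's DNN-predicted-versus-measured frequency error, and the PSO coefficients are common textbook defaults, not the authors' settings:

```python
import numpy as np

rng = np.random.default_rng(1)
target = np.array([2.0, -1.0])  # invented "true" stiffness/damping values

def cost(p):
    # stand-in for the gap between predicted and measured frequencies
    return np.sum((p - target) ** 2, axis=-1)

n, dim, w, c1, c2 = 30, 2, 0.7, 1.5, 1.5
x = rng.uniform(-5, 5, (n, dim))       # particle positions
v = np.zeros((n, dim))                 # particle velocities
pbest, pbest_cost = x.copy(), cost(x)  # personal bests
g = pbest[np.argmin(pbest_cost)]       # global best

for _ in range(100):
    r1, r2 = rng.random((n, dim)), rng.random((n, dim))
    v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
    x = x + v
    c = cost(x)
    better = c < pbest_cost
    pbest[better], pbest_cost[better] = x[better], c[better]
    g = pbest[np.argmin(pbest_cost)]

print(g)  # best particle; should land near the target [2, -1]
```

In the paper's setting, evaluating `cost` would call the trained DNN surrogate instead of a closed-form expression, which is what makes the simultaneous multi-direction identification tractable.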
FPGA Optimized Accelerator of DCNN with Fast Data Readout and Multiplier Sharing Strategy (cited 1 time)
14
Authors: Tuo Ma, Zhiwei Li, Qingjiang Li, Haijun Liu, Zhongjin Zhao, Yinan Wang. Computers, Materials & Continua (SCIE, EI), 2023, No. 12, pp. 3237-3263
With the continuous development of deep learning, the deep convolutional neural network (DCNN) has attracted wide industrial attention for its high accuracy in image classification. Compared with other DCNN deployment platforms, the field-programmable gate array (FPGA) has the advantages of programmability, low power consumption, parallelism, and low cost. However, the enormous computation of DCNNs and the limited logic capacity of FPGAs restrict the energy efficiency of DCNN accelerators. The traditional sequential sliding window method can improve throughput through data multiplexing, but its multiplexing rate is low because data between rows is read repeatedly. This paper proposes a fast data readout strategy via a circular sliding window reading method, which improves the reuse of data between rows by optimizing the memory access order of the input data. In addition, the multiplication bit width of a DCNN accelerator is much smaller than that of the digital signal processing (DSP) blocks on the FPGA, so using a single DSP per multiplication wastes resources. A multiplier sharing strategy is therefore proposed: the accelerator's multipliers are customized so that a single DSP block can complete multiple groups of 4-, 6-, and 8-bit signed multiplications in parallel. Based on these two strategies, an FPGA-optimized accelerator is proposed, implemented in Verilog, and deployed on a Xilinx VCU118. On the CIFAR-10 dataset its energy efficiency is 39.98 GOPS/W, a 1.73x improvement over previous DCNN FPGA accelerators; on the ImageNet dataset its energy efficiency is 41.12 GOPS/W, 1.28x to 3.14x that of other designs.
Keywords: FPGA accelerator; DCNN; fast data readout strategy; multiplier sharing strategy; network quantization; energy efficiency
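The row-reuse gain behind such a readout strategy can be illustrated with a simple counting model: a K-row line buffer fetches each input row from external memory once, while a naive sequential sliding window re-fetches K rows per output row. The counting functions below are a simplified model of ours, not the paper's exact accounting:

```python
def naive_reads(H, W, K):
    """Sequential sliding window: every output row re-reads K input rows."""
    return (H - K + 1) * K * W

def line_buffer_reads(H, W, K):
    """K-1 rows stay resident on-chip, so each input row is read once."""
    return H * W

H, W, K = 32, 32, 3
print(naive_reads(H, W, K) / line_buffer_reads(H, W, K))  # 2.8125
```

Even on a small 32x32 feature map, keeping rows resident cuts external reads by roughly 2.8x under this model; the circular ordering in the paper pursues the same reuse without a full line-buffer restructuring.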
LACC:a hardware and software co-design accelerator for deep neural networks
15
Authors: YU Yong, ZHI Tian, ZHOU Shengyuan. High Technology Letters (EI, CAS), 2021, No. 1, pp. 62-67
With increasing data and model sizes, deep neural networks (DNNs) show outstanding performance in many artificial intelligence (AI) applications. But large model sizes make high-performance, low-power DNN execution a challenge for processors such as central processing units (CPUs), graphics processing units (GPUs), and tensor processing units (TPUs). This paper proposes LOGNN, an 8-bit data representation, and LACC, a hardware/software co-designed deep neural network accelerator, to meet this challenge. The LOGNN representation replaces multiply operations with add and shift operations when running DNNs, and LACC achieves higher efficiency than state-of-the-art DNN accelerators through domain-specific arithmetic units. LACC improves performance per watt by 1.5 times on average compared with state-of-the-art DNN accelerators.
Keywords: deep neural network (DNN); domain-specific accelerator; domain-specific data type
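The idea of trading multipliers for shifts can be sketched with a power-of-two weight encoding. The abstract does not specify LOGNN's exact 8-bit layout or rounding, so the encoding below is purely an illustrative assumption:

```python
import numpy as np

def quantize_pow2(w):
    """Round each nonzero weight to a signed power of two: (sign, exponent)."""
    sign = np.sign(w)
    exp = np.round(np.log2(np.abs(w))).astype(int)
    return sign, exp

def log_mul(x, sign, exp):
    """x * (sign * 2**exp), using a shift (ldexp) instead of a multiplier."""
    return sign * np.ldexp(x, exp)

w = np.array([0.26, -0.5, 3.7])
s, e = quantize_pow2(w)
print(s * 2.0 ** e)        # quantized weights: 0.25, -0.5, 4.0
print(log_mul(10.0, s, e))  # 2.5, -5.0, 40.0
```

In hardware, `log_mul` reduces to a barrel shifter plus sign logic, so a log-domain representation lets the accumulate datapath run with adds and shifts only.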
An FPGA-Based Resource-Saving Hardware Accelerator for Deep Neural Network
16
Authors: Han Jia, Xuecheng Zou. International Journal of Intelligence Science, 2021, No. 2, pp. 57-69
With the development of computer vision research, deep neural networks (DNNs) have achieved state-of-the-art performance on image and video processing tasks and been widely applied in various areas (autonomous vehicles, weather forecasting, counter-terrorism, surveillance, traffic management, etc.). To achieve such performance, however, DNN models have become increasingly complicated and deep, resulting in heavy computational stress that general central processing units (CPUs) cannot handle within real-time application requirements. To deal with this bottleneck, hardware acceleration for DNNs has attracted great attention, mainly focusing on acceleration under intense memory and computation resource demands. In this paper, a novel resource-saving architecture based on a field-programmable gate array (FPGA) is proposed. Thanks to a newly designed processing element (PE), the architecture achieves good performance with extremely limited computing resources, while on-chip buffer allocation enhances memory savings. Moreover, the accelerator improves its performance by exploiting the sparsity of the input feature maps. Compared with other state-of-the-art FPGA-based solutions, the architecture achieves good performance with quite limited resource consumption, fully meeting real-time application requirements.
Keywords: deep neural network; resource-saving; hardware accelerator; data flow
Hyperparameter Tuning for Deep Neural Networks Based Optimization Algorithm (cited 3 times)
17
Authors: D. Vidyabharathi, V. Mohanraj. Intelligent Automation & Soft Computing (SCIE), 2023, No. 6, pp. 2559-2573
For training present-day neural network (NN) models, the standard technique is to use decaying learning rates (LR): most such schedules start with a large LR and decay it several times over training, which has been shown to improve both generalization and optimization. Other parameters, such as network size, the number of hidden layers, dropout to avoid overfitting, batch size, and so on, are chosen purely by heuristics. This work proposes an Adaptive Teaching Learning Based (ATLB) heuristic to identify optimal hyperparameters for diverse networks, considering three deep architectures for classification: recurrent neural networks (RNN), long short-term memory (LSTM), and bidirectional long short-term memory (BiLSTM). The proposed ATLB is evaluated with several learning rate schedulers: cyclical learning rate (CLR), hyperbolic tangent decay (HTD), and toggling between hyperbolic tangent decay and triangular mode with restarts (T-HTR). Experimental results show performance improvements on the 20 Newsgroups, Reuters Newswire, and IMDB datasets.
Keywords: deep learning; deep neural network (DNN); learning rate (LR); recurrent neural network (RNN); cyclical learning rate (CLR); hyperbolic tangent decay (HTD); toggle between hyperbolic tangent decay and triangular mode with restarts (T-HTR); teaching learning based optimization (TLBO)
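Two of the schedulers named in this abstract can be sketched directly. CLR follows Smith's triangular policy; the HTD form and its bounds (lower = -6, upper = 3) follow the commonly cited formulation, but treat those constants as assumptions rather than this paper's exact settings:

```python
import math

def clr(step, base_lr=1e-4, max_lr=1e-2, half_cycle=100):
    """Triangular cyclical learning rate: ramp up then down each cycle."""
    cycle = math.floor(1 + step / (2 * half_cycle))
    x = abs(step / half_cycle - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1 - x)

def htd(step, total_steps, lr0=1e-2, lower=-6.0, upper=3.0):
    """Hyperbolic tangent decay: smooth drop from ~lr0 toward ~0."""
    t = step / total_steps
    return lr0 / 2 * (1 - math.tanh(lower + (upper - lower) * t))

print(round(clr(100, half_cycle=100), 4))  # 0.01: peak of the first cycle
print(htd(0, 1000) > htd(1000, 1000))      # True: the rate decays
```

A T-HTR-style scheduler would alternate between these two curves across restarts; the ATLB heuristic then searches over such schedule choices together with the other hyperparameters.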
A Millimeter-Wave Power Control Algorithm Based on an Unsupervised DNN and Sub-6GHz
18
Authors: SUN Changyin, MAO Yaning, JIANG Fan, WANG Junxuan. Journal of Xi'an University of Posts and Telecommunications, 2023, No. 2, pp. 29-38
To address the limited performance of supervised deep neural network (DNN) power control algorithms in millimeter-wave (mmWave) systems and the poor quality of measured mmWave channel information, a mmWave power control algorithm based on an unsupervised DNN (UDNN) and the Sub-6GHz band is proposed. The maximized system sum rate is set as the UDNN loss function, lifting the performance ceiling of supervised learning, and ensemble learning further improves the algorithm, so that the optimal mmWave power allocation is obtained directly from Sub-6GHz channel information. To verify feasibility, the proposed algorithm's system sum rate is compared with the weighted minimum mean square error, maximum power, random power, and binary exhaustive search algorithms; it achieves 1.121, 2.322, 1.843, and 1.022 times their sum rates respectively, and power control predicted from Sub-6GHz consistently approaches the performance of power control predicted from mmWave channels.
Keywords: millimeter-wave communication; unsupervised DNN; deep neural network; power allocation; system sum rate; ensemble learning
Optimizing deep learning inference on mobile devices with neural network accelerators
19
Authors: ZENG Xi, XU Yunlong, ZHI Tian. High Technology Letters (EI, CAS), 2019, No. 4, pp. 417-425
Deep learning is now widely used in intelligent apps on mobile devices, and in pursuit of ultra-low power and latency, integrating neural network accelerators (NNAs) into mobile phones has become a trend. However, conventional deep learning programming frameworks are not well developed to support such devices, leading to low computing efficiency and high memory occupation. To address this, a two-stage pipeline is proposed for optimizing deep learning model inference on mobile devices with NNAs in terms of both speed and memory footprint. The first stage reduces the computation workload via graph optimization, including splitting and merging nodes; the second stage optimizes at the compilation level, including kernel fusion and in-advance compilation. The proposed optimizations are evaluated on a commercial mobile phone with an NNA. Experimental results show that the approach achieves 2.8x to 26x speedup and reduces the memory footprint by up to 75%.
Keywords: machine learning inference; neural network accelerator (NNA); low latency; kernel fusion; in-advance compilation
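Graph-level node merging of the kind described here can be illustrated on a toy intermediate representation: cheap elementwise ops are folded into their producer so the intermediate tensor never round-trips through memory. The tiny IR and the `FUSABLE` set below are invented for illustration and are not the paper's framework:

```python
FUSABLE = {"relu", "add_bias"}  # elementwise ops safe to fold into a producer

def fuse(ops):
    """ops: list of (name, kind) nodes in topological order.
    Merge each fusable node into the node immediately before it."""
    out = []
    for name, kind in ops:
        if out and kind in FUSABLE:
            prev_name, prev_kind = out[-1]
            out[-1] = (prev_name + "+" + name, prev_kind)
        else:
            out.append((name, kind))
    return out

graph = [("conv1", "conv"), ("bias1", "add_bias"), ("relu1", "relu"),
         ("conv2", "conv"), ("relu2", "relu")]
print(fuse(graph))
# [('conv1+bias1+relu1', 'conv'), ('conv2+relu2', 'conv')]
```

Each merged node corresponds to one fused kernel launched on the NNA, which is where the memory-footprint and latency savings the abstract reports come from.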
An Experimental Study of DNN-HMM Acoustic Model Structures for Mongolian (cited 1 time)
20
Authors: LI Jinyi, MA Zhiqiang, LIU Zhiqiang, ZHU Fangyuan, WANG Hongbin. Journal of Chinese Information Processing (CSCD), 2023, No. 8, pp. 52-65
DNN-HMM is a hybrid acoustic modeling technique in speech recognition that combines a deep neural network with a hidden Markov model. To study how DNN-HMM structure affects Mongolian acoustic modeling and how the structure relates to the size of the Mongolian corpus, this paper designs the DNN part of the model and proposes four structures: Rectangle, Trapezoid, Polygon, and Hourglass DNN-HMM. Experiments on the Kaldi platform use phonemes as modeling units and build all four structures on Mongolian corpora of three sizes. The depth and width experiments show that a six-layer Polygon DNN-HMM structure suits Mongolian acoustic modeling, and that as the corpus grows, appropriately widening the acoustic model lets each layer learn richer speech features and improves recognition accuracy.
Keywords: DNN-HMM; acoustic model; deep neural network; Mongolian acoustic model