期刊文献+
共找到398篇文章
< 1 2 20 >
每页显示 20 50 100
一款基于新型Field Programmable Gate Array芯片的投影仪梯形校正系统研究与实现 被引量:5
1
作者 曹凤莲 沈庆宏 +1 位作者 盛任农 高敦堂 《南京大学学报(自然科学版)》 CAS CSCD 北大核心 2006年第4期362-367,共6页
投影设备配备的梯形校正普遍存在校正范围小,画面的一些线条和字符边缘会出现毛刺和不平滑现象,矫正效果不理想.如果采用通用的图像处理芯片和复杂的算法,可以解决上述问题,但又会导致成本急剧上升.为了解决上述矛盾,提出一种基于FPGA(F... 投影设备配备的梯形校正普遍存在校正范围小,画面的一些线条和字符边缘会出现毛刺和不平滑现象,矫正效果不理想.如果采用通用的图像处理芯片和复杂的算法,可以解决上述问题,但又会导致成本急剧上升.为了解决上述矛盾,提出一种基于FPGA(Field Programmable Gate Array)芯片的新型梯形校正实现方案,解决了校正范围与锯齿失真的矛盾问题,并为进一步成为芯片级产品铺平了道路.图像处理采用kaiser窗函数和sinc函数相结合的方法进行插值,这样的滤波器改善了旁瓣抑制,具有较好的通带性能.介绍了梯形失真的产生和校正原理,提出了利用FPGA芯片XC3S400作为核心图像处理单元的梯形校正系统的硬件和软件实现,说明了该芯片结构、功能及特性,最后提供了校正的效果图. 展开更多
关键词 图像处理 梯形校正 field programmable gate array 锯齿失真
下载PDF
A New Design Method for Variable Digital Filter Based on Field Programmable Gate Array(FPGA) 被引量:2
2
作者 胡文静 仇润鹤 李外云 《Journal of Donghua University(English Edition)》 EI CAS 2012年第2期193-196,共4页
In order to obtain variable characteristics,the digital filter's type,number of taps and coefficients should be changed constantly such that the desired frequency-domain characteristics can be obtained.This paper ... In order to obtain variable characteristics,the digital filter's type,number of taps and coefficients should be changed constantly such that the desired frequency-domain characteristics can be obtained.This paper proposes a method for self-programmable variable digital filter(VDF) design based on field programmable gate array(FPGA).We implement a digital filter system by using custom embedded micro-processor,programmable finite impulse response(P-FIR) macro module,coefficient-loader,clock manager and analog/digital(A/D) or digital/analog(D/A) controller and other modules.The self-programmable VDF can provide the best solution for realization of digital filter algorithms,which are the low-pass,high-pass,band-pass and band-stop filter algorithms with variable frequency domain characteristics.The design examples with minimum 1 to maximum 32 taps FIR filter,based on Modelsim post-routed simulation and onboard running on XUPV5-LX110T,are provided to demonstrate the effectiveness of the proposed method. 展开更多
关键词 variable digital filter(VDF) field programmable gate array(FPGA) embedded micro-processor(EMP)
下载PDF
A novel fuzzy logic direct torque controller for a permanent magnet synchronous motor with a field programmable gate array 被引量:1
3
作者 陈永军 《Journal of Chongqing University》 CAS 2008年第3期228-233,共6页
A high-performance digital servo system built on the platform of a field programmable gate array (FPGA),a fully digitized hardware design scheme of a direct torque control (DTC) and a low speed permanent magnet synchr... A high-performance digital servo system built on the platform of a field programmable gate array (FPGA),a fully digitized hardware design scheme of a direct torque control (DTC) and a low speed permanent magnet synchronous motor (PMSM) is proposed. The DTC strategy of PMSM is described with Verilog hardware description language and is employed on-chip FPGA in accordance with the electronic design automation design methodology. Due to large torque ripples in low speed PMSM,the hysteresis controller in a conventional PMSM DTC was replaced by a fuzzy controller. This FPGA scheme integrates the direct torque controller strategy,the time speed measurement algorithm,the fuzzy regulating technique and the space vector pulse width modulation principle. Experimental results indicate the fuzzy controller can provide a controllable speed at 20 r min-1 and torque at 330 N m with satisfactory dynamic and static performance. Furthermore,the results show that this new control strategy decreases the torque ripple drastically and enhances control performance. 展开更多
关键词 fuzzy control direct torque control field programmable gate array permanent magnet synchronous motor
下载PDF
Synthesis of Nonlinear Control of Switching Topologies of Buck-Boost Converter Using Fuzzy Logic on Field Programmable Gate Array (FPGA) 被引量:1
4
作者 Johnson A. Asumadu Vaidhyanathan Jagannathan Arkhom Chachavalnanont 《Journal of Intelligent Learning Systems and Applications》 2010年第1期36-42,共7页
An intelligent fuzzy logic inference pipeline for the control of a dc-dc buck-boost converter was designed and built using a semi-custom VLSI chip. The fuzzy linguistics describing the switching topologies of the conv... An intelligent fuzzy logic inference pipeline for the control of a dc-dc buck-boost converter was designed and built using a semi-custom VLSI chip. The fuzzy linguistics describing the switching topologies of the converter was mapped into a look-up table that was synthesized into a set of Boolean equations. A VLSI chip–a field programmable gate array (FPGA) was used to implement the Boolean equations. Features include the size of RAM chip independent of number of rules in the knowledge base, on-chip fuzzification and defuzzification, faster response with speeds over giga fuzzy logic inferences per sec (FLIPS), and an inexpensive VLSI chip. The key application areas are: 1) on-chip integrated controllers;and 2) on-chip co-integration for entire system of sensors, circuits, controllers, and detectors for building complete instrument systems. 展开更多
关键词 Multi-Fuzzy Logic Controller (MFLC) field programmable gate array (FPGA) BUCK-BOOST Converter BOOLEAN Look-Up TABLE CO-INTEGRATION
下载PDF
Fault Prediction and Diagnosis of Warship Equipment Field Programmable Gate Array Software
5
作者 LIU Bojiang YAN Ran +2 位作者 CHAI Haiyan HAN Xinyu TANG Longli 《Journal of Donghua University(English Edition)》 EI CAS 2018年第5期426-429,共4页
In order to solve the current high failure rate of warship equipment field programmable gate array( FPGA) software,fault detection is not timely enough and FPGA detection equipment is expensive and so on. After in-dep... In order to solve the current high failure rate of warship equipment field programmable gate array( FPGA) software,fault detection is not timely enough and FPGA detection equipment is expensive and so on. After in-depth research,this paper proposes a warship equipment FPGA software based on Xilinx integrated development environment( ISE) and ModelSim software.Functional simulation and timing simulation to verify the correctness of the logic design of the FPGA,this method is very convenient to view the signal waveform inside the FPGA program to help FPGA test engineers to achieve FPGA fault prediction and diagnosis. This test method has important engineering significance for the upgrading of warship equipment. 展开更多
关键词 field programmable gate array(FPGA) FAULT prediction DIAGNOSIS
下载PDF
Implementation of Synchronization Technology in Orthogonal Frequency Division Multiplex System Based on Field Programming Gate Array
6
作者 YI Qing-ming XIE Sheng-li 《Semiconductor Photonics and Technology》 CAS 2008年第1期32-36,共5页
In this paper,analyzed is the symbol synchronization algorithm in orthogonal frequency division multiplex(OFDM)system,and accomplished are the hardware circuit design of coarse and elaborate synchronization algorithms... In this paper,analyzed is the symbol synchronization algorithm in orthogonal frequency division multiplex(OFDM)system,and accomplished are the hardware circuit design of coarse and elaborate synchronization algorithms.Based on the analysis of coarse and elaborate synchronization algorithms,multiplexed are,the module accumulator,division and output judgement,which can evidently save the hardware resource cost.The analysis of circuit sequence and wave form simulation of the design scheme shows that the proposed method efficiently reduce system resources and power consumption. 展开更多
关键词 orthogonal frequency division multiplex timing synchronization field programmable gate array
下载PDF
Implimentations of SIMD machine using programmable gate array
7
作者 胡铭曾 《Journal of Harbin Institute of Technology(New Series)》 EI CAS 2000年第3期10-13,共4页
Field Programmable Gate Array(FPGA) and Single Instruction Multiple Data(SIMD) processing array share many architecture features. In both architectures, an array is employed to provide high speed computation. In this ... Field Programmable Gate Array(FPGA) and Single Instruction Multiple Data(SIMD) processing array share many architecture features. In both architectures, an array is employed to provide high speed computation. In this paper we show that the implementation of a Single Instruction Multiple Data (SIMD) machine the ABC 90 using the Field Programmable Gate Array (FPGA) is not completely suitable because of its characteristics. The comparison between the programmable gate arrays show that, they have many architectures features in common. Within this framework, we examine the differences and similarities between these array structures and touch upon techniques and lessons which can be done between these architectures in order to choose the appropriate Programmable gate array to implement a general purpose parallel computer. In this paper we introduce the principal of the Dynamically Programmable Date Array(DPGA) which combines the best feature of the FPGA and the SIMD arrays into a single array architecture. By the same way we show that the DPGA is more appropriate then the FPGA for wiring, hardwiring the general purpose parallel computers: SIMD and its implementation. 展开更多
关键词 field programmable gate array Single INSTRUCTION Multiple DATA Dynamically programmable DATA array
下载PDF
MIXED-GRAINED CMOS FIELD PROGRAMMABLE ANALOG ARRAY FOR SMART SENSORY APPLICATIONS
8
作者 Cheng Xiaoyan Yang Haigang +3 位作者 Yin Tao Wu Qisong Zhi Tian Liu Fei 《Journal of Electronics(China)》 2014年第2期129-142,共14页
The drive towards shorter design cycles for analog integrated circuits has given impetus to the development of Field Programmable Analog Arrays(FPAAs),which are the analogue counterparts of Field Programmable Gate Arr... The drive towards shorter design cycles for analog integrated circuits has given impetus to the development of Field Programmable Analog Arrays(FPAAs),which are the analogue counterparts of Field Programmable Gate Arrays(FPGAs).In this paper,we present a new design methodology which using FPAA as a powerful analog front-end processing platform in the smart sensory microsystem.The proposed FPAA contains 16 homogeneous mixed-grained Configurable Analog Blocks(CABs) which house a variety of processing elements especially the proposed fine-grained Core Configurable Amplifiers(CCAs).The high flexible CABs allow the FPAA operating in both continuous-time and discrete-time approaches suitable to support variety of sensors.To reduce the nonideal parasitic effects and save area,the fat-tree interconnection network is adopted in this FPAA.The functionality of this FPAA is demonstrated through embedding of voltage and capacitive sensor signal readout circuits and a configurable band pass filter.The minimal detectable voltage and capacitor achieves 38 uV and 8.3 aF respectively within 100 Hz sensor bandwidth.The power consumption comparison of CCA in three applications shows that the FPAA has high power efficiency.And the simulation results also show that the FPAA has good tolerance with wide PVT variations. 展开更多
关键词 field programmable gate array(FPGA) field programmable Analog array(FPAA) Sensor Mixed-grained Configurable Analog Block(CAB) Correlated Double Sampling(CDS)
下载PDF
时空图卷积网络的骨架识别硬件加速器设计
9
作者 谭会生 严舒琪 杨威 《电子测量技术》 北大核心 2024年第11期36-43,共8页
随着人工智能技术的不断发展,神经网络的数据规模逐渐扩大,神经网络的计算量也迅速攀升。为了减少时空图卷积神经网络的计算量,降低硬件实现的资源消耗,提升人体骨架识别时空图卷积神经网络(ST-GCN)实际应用系统的处理速度,利用现场可... 随着人工智能技术的不断发展,神经网络的数据规模逐渐扩大,神经网络的计算量也迅速攀升。为了减少时空图卷积神经网络的计算量,降低硬件实现的资源消耗,提升人体骨架识别时空图卷积神经网络(ST-GCN)实际应用系统的处理速度,利用现场可编程门阵列(FPGA),设计开发了一个基于时空图卷积神经网络的骨架识别硬件加速器。通过对原网络模型进行结构优化与数据量化,减少了FPGA实现约75%的计算量;利用邻接矩阵稀疏性的特点,提出了一种稀疏性矩阵乘加运算的优化方法,减少了约60%的乘法器资源消耗。经过对人体骨架识别实验验证,结果表明,在时钟频率100 MHz下,相较于CPU,FPGA加速ST-GCN单元,加速比达到30.53;FPGA加速人体骨架识别,加速比达到6.86。 展开更多
关键词 人体骨架识别 时空图卷积神经网络(ST-GCN) 硬件加速器 现场可编程门阵列(FPGA) 稀疏矩阵乘加运算硬件优化
下载PDF
基于异构平台的卷积神经网络加速系统设计 被引量:2
10
作者 秦文强 吴仲城 +1 位作者 张俊 李芳 《计算机工程与科学》 CSCD 北大核心 2024年第1期12-20,共9页
在计算和存储资源受限的嵌入式设备上部署卷积神经网络,存在执行速度慢、计算效率低、功耗高的问题。提出了一种基于异构平台的新型卷积神经网络加速架构,设计并实现了基于MobileNet的轻量化卷积神经网络加速系统。首先,为降低硬件资源... 在计算和存储资源受限的嵌入式设备上部署卷积神经网络,存在执行速度慢、计算效率低、功耗高的问题。提出了一种基于异构平台的新型卷积神经网络加速架构,设计并实现了基于MobileNet的轻量化卷积神经网络加速系统。首先,为降低硬件资源消耗以及数据传输成本,采用动态定点数量化和批标准化融合的设计方法,对网络模型进行了优化,并降低了加速系统的硬件设计复杂度;其次,通过实现卷积分块、并行卷积计算、数据流优化,有效提高了卷积运算效率和系统吞吐率。在PYNQ-Z2平台上的实验结果表明,此加速系统实现的MobileNet网络推理加速方案对单幅图像的识别时间为0.18 s,系统功耗为2.62 W,相较于ARM单核处理器加速效果提升了128倍。 展开更多
关键词 现场可编程门阵列(FPGA) Vivado高层次综合 卷积神经网络 异构平台 硬件加速
下载PDF
太阳耀光偏振参数星上计算系统设计
11
作者 李宇豪 纪峰 +6 位作者 裘桢炜 陈斐楠 厉卓然 池杲鋆 陈晶晶 胡亚东 李孟凡 《光电工程》 CAS CSCD 北大核心 2024年第4期52-62,共11页
太阳耀光影响光学遥感成像质量,可采用在探测器前加装偏振片的方式进行抑制,抑制效果取决于太阳和遥感器的相对位置以及偏振片的偏振方向。为实时准确地获取太阳耀光偏振信息,本文在星载大气校正仪上设计了一套星上太阳耀光偏振参数计... 太阳耀光影响光学遥感成像质量,可采用在探测器前加装偏振片的方式进行抑制,抑制效果取决于太阳和遥感器的相对位置以及偏振片的偏振方向。为实时准确地获取太阳耀光偏振信息,本文在星载大气校正仪上设计了一套星上太阳耀光偏振参数计算系统,利用大气校正仪670 nm波段的0°,60°,120°三个通道偏振图像实时计算耀光参数,并使用基于6S大气辐射传输模型的耀光参数建立晴空海洋偏振双向反射分布函数查找表,排除受云干扰的图像像元,最后利用实时探测数据进行高精度太阳耀光偏振方位角计算。系统以V5系列的现场可编程门阵列为计算平台,使用高层次综合工具进行算法的硬件实现,并在实验室内进行了实验验证。实验结果表明,系统计算偏振角误差与真实值相比在0.5°以内,在100 MHz主频时钟下一组25×25像元的数据计算时间消耗为19.47281 ms,FPGA资源使用率为41%。 展开更多
关键词 太阳耀光 现场可编程门阵列 偏振测量 硬件加速
下载PDF
Implementation of Dynamic Matrix Control on Field Programmable Gate Array
12
作者 兰建 李德伟 +1 位作者 杨楠 席裕庚 《Journal of Shanghai Jiaotong university(Science)》 EI 2011年第4期441-446,共6页
High performance computer is often required by model predictive control(MPC) systems due to the heavy online computation burden.To extend MPC to more application cases with low-cost computation facilities, the impleme... High performance computer is often required by model predictive control(MPC) systems due to the heavy online computation burden.To extend MPC to more application cases with low-cost computation facilities, the implementation of MPC controller on field programmable gate array(FPGA) system is studied.For the dynamic matrix control(DMC) algorithm,the main design idea and the implemental strategy of DMC controller are introduced based on a FPGA’s embedded system.The performance tests show that both the computation efficiency and the accuracy of the proposed controller can be satisfied due to the parallel computing capability of FPGA. 展开更多
关键词 model predictive control(MPC) dynamic matrix control(DMC) quadratic programming(QP) active set programmable logic device field programmable gate array(FPGA)
原文传递
基于FPGA的卷积神经网络和视觉Transformer通用加速器
13
作者 李天阳 张帆 +2 位作者 王松 曹伟 陈立 《电子与信息学报》 EI CAS CSCD 北大核心 2024年第6期2663-2672,共10页
针对计算机视觉领域中基于现场可编程逻辑门阵列(FPGA)的传统卷积神经网(CNN)络加速器不适配视觉Transformer网络的问题,该文提出一种面向卷积神经网络和Transformer的通用FPGA加速器。首先,根据卷积和注意力机制的计算特征,提出一种面... 针对计算机视觉领域中基于现场可编程逻辑门阵列(FPGA)的传统卷积神经网(CNN)络加速器不适配视觉Transformer网络的问题,该文提出一种面向卷积神经网络和Transformer的通用FPGA加速器。首先,根据卷积和注意力机制的计算特征,提出一种面向FPGA的通用计算映射方法;其次,提出一种非线性与归一化加速单元,为计算机视觉神经网络模型中的多种非线性和归一化操作提供加速支持;然后,在Xilinx XCVU37P FPGA上实现了加速器设计。实验结果表明,所提出的非线性与归一化加速单元在提高吞吐量的同时仅造成很小的精度损失,ResNet-50和ViT-B/16在所提FPGA加速器上的性能分别达到了589.94 GOPS和564.76 GOPS。与GPU实现相比,能效比分别提高了5.19倍和7.17倍;与其他基于FPGA的大规模加速器设计相比,能效比有明显提高,同时计算效率较对比FPGA加速器提高了8.02%~177.53%。 展开更多
关键词 计算机视觉 卷积神经网络 TRANSFORMER FPGA 硬件加速器
下载PDF
轻量级卷积神经网络的硬件加速方法
14
作者 吕文浩 支小莉 童维勤 《计算机工程与设计》 北大核心 2024年第3期699-706,共8页
为提升轻量级卷积神经网络在硬件平台的资源利用效率和推理速度,基于软硬件协同优化的思想,提出一种面向FPGA平台的轻量级卷积神经网络加速器,并针对网络结构的特性设计专门的硬件架构。与多级并行策略结合,设计一种统一的卷积层计算单... 为提升轻量级卷积神经网络在硬件平台的资源利用效率和推理速度,基于软硬件协同优化的思想,提出一种面向FPGA平台的轻量级卷积神经网络加速器,并针对网络结构的特性设计专门的硬件架构。与多级并行策略结合,设计一种统一的卷积层计算单元。为降低模型存储成本、提高加速器的吞吐量,提出一种基于可微阈值的选择性移位量化方案,使计算单元能够以硬件友好的形式执行计算。实验结果表明,在Arria 10 FPGA平台上部署的MobileNetV2加速器能够达到311 fps的推理速度,相比CPU版本实现了约9.3倍的加速比、GPU版本约3倍的加速比。在吞吐量方面,加速器能够实现98.62 GOPS。 展开更多
关键词 软硬件协同优化 现场可编程门阵列 轻量级卷积神经网络 移位量化 并行计算 硬件加速 开放式计算语言
下载PDF
基于FPGA的软硬件协同纠删码编码加速方案
15
作者 杨思捷 陈俊奇 +1 位作者 王勇 李树林 《计算机工程》 CAS CSCD 北大核心 2024年第2期224-231,共8页
纠删码容错技术已广泛应用于分布式存储系统,相较于多副本容错技术能显著降低数据存储成本,并且具有更高的数据通信可靠性和安全性,但在数据存储过程中不可避免地会引入额外的计算开销并增加编码时延,导致数据写入吞吐量降低。针对该问... 纠删码容错技术已广泛应用于分布式存储系统,相较于多副本容错技术能显著降低数据存储成本,并且具有更高的数据通信可靠性和安全性,但在数据存储过程中不可避免地会引入额外的计算开销并增加编码时延,导致数据写入吞吐量降低。针对该问题,提出一种基于现场可编程门列阵(FPGA)的纠删码编码加速方案。首先,利用FPGA的高速并行计算优势对纠删码算法进行硬件加速,并实现并行处理和时序优化。然后,针对上位机与FPGA之间因传输速率和处理速率不一致造成内存中的数据溢出问题,在FPGA上拓展了片外DDR3接口用于数据缓存,提高了通信可靠性,并利用DDR3的随机存取特点实现对数据块的分片。最后,设计基于FPGA的纠删码编码硬件加速架构进行实验验证。实验结果表明,与主流Jerasure 2.0开源纠删码库相比,该方案的数据写入吞吐量提升了2.7~93.0倍,尤其对于小文件的编码写入性能提升更为显著。 展开更多
关键词 纠删码 现场可编程门阵列 硬件加速 分布式存储 模块化设计
下载PDF
zk-SNARK中数论变换的硬件加速方法研究 被引量:2
16
作者 赵海旭 柴志雷 +2 位作者 花鹏程 王锋 丁冬 《计算机科学与探索》 CSCD 北大核心 2024年第2期538-552,共15页
简洁非交互式零知识证明能够生成长度固定的证明并快速进行验证,极大地推动了零知识证明在数字签名、区块链及分布式存储等领域的应用。但其证明的生成过程极其耗时且需要被频繁调用,其中数论变换是证明生成过程的主要运算之一。然而现... 简洁非交互式零知识证明能够生成长度固定的证明并快速进行验证,极大地推动了零知识证明在数字签名、区块链及分布式存储等领域的应用。但其证明的生成过程极其耗时且需要被频繁调用,其中数论变换是证明生成过程的主要运算之一。然而现有的通用数论变换硬件加速方法难以满足其在简洁非交互式零知识证明中大规模、高位宽的要求。针对该问题,提出一种数论变换多级流水硬件计算架构。针对高位宽计算需求对高位模运算进行优化,设计了低时延蒙哥马利模乘单元;为了加速大规模计算,通过二维子任务划分将大规模数论变换任务划分为小规模独立子任务,并通过消除数据依赖实现了子任务间计算流水;在子任务多轮蝶形运算之间采用数据重排机制,有效缓解了访存需求并实现了不同步长蝶形运算间的计算流水。所提出的数论变换计算架构可以根据现场可编程门阵列(FPGA)片上资源灵活扩展,方便部署在不同规模的FPGA上以获得最大加速效果。所提出的硬件架构使用高层次综合(HLS)开发并基于OpenCL框架在AMD Xilinx Alveo U50实现了整套异构加速系统。实验结果表明,相比于PipeZK中的数论变换加速模块,该方法获得了1.95倍的加速比;在运行当前主流的简洁非交互式零知识证明开源项目bellman时,相比于AMD Ryzen 95900X单核及12核分别获得了27.98倍和1.74倍的加速比,并分别获得了6.9倍、6倍的能效提升。 展开更多
关键词 现场可编程门阵列(FPGA) 简洁非交互式零知识证明(zk-SNARK) 模乘 数论变换 硬件加速
下载PDF
FPGA implementation of bit-stream neuron and perceptron based on sigma delta modulation
17
作者 梁勇 王志功 +1 位作者 孟桥 郭晓丹 《Journal of Southeast University(English Edition)》 EI CAS 2012年第3期282-286,共5页
To solve the excessive huge scale problem of the traditional multi-bit digital artificial neural network(ANN) hardware implementation methods,a bit-stream ANN hardware implementation method based on sigma delta(Σ... To solve the excessive huge scale problem of the traditional multi-bit digital artificial neural network(ANN) hardware implementation methods,a bit-stream ANN hardware implementation method based on sigma delta(ΣΔ) modulation is presented.The bit-stream adder,multiplier,threshold function unit and fully digital ΣΔ modulator are implemented in a field programmable gate array(FPGA),and these bit-stream arithmetical units are employed to build the bit-stream artificial neuron.The function of the bit-stream artificial neuron is verified through the realization of the logic function and a linear classifier.The bit-stream perceptron based on the bit-stream artificial neuron with the pre-processed structure is proved to have the ability of nonlinear classification.The FPGA resource utilization of the bit-stream artificial neuron shows that the bit-stream ANN hardware implementation method can significantly reduce the demand of the ANN hardware resources. 展开更多
关键词 bit-stream artificial neuron PERCEPTRON sigma delta field programmable gate array(FPGA)
下载PDF
Machine learning algorithm partially reconfigured on FPGA for an image edge detection system
18
作者 Gracieth Cavalcanti Batista Johnny Oberg +3 位作者 Osamu Saotome Haroldo F.de Campos Velho Elcio Hideiti Shiguemori Ingemar Soderquist 《Journal of Electronic Science and Technology》 EI CAS CSCD 2024年第2期48-68,共21页
Unmanned aerial vehicles(UAVs)have been widely used in military,medical,wireless communications,aerial surveillance,etc.One key topic involving UAVs is pose estimation in autonomous navigation.A standard procedure for... Unmanned aerial vehicles(UAVs)have been widely used in military,medical,wireless communications,aerial surveillance,etc.One key topic involving UAVs is pose estimation in autonomous navigation.A standard procedure for this process is to combine inertial navigation system sensor information with the global navigation satellite system(GNSS)signal.However,some factors can interfere with the GNSS signal,such as ionospheric scintillation,jamming,or spoofing.One alternative method to avoid using the GNSS signal is to apply an image processing approach by matching UAV images with georeferenced images.But a high effort is required for image edge extraction.Here a support vector regression(SVR)model is proposed to reduce this computational load and processing time.The dynamic partial reconfiguration(DPR)of part of the SVR datapath is implemented to accelerate the process,reduce the area,and analyze its granularity by increasing the grain size of the reconfigurable region.Results show that the implementation in hardware is 68 times faster than that in software.This architecture with DPR also facilitates the low power consumption of 4 mW,leading to a reduction of 57%than that without DPR.This is also the lowest power consumption in current machine learning hardware implementations.Besides,the circuitry area is 41 times smaller.SVR with Gaussian kernel shows a success rate of 99.18%and minimum square error of 0.0146 for testing with the planning trajectory.This system is useful for adaptive applications where the user/designer can modify/reconfigure the hardware layout during its application,thus contributing to lower power consumption,smaller hardware area,and shorter execution time. 展开更多
关键词 Dynamic partial reconfiguration(DPR) field programmable gate array(FPGA)implementation Image edge detection Support vector regression(SVR) Unmanned aerial vehicle(UAV) pose estimation
下载PDF
FPGA平台上动态硬件重构的Winograd神经网络加速器
19
作者 梅冰笑 滕文彬 +3 位作者 张弛 王文浩 李富强 苑福利 《计算机工程与应用》 CSCD 北大核心 2024年第22期323-334,共12页
为解决卷积神经网络在FPGA平台上进行硬件加速时存在的资源利用率低和资源受限问题,提出了一种基于FPGA动态部分重构技术和Winograd快速卷积的卷积神经网络加速器。该加速器通过运行时硬件重构对FPGA片上资源进行时分复用,采用流水线方... 为解决卷积神经网络在FPGA平台上进行硬件加速时存在的资源利用率低和资源受限问题,提出了一种基于FPGA动态部分重构技术和Winograd快速卷积的卷积神经网络加速器。该加速器通过运行时硬件重构对FPGA片上资源进行时分复用,采用流水线方式动态地将各个计算流水段配置到FPGA,各个流水段所对应的卷积计算核心使用Winograd算法进行定制优化,以在解决资源受限问题的同时最大程度地提升计算资源利用效率。针对该加速器架构,进一步构建了组合优化模型,用于搜索在特定FPGA硬件平台上部署特定网络模型的最优并行策略,并使用遗传算法进行设计空间求解。基于Xilinx VC709 FPGA平台对VGG-16网络模型进行部署和分析,综合仿真结果表明,所提出的设计方法能够在资源有限的FPGA上自适应地实现大型神经网络模型,加速器整体性能可以达到1078.3 GOPS,较以往加速器的性能和计算资源利用效率可以分别提升2.2倍和3.62倍。 展开更多
关键词 卷积神经网络 动态部分硬件重构 现场可编程门阵列(FPGA) 硬件加速器 Winograd快速卷积
下载PDF
APPLICATION AND IMPLEMENTATION OF REAL-TIME IMAGE ZOOMING WITH THE WALSH TECHNIQUE
20
作者 江洁 郁道银 《Transactions of Tianjin University》 EI CAS 2001年第4期229-232,共4页
Based on the study of Walsh transformation,the zooming template of a two dimensional superimposure filter is decomposed and simplified,and it is real time implemented with FPGA.This method is simple and effective.Th... Based on the study of Walsh transformation,the zooming template of a two dimensional superimposure filter is decomposed and simplified,and it is real time implemented with FPGA.This method is simple and effective.The quality of the image is very good. 展开更多
关键词 image zooming two dimensional superimposure filter field programmable gate array (FPGA)
下载PDF
上一页 1 2 20 下一页 到第
使用帮助 返回顶部