A hardware optimization technique for mono similarity system generation is presented, based on hardware/software (HW/SW) co-design. First, the coarse structure of subgraph matching based on fully customized HW/SW co-design is put forward. Then, a universal subgraph combination method is discussed. Next, a more advanced vertex compression algorithm built on the subgraph combination method is discussed in detail. Experiments were carried out successfully, and the results verify all of the formulas and methods above.
In this paper, a software/hardware High-Level Synthesis (HLS) design is proposed to compute the Adaptive Vector Median Filter (AVMF) in real time. This filter is known for its excellent impulsive noise suppression and chromaticity conservation, but a software (SW) study of the filter shows that its implementation is too complex. The purpose of this work is to study the impact of using an HLS tool to design ideal floating-point and optimized fixed-point hardware (HW) architectures for the AVMF, using a square root function (ideal HW) and ROM memory (optimized HW), respectively; to select the best HLS architectures; and to design an efficient HLS software/hardware (SW/HW) embedded AVMF design that achieves a trade-off between processing time, power consumption and hardware cost. For that purpose, approximations using ROM memory were proposed to perform the square root and to develop a fixed-point AVMF algorithm. The best solution generated for each HLS design was then integrated into the SW/HW environment and evaluated on the ZC702 FPGA platform. The experimental results showed reductions of about 65% in power consumption and 98% in processing time for the ideal SW/HW implementation relative to the ideal SW implementation, with the same image quality. Moreover, the power consumption and processing time of the optimized SW/HW implementation are 70% and 97% lower, respectively, than those of the optimized SW implementation. In addition, the Look-Up Table (LUT) utilization percentage, power consumption and processing time of the optimized SW/HW design are improved by nearly 45%, 18% and 61%, respectively, compared with the ideal SW/HW design, with a slight decrease in image quality.
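The idea of replacing the square root with a ROM lookup can be illustrated with a minimal Python sketch. It is not the authors' HLS design: the table size, the 8-bit quantization step (`SHIFT`), the helper names and the simple threshold rule used for the "adaptive" decision are all assumptions made for illustration.

```python
import math

# Hypothetical quantization: the squared RGB distance is shifted right by
# SHIFT bits before indexing the table, so the "ROM" stays small.
SHIFT = 8
MAX_D2 = 3 * 255 * 255  # largest squared distance between two 8-bit RGB pixels
SQRT_ROM = [round(math.sqrt(i << SHIFT)) for i in range((MAX_D2 >> SHIFT) + 1)]


def rom_sqrt(d2):
    """Approximate integer square root read from the precomputed table."""
    return SQRT_ROM[d2 >> SHIFT]


def distance(p, q):
    """Approximate Euclidean distance between two RGB tuples (integers only)."""
    d2 = (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2 + (p[2] - q[2]) ** 2
    return rom_sqrt(d2)


def vector_median(window):
    """Return the pixel whose summed distance to all window pixels is smallest."""
    best, best_cost = None, None
    for p in window:
        cost = sum(distance(p, q) for q in window)
        if best_cost is None or cost < best_cost:
            best, best_cost = p, cost
    return best


def adaptive_vmf_pixel(window, center_index, threshold):
    """Simplified adaptive rule (an assumption, not the published AVMF test):
    keep the center pixel unless it lies far from the vector median."""
    center = window[center_index]
    median = vector_median(window)
    return median if distance(center, median) > threshold else center
```

For a 3x3 window made of near-gray pixels plus one red impulse at the center, `adaptive_vmf_pixel` returns one of the near-gray pixels, while a clean center pixel is passed through unchanged. The coarse table trades accuracy for memory: every squared distance below `1 << SHIFT` maps to zero, which is the kind of approximation that could explain a slight loss in image quality.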
In the context of constructing an embedded system to help visually impaired people interpret text, this paper proposes an efficient High-Level Synthesis (HLS) Hardware/Software (HW/SW) design for text extraction using the Gamma Correction Method (GCM). The GCM is a common method for extracting text from complex color images and video. The purpose of this work is to study the complexity of the GCM on the Xilinx ZCU102 FPGA board and to propose a HW implementation, as an Intellectual Property (IP) block, of the critical blocks in this method using the HLS flow, while taking the quality of the text extraction into account. This IP is integrated and connected to the ARM Cortex-A53 as a coprocessor in a HW/SW co-design context. The experimental results show that the HLS HW/SW implementation of the GCM on the ZCU102 FPGA board reduces processing time by about 89% compared with the SW implementation, while matching the text extraction quality of the SW implementation.
Human detection is important in many applications and has attracted significant attention over the last decade. Histograms of Oriented Gradients (HOG), as effective local descriptors, are used with a binary sliding window mechanism to achieve good detection performance. However, under such a framework the HOG must be computed on the order of a billion times, and a pure software implementation of the HOG computation can hardly meet real-time requirements. This study proposes a hardware architecture called the One-HOG accelerator, running on a Xilinx Spartan-6 LX-150T FPGA, that provides an efficient way to compute HOG, so that an embedded real-time HW/SW co-design platform for crowd estimation and analysis is achieved. The One-HOG accelerator consists mainly of a gradient module and a histogram module. The gradient module computes the gradient magnitude and orientation; the histogram module generates a 36-D HOG feature vector. In addition to the hardware realization, a new human classifier called Histograms-of-Oriented-Gradients AdaBoost Long-Feature-Vector (HOG-AdaBoost-LFV) is proposed to significantly decrease the number of HOG computations without sacrificing detection performance. Experimental results on three static image datasets and four video datasets demonstrate that the proposed SW/HW (software/hardware) co-design system is 13.14 times faster than the pure software computation of the Dalal algorithm.
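The two stages named in the abstract above, a gradient stage followed by a histogram stage that yields a 36-D vector, can be sketched in plain Python. This is a software illustration only, not the One-HOG hardware: the 16x16 block of 2x2 cells with 9 unsigned orientation bins (4 x 9 = 36), the hard binning without interpolation, the L2 normalization and the function names are assumptions based on the common Dalal-style layout.

```python
import math


def gradients(img):
    """Gradient stage: centered differences on a 2D grayscale list.
    Returns magnitude and unsigned orientation (degrees in [0, 180))."""
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    ang = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Borders are clamped, so edge pixels use one-sided differences.
            gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
            gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
            mag[y][x] = math.hypot(gx, gy)
            ang[y][x] = math.degrees(math.atan2(gy, gx)) % 180.0
    return mag, ang


def block_descriptor(img, cell=8, bins=9):
    """Histogram stage: 2x2 cells of `cell` x `cell` pixels, `bins` bins each.
    With the defaults this yields a 36-value, L2-normalized vector."""
    mag, ang = gradients(img)
    bin_width = 180.0 / bins
    feat = []
    for cy in range(2):
        for cx in range(2):
            hist = [0.0] * bins
            for y in range(cy * cell, (cy + 1) * cell):
                for x in range(cx * cell, (cx + 1) * cell):
                    # Hard assignment to one bin, weighted by magnitude.
                    b = int(ang[y][x] // bin_width) % bins
                    hist[b] += mag[y][x]
            feat.extend(hist)
    norm = math.sqrt(sum(v * v for v in feat)) + 1e-6
    return [v / norm for v in feat]
```

Calling `block_descriptor` on a 16x16 block that contains a single vertical edge puts all of the energy into the first orientation bin of each cell, which is a quick sanity check on the binning.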
Funding: The authors extend their appreciation to the Deanship of Scientific Research at Jouf University (Kingdom of Saudi Arabia) for funding this work through research Grant No. DSR2020-06-3663.