Journal Articles
183 articles found
A Lightweight Convolutional Neural Network with Hierarchical Multi-Scale Feature Fusion for Image Classification
1
Authors: Adama Dembele, Ronald Waweru Mwangi, Ananda Omutokoh Kube. 《Journal of Computer and Communications》 2024, No. 2, pp. 173-200 (28 pages)
Convolutional neural networks (CNNs) are widely used in image classification tasks, but their increasing model size and computation make them challenging to implement on embedded systems with constrained hardware resources. To address this issue, the MobileNetV1 network was developed, which employs depthwise convolution to reduce network complexity. MobileNetV1 employs a stride of 2 in several convolutional layers to decrease the spatial resolution of feature maps, thereby lowering computational costs. However, this stride setting can lead to a loss of spatial information, particularly affecting the detection and representation of smaller objects or finer details in images. To maintain the trade-off between complexity and model performance, a lightweight convolutional neural network with hierarchical multi-scale feature fusion based on the MobileNetV1 network is proposed. The network consists of two main subnetworks. The first subnetwork uses a depthwise dilated separable convolution (DDSC) layer to learn image features with fewer parameters, which results in a lightweight and computationally inexpensive network. Furthermore, the depthwise dilated convolution in the DDSC layer effectively expands the field of view of the filters, allowing them to incorporate a larger context. The second subnetwork is a hierarchical multi-scale feature fusion (HMFF) module that uses a parallel multi-resolution branch architecture to process the input feature map and extract its multi-scale feature information. Experimental results on the CIFAR-10, Malaria, and KvasirV1 datasets demonstrate that the proposed method is efficient, reducing the network parameters and computational cost by 65.02% and 39.78%, respectively, while maintaining the network performance compared to the MobileNetV1 baseline.
Keywords: MobileNet, image classification, lightweight convolutional neural network, depthwise dilated separable convolution, hierarchical multi-scale feature fusion
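The depthwise dilated separable convolution (DDSC) described in the abstract above combines a depthwise convolution that uses dilation to enlarge the receptive field with a 1×1 pointwise convolution. The following PyTorch sketch illustrates that general layer pattern only; the class name, channel sizes, kernel size, and dilation rate are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

class DepthwiseDilatedSeparableConv(nn.Module):
    """Illustrative sketch: depthwise convolution with dilation, followed by
    a 1x1 pointwise convolution, each with BatchNorm and ReLU."""
    def __init__(self, in_ch, out_ch, kernel_size=3, dilation=2):
        super().__init__()
        padding = dilation * (kernel_size - 1) // 2  # keep spatial size unchanged
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size,
                                   padding=padding, dilation=dilation,
                                   groups=in_ch, bias=False)
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)
        self.bn1 = nn.BatchNorm2d(in_ch)
        self.bn2 = nn.BatchNorm2d(out_ch)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        x = self.relu(self.bn1(self.depthwise(x)))   # enlarged receptive field, few params
        return self.relu(self.bn2(self.pointwise(x)))

x = torch.randn(1, 32, 56, 56)
print(DepthwiseDilatedSeparableConv(32, 64)(x).shape)  # torch.Size([1, 64, 56, 56])
```

With groups equal to the input channel count, each channel is filtered independently, which is where the parameter savings over a standard convolution come from.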
Attention Guided Multi Scale Feature Fusion Network for Automatic Prostate Segmentation
2
Authors: Yuchun Li, Mengxing Huang, Yu Zhang, Zhiming Bai. 《Computers, Materials & Continua》 SCIE, EI, 2024, No. 2, pp. 1649-1668 (20 pages)
The precise and automatic segmentation of prostate magnetic resonance imaging (MRI) images is vital for assisting doctors in diagnosing prostate diseases. In recent years, many advanced methods have been applied to prostate segmentation, but due to the variability caused by prostate diseases, automatic segmentation of the prostate presents significant challenges. In this paper, we propose an attention-guided multi-scale feature fusion network (AGMSF-Net) to segment prostate MRI images. We propose an attention mechanism for extracting multi-scale features and introduce a 3D transformer module during the transition phase from encoder to decoder to enhance global feature representation. In the decoder stage, a feature fusion module is proposed to obtain global context information. We evaluate our model on prostate MRI images acquired from a local hospital. The relative volume difference (RVD) and Dice similarity coefficient (DSC) between the automatic segmentation results and the ground truth were 1.21% and 93.68%, respectively. The performance evaluation and validation experiments demonstrate the effectiveness of our method for automatic prostate segmentation and for the quantitative evaluation of prostate volume on MRI, which is of significant clinical value.
Keywords: prostate segmentation, multi-scale attention, 3D Transformer, feature fusion, MRI
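The two metrics reported above, the Dice similarity coefficient (DSC) and the relative volume difference (RVD), have standard definitions over binary masks. Below is a minimal NumPy sketch of those definitions; the sign and percentage conventions for RVD are an assumption, and the toy masks are invented.

```python
import numpy as np

def dice_similarity(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-8) -> float:
    """DSC = 2|P ∩ G| / (|P| + |G|) for binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    return 2.0 * inter / (pred.sum() + gt.sum() + eps)

def relative_volume_difference(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-8) -> float:
    """RVD = (|P| - |G|) / |G|, expressed here as a percentage (convention assumed)."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    return 100.0 * (pred.sum() - gt.sum()) / (gt.sum() + eps)

# Toy example with two 3D masks
gt = np.zeros((8, 8, 8), dtype=np.uint8); gt[2:6, 2:6, 2:6] = 1
pred = np.zeros_like(gt); pred[2:6, 2:6, 2:5] = 1
print(dice_similarity(pred, gt), relative_volume_difference(pred, gt))
```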
Disparity estimation for multi-scale multi-sensor fusion
3
Authors: SUN Guoliang, PEI Shanshan, LONG Qian, ZHENG Sifa, YANG Rui. 《Journal of Systems Engineering and Electronics》 SCIE, CSCD, 2024, No. 2, pp. 259-274 (16 pages)
The perception module of advanced driver assistance systems plays a vital role. Perception schemes often use a single sensor for data processing and environmental perception, or adopt the information processing results of various sensors for fusion at the detection layer. This paper proposes a multi-scale and multi-sensor data fusion strategy at the front end of perception and accomplishes a multi-sensor disparity map generation scheme. A binocular stereo vision sensor composed of two cameras and a light detection and ranging (LiDAR) sensor are used to jointly perceive the environment, and a multi-scale fusion scheme is employed to improve the accuracy of the disparity map. This solution not only has the advantage of the dense perception of binocular stereo vision sensors but also benefits from the perception accuracy of LiDAR sensors. Experiments demonstrate that the proposed multi-scale multi-sensor scheme significantly improves disparity map estimation.
Keywords: stereo vision, light detection and ranging (LiDAR), multi-sensor fusion, multi-scale fusion, disparity map
Ship recognition based on HRRP via multi-scale sparse preserving method
4
Authors: YANG Xueling, ZHANG Gong, SONG Hu. 《Journal of Systems Engineering and Electronics》 SCIE, CSCD, 2024, No. 3, pp. 599-608 (10 pages)
In order to extract richer feature information of ship targets from sea clutter and to address the high-dimensional data problem, a method termed multi-scale fusion kernel sparse preserving projection (MSFKSPP) based on the maximum margin criterion (MMC) is proposed for recognizing the class of ship targets from the high-resolution range profile (HRRP). Multi-scale fusion is introduced to capture the local and detailed information in small-scale features and the global and contour information in large-scale features, helping to extract the edge information from sea clutter and further improving target recognition accuracy. The proposed method maximally preserves the multi-scale fusion sparsity of the data and maximizes class separability in the reduced dimensionality through the reproducing kernel Hilbert space. Experimental results on measured radar data show that the proposed method can effectively extract the features of ship targets from sea clutter, further reduce the feature dimensionality, and improve target recognition performance.
Keywords: ship target recognition, high-resolution range profile (HRRP), multi-scale fusion kernel sparse preserving projection (MSFKSPP), feature extraction, dimensionality reduction
Feature Fusion-Based Deep Learning Network to Recognize Table Tennis Actions
5
Authors: Chih-Ta Yen, Tz-Yun Chen, Un-Hung Chen, Guo-Chang Wang, Zong-Xian Chen. 《Computers, Materials & Continua》 SCIE, EI, 2023, No. 1, pp. 83-99 (17 pages)
A system for classifying four basic table tennis strokes using wearable devices and deep learning networks is proposed in this study. The wearable device consisted of a six-axis sensor, a Raspberry Pi 3, and a power bank. Multiple kernel sizes were used in a convolutional neural network (CNN) to evaluate their performance for extracting features. Moreover, a multi-scale CNN with two kernel sizes was used to perform feature fusion at different scales in a concatenated manner, and this CNN achieved recognition of the four table tennis strokes. Experimental data were obtained from 20 research participants who wore the sensors on the backs of their hands while performing the four strokes in a laboratory environment, and the data were used to verify the performance of the proposed models for wearable devices. Finally, the sensor and multi-scale CNN designed in this study achieved accuracy and F1 scores of 99.58% and 99.16%, respectively, for the four strokes, and the accuracy under five-fold cross-validation was 99.87%, showing that the multi-scale convolutional neural network is also robust.
Keywords: wearable devices, deep learning, six-axis sensor, feature fusion, multi-scale convolutional neural networks, action recognition
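The multi-scale CNN described above fuses features extracted with two different kernel sizes by concatenation before classifying the four stroke classes. The PyTorch sketch below shows that two-branch pattern for six-axis sensor sequences; the kernel sizes (3 and 7), channel widths, and window length are illustrative assumptions rather than the paper's configuration.

```python
import torch
import torch.nn as nn

class TwoKernelFusionCNN(nn.Module):
    """Sketch: two parallel Conv1d branches with different kernel sizes,
    concatenated (feature fusion) and classified into 4 stroke classes."""
    def __init__(self, in_ch=6, n_classes=4):
        super().__init__()
        def branch(k):
            return nn.Sequential(
                nn.Conv1d(in_ch, 32, kernel_size=k, padding=k // 2),
                nn.ReLU(),
                nn.AdaptiveAvgPool1d(1),  # global pooling over time
            )
        self.small = branch(3)   # fine-grained temporal patterns
        self.large = branch(7)   # coarser temporal patterns
        self.fc = nn.Linear(64, n_classes)

    def forward(self, x):                                      # x: (batch, 6, time)
        f = torch.cat([self.small(x), self.large(x)], dim=1)   # (batch, 64, 1)
        return self.fc(f.squeeze(-1))

logits = TwoKernelFusionCNN()(torch.randn(2, 6, 128))
print(logits.shape)  # torch.Size([2, 4])
```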
Grasp Detection with Hierarchical Multi-Scale Feature Fusion and Inverted Shuffle Residual
6
Authors: Wenjie Geng, Zhiqiang Cao, Peiyu Guan, Fengshui Jing, Min Tan, Junzhi Yu. 《Tsinghua Science and Technology》 SCIE, EI, CAS, CSCD, 2024, No. 1, pp. 244-256 (13 pages)
Grasp detection plays a critical role in robot manipulation. Mainstream pixel-wise grasp detection networks with an encoder-decoder structure receive much attention due to their good accuracy and efficiency. However, they usually transmit only the high-level features of the encoder to the decoder and neglect low-level features. Low-level features contain abundant detail information, and how to fully exploit them remains unsolved; the channel information in the high-level features is also not well mined. Inevitably, the performance of grasp detection is degraded. To solve these problems, we propose a grasp detection network with hierarchical multi-scale feature fusion and an inverted shuffle residual. Both low-level and high-level features in the encoder are first fused by the designed skip connections with an attention module, and the fused information is then propagated to the corresponding layers of the decoder for in-depth feature fusion. Such hierarchical fusion guarantees the quality of grasp prediction. Furthermore, an inverted shuffle residual module is created, in which the high-level feature from the encoder is split along the channel dimension and the resulting split features are processed in their respective branches. This differentiated processing keeps more high-dimensional channel information, which enhances the representation ability of the network. Besides, an information enhancement module is added before the encoder to reinforce the input information. The proposed method attains 98.9% and 97.8% image-wise and object-wise accuracy on the Cornell grasping dataset, respectively, and the experimental results verify the effectiveness of the method.
Keywords: grasp detection, hierarchical multi-scale feature fusion, skip connections with attention, inverted shuffle residual
Multi-Scale Feature Fusion Model for Bridge Appearance Defect Detection
7
Authors: Rong Pang, Yan Yang, Aiguo Huang, Yan Liu, Peng Zhang, Guangwu Tang. 《Big Data Mining and Analytics》 EI, CSCD, 2024, No. 1, pp. 1-11 (11 pages)
Although the Faster Region-based Convolutional Neural Network (Faster R-CNN) model has obvious advantages in defect recognition, it still cannot overcome challenging problems in bridge defect detection, such as long processing times, small targets, irregular shapes, and strong noise interference. To deal with these issues, this paper proposes a novel Multi-scale Feature Fusion (MFF) model for bridge appearance defect detection. First, the Faster R-CNN model adopts Region Of Interest (ROI) pooling, which omits the edge information of the target area, resulting in missed detections and inaccuracies in both detecting and localizing bridge defects. Therefore, this paper proposes an MFF based on regional feature Aggregation (MFF-A), which reduces the missed detection rate of bridge defect detection and improves the positioning accuracy of the target area. Second, the Faster R-CNN model is insensitive to small targets, irregular shapes, and strong noise in bridge defect detection, which results in long training times and low recognition accuracy. Accordingly, a novel Lightweight MFF (MFF-L) model for bridge appearance defect detection is proposed, which uses the lightweight network EfficientNetV2 together with a feature pyramid network and fuses multi-scale features to shorten training time and improve recognition accuracy. Finally, the effectiveness of the proposed method is evaluated on a bridge defect dataset and a public computational fluid dynamics dataset.
Keywords: defect detection, multi-scale feature fusion (MFF), Region Of Interest (ROI) alignment, lightweight network
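The MFF-A idea above targets the information loss of RoI pooling, which snaps box coordinates to the integer feature grid. The snippet below only contrasts the off-the-shelf roi_pool and roi_align operators from torchvision as a stand-in for that motivation; it is not the authors' MFF-A module, and the feature map, box coordinates, and stride of 16 are made-up values.

```python
import torch
from torchvision.ops import roi_align, roi_pool

feat = torch.randn(1, 256, 50, 50)            # backbone feature map
# One box in (batch_index, x1, y1, x2, y2) format, in input-image coordinates
boxes = torch.tensor([[0, 37.3, 18.9, 120.7, 95.2]])

# spatial_scale maps image coordinates to feature-map coordinates (e.g., stride 16)
aligned = roi_align(feat, boxes, output_size=(7, 7),
                    spatial_scale=1 / 16, sampling_ratio=2, aligned=True)
pooled = roi_pool(feat, boxes, output_size=(7, 7), spatial_scale=1 / 16)

# roi_align samples with bilinear interpolation instead of snapping to the
# integer grid, which is why it preserves more boundary detail than roi_pool.
print(aligned.shape, pooled.shape)  # torch.Size([1, 256, 7, 7]) for both
```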
Neighborhood fusion-based hierarchical parallel feature pyramid network for object detection (Cited: 3)
8
Authors: Mo Lingfei, Hu Shuming. 《Journal of Southeast University (English Edition)》 EI, CAS, 2020, No. 3, pp. 252-263 (12 pages)
In order to improve the detection accuracy of small objects, a neighborhood fusion-based hierarchical parallel feature pyramid network (NFPN) is proposed. Unlike the layer-by-layer structure adopted in the feature pyramid network (FPN) and the deconvolutional single shot detector (DSSD), where the bottom layer of the feature pyramid relies on the top layer, NFPN builds the feature pyramid with no connections between the upper and lower layers; that is, it only fuses shallow features on similar scales. NFPN is highly portable and can be embedded in many models to further boost performance. Extensive experiments on the PASCAL VOC 2007, 2012, and COCO datasets demonstrate that the NFPN-based SSD, without intricate tricks, can exceed the DSSD model in terms of detection accuracy and inference speed, especially for small objects, e.g., 4% to 5% higher mAP (mean average precision) than SSD and 2% to 3% higher mAP than DSSD. On the VOC 2007 test set, the NFPN-based SSD with 300×300 input reaches 79.4% mAP at 34.6 frame/s, and the mAP can rise to 82.9% with a multi-scale testing strategy.
Keywords: computer vision, deep convolutional neural network, object detection, hierarchical parallel feature pyramid network, multi-scale feature fusion
Assessing Landsat-8 and Sentinel-2 spectral-temporal features for mapping tree species of northern plantation forests in Heilongjiang Province, China (Cited: 3)
9
Authors: Mengyu Wang, Yi Zheng, Chengquan Huang, Ran Meng, Yong Pang, Wen Jia, Jie Zhou, Zehua Huang, Linchuan Fang, Feng Zhao. 《Forest Ecosystems》 SCIE, CSCD, 2022, No. 3, pp. 344-356 (13 pages)
Background: Accurate mapping of tree species is highly desired in the management and research of plantation forests, whose ecosystem services are currently under threat. Time-series multispectral satellite images, e.g., from Landsat-8 (L8) and Sentinel-2 (S2), have been proven useful in mapping general forest types, yet we do not know quantitatively how their spectral features (e.g., red-edge) and the temporal frequency of data acquisition (e.g., 16-day vs. 5-day) contribute to plantation forest mapping at the species level. Moreover, it is unclear to what extent the fusion of L8 and S2 will improve tree species mapping of northern plantation forests in China. Methods: We designed three sets of classification experiments (i.e., single-date, multi-date, and spectral-temporal) to evaluate the performance of L8 and S2 data for mapping keystone timber tree species in northern China. We first used seven pairs of L8 and S2 images to evaluate the performance of L8 and S2 key spectral features for separating these tree species across key growing stages. Then we extracted spectral-temporal features from all available images at different temporal frequencies of data acquisition (i.e., the L8 time series, the S2 time series, and the fusion of L8 and S2) to assess the contribution of image temporal frequency to the accuracy of tree species mapping in the study area. Results: 1) S2 outperformed L8 images in all classification experiments, with or without the red-edge bands (0.4%–3.4% and 0.2%–4.4% higher for overall accuracy and macro-F1, respectively); 2) NDTI (the ratio of SWIR1 minus SWIR2 to SWIR1 plus SWIR2) and the Tasseled Cap coefficients were the most important features in all classifications, and for the time-series experiments, the spectral-temporal features of red band-related vegetation indices were most useful; 3) increasing the temporal frequency of data acquisition can improve the overall accuracy of tree species mapping by up to 3.2% (from 90.1% using single-date imagery to 93.3% using the S2 time series), yet similar overall accuracies were achieved using the S2 time series (93.3%) and the fusion of S2 and L8 (93.2%). Conclusions: This study quantifies the contributions of L8 and S2 spectral and temporal features in mapping keystone tree species of northern plantation forests in China and suggests that, for mapping tree species in China's northern plantation forests, the benefit of increasing the temporal frequency of data acquisition can saturate quickly after using only two images from key phenological stages.
Keywords: tree species mapping, plantation forests, red-edge features, temporal frequency of data acquisition, fusion of Landsat-8 and Sentinel-2
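NDTI, defined in the abstract as the ratio of SWIR1 minus SWIR2 to SWIR1 plus SWIR2, is a simple per-pixel band ratio. A minimal NumPy sketch follows; the reflectance values are placeholders and the small epsilon is only there to avoid division by zero.

```python
import numpy as np

def ndti(swir1: np.ndarray, swir2: np.ndarray, eps: float = 1e-10) -> np.ndarray:
    """NDTI as defined in the abstract: (SWIR1 - SWIR2) / (SWIR1 + SWIR2)."""
    swir1 = swir1.astype(np.float64)
    swir2 = swir2.astype(np.float64)
    return (swir1 - swir2) / (swir1 + swir2 + eps)

# Toy surface-reflectance values (e.g., two SWIR bands scaled to 0-1)
swir1 = np.array([[0.28, 0.31], [0.25, 0.30]])
swir2 = np.array([[0.19, 0.22], [0.18, 0.21]])
print(ndti(swir1, swir2))
```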
Industrial Fusion Cascade Detection of Solder Joint
10
Authors: Chunyuan Li, Peng Zhang, Shuangming Wang, Lie Liu, Mingquan Shi. 《Computers, Materials & Continua》 SCIE, EI, 2024, No. 10, pp. 1197-1214 (18 pages)
With the remarkable advancements in machine vision research and its ever-expanding applications, scholars have increasingly focused on harnessing various vision methodologies within the industrial realm. Specifically, detecting vehicle floor welding points poses unique challenges, including high operational costs and limited portability in practical settings. To address these challenges, this paper integrates template matching and the Faster RCNN algorithm, presenting an industrial fusion cascaded solder joint detection algorithm that blends template matching with deep learning techniques. The algorithm weights and fuses the optimized features of both methodologies, enhancing the overall detection capability. Furthermore, it introduces an optimized multi-scale and multi-template matching approach, leveraging a diverse array of templates and image pyramid algorithms to bolster the accuracy and resilience of object detection. By integrating deep learning with this multi-scale and multi-template matching strategy, the cascaded target matching algorithm accurately identifies solder joint types and positions. A comprehensive welding point dataset, labeled by experts specifically for vehicle detection, was constructed from images of authentic industrial environments to validate the algorithm's performance. Experiments demonstrate compelling performance in industrial scenarios, outperforming the single-template matching algorithm by 21.3%, the multi-scale and multi-template matching algorithm by 3.4%, the Faster RCNN algorithm by 19.7%, and the YOLOv9 algorithm by 17.3% in terms of solder joint detection accuracy. The optimized algorithm exhibits remarkable robustness and portability and is well suited for detecting solder joints across diverse vehicle workpieces. Notably, this study's dataset and feature fusion approach can be a valuable resource for other algorithms seeking to enhance their solder joint detection capabilities. This work thus not only presents a novel and effective solution for industrial solder joint detection but also lays the groundwork for future advancements in this critical area.
Keywords: cascade object detection, deep learning, feature fusion, multi-scale and multi-template matching, solder joint dataset
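The multi-scale, multi-template matching stage described above can be approximated with normalized cross-correlation applied over several template scales. The OpenCV sketch below is a simplified stand-in under that assumption; the scales, score threshold, and synthetic image are invented, and the paper's image-pyramid and weighting details are not reproduced.

```python
import cv2
import numpy as np

def multiscale_multitemplate_match(image_gray, templates, scales=(0.8, 1.0, 1.25),
                                   threshold=0.8):
    """Return the best (score, top_left, (w, h)) over all templates and scales
    using normalized cross-correlation; returns None if nothing exceeds threshold."""
    best = None
    for tmpl in templates:
        for s in scales:
            t = cv2.resize(tmpl, None, fx=s, fy=s, interpolation=cv2.INTER_LINEAR)
            th, tw = t.shape[:2]
            if th > image_gray.shape[0] or tw > image_gray.shape[1]:
                continue  # template larger than the search image at this scale
            result = cv2.matchTemplate(image_gray, t, cv2.TM_CCOEFF_NORMED)
            _, max_val, _, max_loc = cv2.minMaxLoc(result)
            if max_val >= threshold and (best is None or max_val > best[0]):
                best = (max_val, max_loc, (tw, th))
    return best

# Toy example: a synthetic image with one bright square, template cropped from it
img = np.zeros((200, 200), np.uint8)
img[60:90, 100:130] = 255
template = img[55:95, 95:135].copy()
print(multiscale_multitemplate_match(img, [template]))
```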
Vibration Feature Fusion for State Evaluation of Machinery (Cited: 1)
11
Authors: 李康 林习良 胡湘江 蔡自刚. 《Journal of Donghua University (English Edition)》 EI, CAS, 2015, No. 2, pp. 244-247 (4 pages)
To overcome the problem that a single feature cannot reflect the state of machinery at different stages, a method of vibration feature fusion based on the self-organizing map (SOM) is presented. The minimum quantization error (MQE) is obtained in an unsupervised manner from the SOM network, and the trend information of the MQE curve is extracted by the wavelet packet to enhance state differentiation. An experimental platform is designed for bearing accelerated fatigue tests, and the experimental results show that the SOM-based vibration feature fusion method can effectively reflect the state of machinery at different stages.
Keywords: wavelet, overcome, organizing, quantization, accelerating, neighbor, machinery, behave, packet, restrain
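The minimum quantization error (MQE) used above is the distance from a new vibration feature vector to its best-matching unit on the trained SOM. The NumPy sketch below computes that quantity assuming the SOM codebook has already been trained; the random codebook and feature vectors here are purely illustrative.

```python
import numpy as np

def minimum_quantization_error(x: np.ndarray, codebook: np.ndarray) -> float:
    """MQE: Euclidean distance from feature vector x to its best-matching
    unit (BMU) in a trained SOM codebook of shape (rows, cols, dim)."""
    nodes = codebook.reshape(-1, codebook.shape[-1])   # flatten the map grid
    distances = np.linalg.norm(nodes - x, axis=1)
    return float(distances.min())

rng = np.random.default_rng(0)
codebook = rng.normal(size=(10, 10, 8))   # stand-in for a trained 10x10 SOM, 8-D features
healthy = rng.normal(size=8)              # feature vector similar to training data
degraded = healthy + 3.0                  # drifted vector, expected to give a larger MQE
print(minimum_quantization_error(healthy, codebook),
      minimum_quantization_error(degraded, codebook))
```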
Vehicle color recognition based on smooth modulation neural network with multi-scale feature fusion
12
Authors: Mingdi HU, Long BAI, Jiulun FAN, Sirui ZHAO, Enhong CHEN. 《Frontiers of Computer Science》 SCIE, EI, CSCD, 2023, No. 3, pp. 91-102 (12 pages)
Vehicle Color Recognition (VCR) plays a vital role in intelligent traffic management and criminal investigation assistance. However, existing vehicle color datasets cover only 13 classes, which cannot meet current practical demand. Besides, although many efforts have been devoted to VCR, they suffer from the problem of class imbalance in the datasets. To address these challenges, in this paper we propose a novel VCR method based on a Smooth Modulation Neural Network with Multi-Scale Feature Fusion (SMNN-MSFF). Specifically, to construct a benchmark for model training and evaluation, we first present a new VCR dataset with 24 vehicle classes, Vehicle Color-24, consisting of 10,091 vehicle images from 100 hours of urban road surveillance video. Then, to tackle the problem of long-tail distribution and improve recognition performance, we propose the SMNN-MSFF model with multi-scale feature fusion and smooth modulation. The former extracts feature information from local to global, and the latter increases the loss of the images of tail-class instances for training with class imbalance. Finally, a comprehensive experimental evaluation on Vehicle Color-24 and three previous representative datasets demonstrates that the proposed SMNN-MSFF outperforms state-of-the-art VCR methods. Extensive ablation studies also demonstrate that each module of the method is effective; in particular, the smooth modulation efficiently helps feature learning of the minority or tail classes. Vehicle Color-24 and the code of SMNN-MSFF are publicly available and can be obtained by contacting the authors.
Keywords: vehicle color recognition, benchmark dataset, multi-scale feature fusion, long-tail distribution, improved smooth L1 loss
Bidirectional parallel multi-branch convolution feature pyramid network for target detection in aerial images of swarm UAVs (Cited: 3)
13
Authors: Lei Fu, Wen-bin Gu, Wei Li, Liang Chen, Yong-bao Ai, Hua-lei Wang. 《Defence Technology (防务技术)》 SCIE, EI, CAS, CSCD, 2021, No. 4, pp. 1531-1541 (11 pages)
In this paper, based on a bidirectional parallel multi-branch feature pyramid network (BPMFPN), a novel one-stage object detector called BPMFPN Det is proposed for real-time detection of ground multi-scale targets by swarm unmanned aerial vehicles (UAVs). First, bidirectional parallel multi-branch convolution modules are used to construct the feature pyramid to enhance the feature expression abilities of feature layers at different scales. Next, the feature pyramid is integrated into the single-stage object detection framework to ensure real-time performance. In order to validate the effectiveness of the proposed algorithm, experiments are conducted on four datasets. On the PASCAL VOC dataset, the proposed algorithm achieves a mean average precision (mAP) of 85.4 on the VOC 2007 test set. On the detection in optical remote sensing (DIOR) dataset, it achieves 73.9 mAP. On the vehicle detection in aerial imagery (VEDAI) dataset, the detection accuracy for small land vehicle (slv) targets reaches 97.4 mAP. On the unmanned aerial vehicle detection and tracking (UAVDT) dataset, the proposed BPMFPN Det achieves 48.75 mAP. Compared with previous state-of-the-art methods, the results obtained by the proposed algorithm are more competitive. The experimental results demonstrate that the proposed algorithm can effectively solve the problem of real-time detection of ground multi-scale targets in aerial images from swarm UAVs.
Keywords: aerial images, object detection, feature pyramid networks, multi-scale feature fusion, swarm UAVs
Research Progress on Data-Driven Power Forecasting Methods for Distributed Photovoltaic Generation (Cited: 2)
14
Authors: 董明 李晓枫 杨章 常益 任明 张崇兴 焦在滨. 《电网与清洁能源》 (Power System and Clean Energy) CSCD, PKU Core, 2024, No. 1, pp. 8-17, 28 (11 pages)
From a review perspective, taking distributed photovoltaic (PV) systems as the object, this paper analyzes the development of power forecasting technology, its remaining difficulties, and the main influencing factors, and outlines the technical route for achieving accurate power forecasting with data-driven methods. Starting from influencing factors such as spatial correlation, historical output power, and meteorology, it then surveys the current state of data-driven power forecasting research for PV systems, examines the corresponding techniques of data augmentation, spatio-temporal graph information, and feature fusion, and discusses the strengths and weaknesses of each approach. Finally, research directions and development suggestions for data-driven power forecasting methods are given.
Keywords: distributed PV output characteristics, data-driven, data augmentation, spatio-temporal graph information, feature fusion
Improved YOLOv5 Object Detection Algorithm for UAV Aerial Images (Cited: 1)
15
Authors: 李校林 刘大东 刘鑫满 陈泽. 《计算机工程与应用》 (Computer Engineering and Applications) CSCD, PKU Core, 2024, No. 11, pp. 204-214 (11 pages)
To address the missed and false detections caused by diverse object scales, numerous similar objects, and object clustering in UAV aerial image object detection, an improved YOLOv5 algorithm named DA-YOLO is proposed. A multi-scale dynamic feature weighted fusion network composed of a feature-map attention generator and a dynamic weight learning module is introduced: the attention generator fuses the features most relevant to objects of different scales, and the weight learning module adaptively adjusts how features of objects at different scales are learned, which improves discriminability under diverse object scales and thus reduces missed detections. A parallel selective attention module (PSAM) is added to the feature extraction network; by dynamically fusing spatial and channel information, it strengthens feature representation, yields higher-quality feature maps, and improves the network's ability to distinguish similar objects, reducing false detections. Soft-NMS replaces the non-maximum suppression (NMS) used in YOLOv5 to alleviate missed and false detections in scenes with clustered objects. Experimental results show that the improved algorithm reaches 37.79% detection accuracy on the VisDrone dataset, 5.59 percentage points higher than YOLOv5s, making it better suited to object detection in UAV aerial images.
Keywords: UAV aerial image processing, feature-map attention generator, dynamic feature weighted fusion, attention mechanism, non-maximum suppression
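Soft-NMS, which the abstract substitutes for standard NMS in clustered-object scenes, decays the scores of overlapping boxes instead of deleting them. Below is a generic NumPy sketch of the Gaussian-decay variant; it is a reference implementation of the general algorithm, not the DA-YOLO code, and sigma and the score threshold are assumed values.

```python
import numpy as np

def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS: boxes are (N, 4) as [x1, y1, x2, y2].
    Returns indices of kept boxes in order of (decayed) score."""
    boxes, scores = boxes.astype(float), scores.astype(float).copy()
    idxs = np.arange(len(scores))
    keep = []
    while len(idxs) > 0:
        top = idxs[np.argmax(scores[idxs])]
        keep.append(int(top))
        idxs = idxs[idxs != top]
        if len(idxs) == 0:
            break
        # IoU between the top box and the remaining boxes
        x1 = np.maximum(boxes[top, 0], boxes[idxs, 0])
        y1 = np.maximum(boxes[top, 1], boxes[idxs, 1])
        x2 = np.minimum(boxes[top, 2], boxes[idxs, 2])
        y2 = np.minimum(boxes[top, 3], boxes[idxs, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area_top = (boxes[top, 2] - boxes[top, 0]) * (boxes[top, 3] - boxes[top, 1])
        area_rest = (boxes[idxs, 2] - boxes[idxs, 0]) * (boxes[idxs, 3] - boxes[idxs, 1])
        iou = inter / (area_top + area_rest - inter)
        scores[idxs] *= np.exp(-(iou ** 2) / sigma)   # decay scores instead of removing boxes
        idxs = idxs[scores[idxs] > score_thresh]
    return keep

boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]], float)
scores = np.array([0.9, 0.8, 0.7])
print(soft_nms(boxes, scores))  # all three boxes survive; the overlapping one keeps a decayed score
```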
Research on Applying the FPN Algorithm to Grasp Control of Vision-Based Robots
16
Authors: 王利祥 郭向伟 卢明星. 《机械设计与制造》 (Machinery Design & Manufacture) PKU Core, 2024, No. 4, pp. 303-307, 313 (6 pages)
For accurate grasp control of vision-based robots, a densely connected feature pyramid network (FPN) is used as the feature extractor on top of grasp pose estimation, fusing semantically stronger high-level feature maps with higher-resolution low-level feature maps. The robotic grasping process is divided into two stages: the first stage generates candidate grasp regions, and the second refines those regions to predict the grasp pose. The model is trained on the Cornell and Jacquard grasping datasets, validating the effectiveness of the proposed algorithm for grasp pose estimation. Two grasp control experiments in different real-world scenes show that the proposed model effectively improves the robot's ability to grasp objects of various sizes.
Keywords: vision-based robot, grasp pose, FPN, feature map fusion
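The densely connected FPN used above builds on the standard top-down pathway that adds upsampled, semantically strong high-level maps onto higher-resolution low-level maps through 1×1 lateral convolutions. The PyTorch sketch below shows only that plain FPN pattern, not the paper's densely connected variant; the channel sizes are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleFPN(nn.Module):
    """Plain FPN top-down pathway: 1x1 lateral convs project each backbone
    level to a common width, upsampled higher levels are added in, and a
    3x3 conv smooths each fused map."""
    def __init__(self, in_channels=(256, 512, 1024), out_channels=256):
        super().__init__()
        self.lateral = nn.ModuleList(nn.Conv2d(c, out_channels, 1) for c in in_channels)
        self.smooth = nn.ModuleList(nn.Conv2d(out_channels, out_channels, 3, padding=1)
                                    for _ in in_channels)

    def forward(self, feats):
        # feats are ordered from high resolution (shallow) to low resolution (deep)
        laterals = [l(f) for l, f in zip(self.lateral, feats)]
        for i in range(len(laterals) - 2, -1, -1):   # top-down accumulation
            laterals[i] = laterals[i] + F.interpolate(
                laterals[i + 1], size=laterals[i].shape[-2:], mode="nearest")
        return [s(p) for s, p in zip(self.smooth, laterals)]

c3 = torch.randn(1, 256, 64, 64)   # shallow, high resolution
c4 = torch.randn(1, 512, 32, 32)
c5 = torch.randn(1, 1024, 16, 16)  # deep, semantically strong
for p in SimpleFPN()((c3, c4, c5)):
    print(p.shape)
```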
A Weakly Supervised Object Localization Algorithm Based on Multi-View Fine-Grained Analysis
17
Authors: 张英俊 贾聪聪 谢斌红. 《计算机工程与设计》 (Computer Engineering and Design) PKU Core, 2024, No. 6, pp. 1750-1756 (7 pages)
To address poor localization accuracy for multi-scale objects and the difficulty of capturing complete object boundaries, a multi-view fine-grained analysis module is designed and combined with channel and spatial attention mechanisms to suppress background noise and obtain high-resolution features of multi-scale objects. A random feature selection module samples combinations of random positions in the feature map, and multiple position maps are aggregated to obtain the most discriminative position together with information from the other positions; the class activation map generated from shallow layers is then fused with the aggregated class activation map to obtain fine-grained location information and capture complete object boundaries. Compared with existing weakly supervised localization methods, the approach offers advantages in handling poor multi-scale localization and the local optimum problem.
Keywords: weakly supervised learning, object localization, multi-scale feature fusion, attention mechanism, global average pooling, class activation map, regularization
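The class activation maps fused in this work follow the usual CAM construction: a weighted sum of the final convolutional feature maps using the classifier weights of the target class. The NumPy sketch below shows that basic construction only; the shapes are invented and the paper's shallow-layer and aggregated CAM fusion is not reproduced.

```python
import numpy as np

def class_activation_map(features: np.ndarray, fc_weights: np.ndarray,
                         class_idx: int) -> np.ndarray:
    """Standard CAM: weighted sum over channels of the final conv features
    (C, H, W) using the classifier weights (num_classes, C), then min-max
    normalized to [0, 1]."""
    cam = np.tensordot(fc_weights[class_idx], features, axes=([0], [0]))  # (H, W)
    cam = np.maximum(cam, 0)                       # keep positive evidence only
    cam -= cam.min()
    return cam / (cam.max() + 1e-8)

features = np.random.rand(512, 14, 14)   # last conv block output
fc_weights = np.random.rand(10, 512)     # global-average-pooling classifier weights
print(class_activation_map(features, fc_weights, class_idx=3).shape)  # (14, 14)
```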
A Fast Anchor-Based 3D Hand Keypoint Detection Algorithm
18
Authors: 秦晓飞 何文 班东贤 郭宏宇 于景. 《电子科技》 (Electronic Science and Technology) 2024, No. 4, pp. 77-86 (10 pages)
In human-robot collaboration tasks, hand keypoint detection provides target coordinates for the manipulator. A2J (Anchor-to-Joint) is a representative anchor-based keypoint detection method: it takes a depth map as input and achieves good accuracy, but its ability to capture global features is limited. This paper designs a Global-Local Feature Fusion (GLFF) module to fuse the shallow and deep features of the backbone. To increase detection speed, the A2J backbone is replaced with a modified ShuffleNetv2 in which the 3×3 depthwise separable convolutions are replaced with 5×5 ones, enlarging the receptive field and effectively improving the backbone's global feature extraction. An Efficient Channel Attention (ECA) module is introduced into the anchor weight estimation branch to increase the network's attention to important anchors. Training and testing on the mainstream ICVL and NYU datasets show that, compared with A2J, the proposed method reduces the mean error by 0.09 mm and 0.15 mm, respectively, and reaches a detection rate of 151 frame·s^(-1) on a GTX 1080Ti GPU, meeting the real-time requirement of human-robot collaboration tasks.
Keywords: human-robot collaboration, 3D hand keypoint detection, anchor, depth map, global-local feature fusion, ShuffleNetv2, depthwise separable convolution, efficient channel attention
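Efficient Channel Attention (ECA), which the abstract adds to the anchor weight estimation branch, is usually implemented as global average pooling followed by a 1D convolution across channels and a sigmoid gate. The PyTorch sketch below shows a standard ECA-style block; the kernel size is fixed at 3 for simplicity instead of being derived adaptively from the channel count, and this is not the paper's exact module.

```python
import torch
import torch.nn as nn

class ECA(nn.Module):
    """Efficient Channel Attention: global average pool -> 1D conv across
    channels -> sigmoid gate, reweighting each channel of the input."""
    def __init__(self, kernel_size=3):
        super().__init__()
        self.conv = nn.Conv1d(1, 1, kernel_size, padding=kernel_size // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):                               # x: (B, C, H, W)
        y = x.mean(dim=(2, 3))                          # (B, C) global average pooling
        y = self.conv(y.unsqueeze(1)).squeeze(1)        # 1D conv over the channel axis
        return x * self.sigmoid(y).unsqueeze(-1).unsqueeze(-1)

x = torch.randn(2, 64, 28, 28)
print(ECA()(x).shape)  # torch.Size([2, 64, 28, 28])
```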
Face anti-spoofing based on multi-modal and multi-scale features fusion
19
Authors: Kong Chao, Ou Weihua, Gong Xiaofeng, Li Weian, Han Jie, Yao Yi, Xiong Jiahao. 《The Journal of China Universities of Posts and Telecommunications》 EI, CSCD, 2022, No. 6, pp. 73-82 (10 pages)
Face anti-spoofing is used to assist a face recognition system in judging whether a detected face is real or fake. In traditional face anti-spoofing methods, handcrafted features are used to describe the difference between living and fraudulent faces, but such features do not generalize to the many variations found in unconstrained environments. Convolutional neural network (CNN) based approaches to face spoofing achieve considerable results. However, most existing neural network-based methods simply use the network to extract single-scale features from single-modal data, ignoring multi-scale and multi-modal information. To address this problem, a novel face anti-spoofing method based on multi-modal and multi-scale feature fusion (MMFF) is proposed. Specifically, residual network (ResNet)-34 is first adopted to extract features at different scales from each modality; these features are then fused by a feature pyramid network (FPN); finally, a squeeze-and-excitation fusion (SEF) module and a self-attention network (SAN) are combined to fuse features from different modalities for classification. Experiments on the CASIA-SURF dataset show that the new MMFF-based method achieves better performance than most existing methods.
Keywords: face anti-spoofing, multi-modal fusion, multi-scale fusion, self-attention network (SAN), feature pyramid network (FPN)
Object Detection Based on Radar and Video Fusion (Cited: 1)
20
Authors: 朱勇 黄永明 何幸. 《电子科技》 (Electronic Science and Technology) 2024, No. 8, pp. 1-7 (7 pages)
Video-based object detection performs poorly in severe weather, so the video modality's shortcomings must be compensated and the detection framework made more robust. To this end, this paper designs an object detection framework based on radar and video fusion: a YOLOv5 (You Only Look Once version 5) network produces image feature maps and image detection boxes, density-based clustering produces radar detection boxes, and the radar data are encoded to obtain radar-based detection results. Finally, the two sets of detection boxes are superimposed to obtain new ROIs (Regions of Interest), and a classification vector fused with the radar information is obtained, improving detection accuracy in extreme weather. Experimental results show that the framework reaches an mAP (mean Average Precision) of 60.07% with only 7.64×10^6 parameters, indicating that it is lightweight, fast, and robust, and can be widely deployed on embedded and mobile platforms.
Keywords: sensor fusion, radar signal processing, radar feature map extraction, DBSCAN, Kalman filter, object detection, YOLOv5, R-CNN
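The radar branch described above obtains detection boxes from density-based clustering of radar returns. A hedged sketch of that step with scikit-learn's DBSCAN is shown below; the 2-D point cloud, eps, and min_samples values are invented for illustration, and the paper's radar encoding and box fusion with YOLOv5 are not included.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def radar_boxes(points_xy: np.ndarray, eps: float = 1.0, min_samples: int = 3):
    """Cluster 2-D radar points (x, y in metres) with DBSCAN and return one
    axis-aligned bounding box (x_min, y_min, x_max, y_max) per cluster."""
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(points_xy)
    boxes = []
    for lbl in set(labels) - {-1}:              # label -1 marks noise points
        cluster = points_xy[labels == lbl]
        boxes.append((*cluster.min(axis=0), *cluster.max(axis=0)))
    return boxes

rng = np.random.default_rng(1)
car = rng.normal([10.0, 2.0], 0.3, size=(12, 2))     # returns from one vehicle
truck = rng.normal([25.0, -4.0], 0.5, size=(15, 2))  # returns from another vehicle
noise = rng.uniform(-5, 40, size=(5, 2))             # spurious returns
print(radar_boxes(np.vstack([car, truck, noise])))
```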