874 articles found
Two-Layer Attention Feature Pyramid Network for Small Object Detection
1
Authors: Sheng Xiang, Junhao Ma, Qunli Shang, Xianbao Wang, Defu Chen. 《Computer Modeling in Engineering & Sciences》 SCIE EI, 2024, No. 10, pp. 713-731 (19 pages)
Effective small object detection is crucial in various applications, including urban intelligent transportation and pedestrian detection. However, small objects are difficult to detect accurately because they contain less information. Many current methods, particularly those based on the Feature Pyramid Network (FPN), address this challenge by leveraging multi-scale feature fusion. However, existing FPN-based methods often suffer from inadequate feature fusion due to varying resolutions across different layers, leading to suboptimal small object detection. To address this problem, we propose the Two-layer Attention Feature Pyramid Network (TA-FPN), featuring two key modules: the Two-layer Attention Module (TAM) and the Small Object Detail Enhancement Module (SODEM). TAM uses the attention module to make the network focus more on the semantic information of the object and fuses it into the lower layer, so that each layer contains similar semantic information; this alleviates the problem of small object information being submerged due to semantic gaps between different layers. At the same time, SODEM is introduced to strengthen the local features of the object, suppress background noise, enhance the detail information of small objects, and fuse the enhanced features into the other feature layers, so that each layer is rich in small object information and small object detection accuracy improves. Our extensive experiments on challenging datasets such as Microsoft Common Objects in Context (MSCOCO) and Pattern Analysis, Statistical Modelling and Computational Learning Visual Object Classes (PASCAL VOC) demonstrate the validity of the proposed method. Experimental results show a significant improvement in small object detection accuracy compared to state-of-the-art detectors.
Keywords: small object detection; two-layer attention module; small object detail enhancement module; feature pyramid network
Underwater Acoustic Signal Noise Reduction Based on a Fully Convolutional Encoder-Decoder Neural Network
2
Authors: SONG Yongqiang, CHU Qian, LIU Feng, WANG Tao, SHEN Tongsheng. 《Journal of Ocean University of China》 SCIE CAS CSCD, 2023, No. 6, pp. 1487-1496 (10 pages)
Noise reduction analysis of signals is essential for modern underwater acoustic detection systems. Traditional noise reduction techniques gradually lose efficacy because the target signal is masked by biological and natural noise in the marine environment. A feature extraction method combining time-frequency spectrograms and deep learning can effectively separate noise from target signals. A fully convolutional encoder-decoder neural network (FCEDN) is proposed to address noise reduction in underwater acoustic signals. During denoising, the time-domain waveform map of the underwater acoustic signal is converted into a wavelet low-frequency analysis recording spectrogram to preserve as many underwater acoustic signal characteristics as possible. The FCEDN is built to learn the spectrogram mapping between noise and target signals that can be learned at each time level. Transposed convolution transforms are introduced, which can transform the spectrogram features of the signals into listenable audio files. Evaluated on the ShipsEar dataset, the proposed method increases SNR and SI-SNR by 10.02 dB and 9.5 dB, respectively.
Keywords: deep learning; convolutional encoder-decoder neural network; wavelet low-frequency analysis recording spectrogram
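The SNR and SI-SNR figures quoted in the abstract above are standard signal-quality metrics. The following is a minimal sketch of how they are commonly defined, not the paper's evaluation code; the function names and the toy signals are hypothetical.

```python
import math

def snr_db(target, estimate):
    """Signal-to-noise ratio in dB: target power over error power."""
    signal_power = sum(t * t for t in target)
    noise_power = sum((t - e) ** 2 for t, e in zip(target, estimate))
    return 10.0 * math.log10(signal_power / noise_power)

def si_snr_db(target, estimate):
    """Scale-invariant SNR in dB (inputs are assumed to be zero-mean)."""
    dot = sum(t * e for t, e in zip(target, estimate))
    target_energy = sum(t * t for t in target)
    # Project the estimate onto the target to remove any gain mismatch.
    projection = [dot / target_energy * t for t in target]
    residual = [e - p for e, p in zip(estimate, projection)]
    projection_energy = sum(p * p for p in projection)
    residual_energy = sum(r * r for r in residual)
    return 10.0 * math.log10(projection_energy / residual_energy)

# Toy example: a clean tone plus a small perturbation (hypothetical data).
clean = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
perturbation = [0.1, -0.1, 0.1, 0.1, -0.1, 0.1, -0.1, -0.1]
noisy = [c + n for c, n in zip(clean, perturbation)]
louder = [2.0 * x for x in noisy]  # same estimate, different gain
```

Rescaling the estimate changes `snr_db` but leaves `si_snr_db` unchanged, which is why SI-SNR is often reported alongside plain SNR.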
Configuration Synthesis and Lightweight Networking of Deployable Mechanism Based on a Novel Pyramid Module
3
Authors: Jinwei Guo, Jianliang He, Guoxing Zhang, Yongsheng Zhao, Yundou Xu. 《Chinese Journal of Mechanical Engineering》 SCIE EI CAS CSCD, 2023, No. 4, pp. 294-310 (17 pages)
Deployable mechanisms with good deployable performance, strong expansibility, and light weight have attracted much attention because of their potential in aerospace. A basic deployable pyramid unit with good deployability and expandability is proposed to construct a sizeable deployable mechanism. Firstly, the folding principle and expansion method of the basic unit are proposed. The configuration synthesis method of adding constraint chains to a spatial closed-loop mechanism is used to synthesize the basic unit. Then, the degree of freedom of the basic unit is analyzed using screw theory and the link dismantling method. Next, three-dimensional models of the pyramid unit, expansion unit, and array unit are established, and a folding motion simulation analysis is carried out. Based on the number of components, weight reduction rate, and deployable rate, the performance characteristics of the three types of mechanisms are described in detail. Finally, prototypes of the pyramid unit, combination unit, and expansion unit are developed to further verify the correctness of the pyramid-based configuration synthesis. The proposed deployable mechanism provides a reference for the design and application of antennas with a large aperture, high deployable rate, and light weight. It has good application prospects in the aerospace field.
Keywords: pyramid unit; deployable mechanism; configuration synthesis; structural design; lightweight networking
PAN-DeSpeck: A Lightweight Pyramid and Attention-Based Network for SAR Image Despeckling
4
Authors: Saima Yasmeen, Muhammad Usman Yaseen, Syed Sohaib Ali, Moustafa M. Nasralla, Sohaib Bin Altaf Khattak. 《Computers, Materials & Continua》 SCIE EI, 2023, No. 9, pp. 3671-3689 (19 pages)
SAR images commonly suffer from speckle noise, posing a significant challenge in their analysis and interpretation. Existing convolutional neural network (CNN) based despeckling methods have shown great performance in removing speckle noise. However, these CNN-based methods have a few limitations. They do not decouple complex background information in a multi-resolution manner. Moreover, they have deep network structures that may result in many parameters, limiting their applicability to mobile devices. Furthermore, extracting key speckle information in the presence of complex background is also a major problem with SAR. The proposed study addresses these limitations by introducing a lightweight pyramid and attention-based despeckling (PAN-DeSpeck) network. The primary objective is to enhance image quality and enable improved information interpretation, particularly on mobile devices and in scenarios involving complex backgrounds. The PAN-DeSpeck network leverages domain-specific knowledge and integrates Gaussian Laplacian image pyramid decomposition for multi-resolution image analysis. With this approach, complex background information can be effectively decoupled, leading to enhanced despeckling performance. Furthermore, the attention mechanism selectively focuses on key speckle features and facilitates complex background removal. The network incorporates recursive and residual blocks to ensure computational efficiency and accelerate training, making it lightweight while maintaining high performance. Comprehensive evaluations demonstrate that PAN-DeSpeck outperforms existing image restoration methods. With an average peak signal-to-noise ratio (PSNR) of 28.355114 and a structural similarity index (SSIM) of 0.905467, it effectively reduces speckle noise in SAR images. The source code for the PAN-DeSpeck network is available on GitHub.
Keywords: Synthetic Aperture Radar (SAR); SAR image despeckling; speckle noise; deep learning; pyramid networks; multiscale image despeckling
Deep Pyramidal Residual Network for Indoor-Outdoor Activity Recognition Based on Wearable Sensor
5
Authors: Sakorn Mekruksavanich, Narit Hnoohom, Anuchit Jitpattanakul. 《Intelligent Automation & Soft Computing》 SCIE, 2023, No. 9, pp. 2669-2686 (18 pages)
Recognition of human activity is one of the most exciting aspects of time-series classification, with substantial practical and theoretical implications. Recent evidence indicates that activity recognition from wearable sensors is an effective technique for tracking elderly adults and children in indoor and outdoor environments. Consequently, researchers have shown considerable interest in developing cutting-edge deep learning systems capable of exploiting unprocessed sensor data from wearable devices and generating practical decision assistance in many contexts. This study provides a deep learning-based approach for recognizing indoor and outdoor movement utilizing an enhanced deep pyramidal residual model called SenPyramidNet and motion information from wearable sensors (accelerometer and gyroscope). The suggested technique develops a residual unit based on a deep pyramidal residual network and introduces the concept of a pyramidal residual unit to increase detection capability. The proposed deep learning-based model was assessed using the publicly available 19Nonsens dataset, which gathered motion signals from various indoor and outdoor activities, including practicing various body parts. The experimental findings demonstrate that the proposed approach can efficiently reuse characteristics and achieves an identification accuracy of 96.37% for indoor and 97.25% for outdoor activity. Moreover, comparison experiments demonstrate that SenPyramidNet surpasses other cutting-edge deep learning models in terms of accuracy and F1-score. Furthermore, this study explores the influence of several wearable sensors on indoor and outdoor action recognition ability.
Keywords: human activity recognition; deep learning; wearable sensors; indoor and outdoor activity; deep pyramidal residual network
A Road Extraction Method for Remote Sensing Image Based on Encoder-Decoder Network (Cited by: 23)
6
Authors: Hao HE, Shuyang WANG, Shicheng WANG, Dongfang YANG, Xing LIU. 《Journal of Geodesy and Geoinformation Science》, 2020, No. 2, pp. 16-25 (10 pages)
According to the characteristics of road features, an Encoder-Decoder deep semantic segmentation network is designed for road extraction from remote sensing images. Firstly, as the features of the road target are rich in local details and simple in semantic features, an Encoder-Decoder network with shallow layers and high resolution is designed to improve the ability to represent detail information. Secondly, as the road area occupies a small proportion of remote sensing images, the cross-entropy loss function is improved to address the imbalance between positive and negative samples during training. Experiments on large road extraction datasets show that the proposed method achieves a recall of 83.9%, a precision of 82.5%, and an F1-score of 82.9%, and can extract the road targets in remote sensing images completely and accurately. The Encoder-Decoder network designed in this paper performs well in the road extraction task and requires less manual intervention, so it has good application prospects.
Keywords: remote sensing; road extraction; deep learning; semantic segmentation; encoder-decoder network
NFHP-RN: A Method of Few-Shot Network Attack Detection Based on the Network Flow Holographic Picture-ResNet
7
Authors: Tao Yi, Xingshu Chen, Mingdong Yang, Qindong Li, Yi Zhu. 《Computer Modeling in Engineering & Sciences》 SCIE EI, 2024, No. 7, pp. 929-955 (27 pages)
Due to the rapid evolution of Advanced Persistent Threat (APT) attacks, the emergence of new and rare attack samples, and even those never seen before, makes it challenging for traditional rule-based detection methods to extract universal rules for effective detection. With the progress in techniques such as transfer learning and meta-learning, few-shot network attack detection has progressed. However, challenges in few-shot network attack detection arise from the inability of time sequence flow features to adapt to the fixed-length input requirement of deep learning, difficulties in capturing rich information from the original flow when samples are insufficient, and the challenge of high-level abstract representation. To address these challenges, a few-shot network attack detection method based on NFHP (Network Flow Holographic Picture)-RN (ResNet) is proposed. Specifically, leveraging inherent properties of images such as translation invariance, rotation invariance, scale invariance, and illumination invariance, network attack traffic features and contextual relationships are intuitively represented in NFHP. In addition, an improved RN network model is employed for high-level abstract feature extraction, ensuring that the extracted high-level abstract features maintain the detailed characteristics of the original traffic behavior, regardless of changes in background traffic. Finally, a meta-learning model based on the self-attention mechanism is constructed, achieving the detection of novel APT few-shot network attacks through the empirical generalization of high-level abstract feature representations of known-class network attack behaviors. Experimental results demonstrate that the proposed method can learn high-level abstract features of network attacks across different traffic detail granularities. Compared with state-of-the-art methods, it achieves favorable accuracy, precision, recall, and F1 scores for the identification of unknown-class network attacks through cross-validation on multiple datasets.
Keywords: APT attacks; spatial pyramid pooling; NFHP (network flow holographic picture); ResNet; self-attention mechanism; meta-learning
Classification of Arrhythmia Based on Convolutional Neural Networks and Encoder-Decoder Model
8
Authors: Jian Liu, Xiaodong Xia, Chunyang Han, Jiao Hui, Jim Feng. 《Computers, Materials & Continua》 SCIE EI, 2022, No. 10, pp. 265-278 (14 pages)
As a common and high-risk type of disease, heart disease seriously threatens people's health. At the same time, in the era of the Internet of Things (IoT), smart medical devices have strong practical significance for medical workers and patients because of their ability to assist in the diagnosis of diseases. Therefore, research on real-time diagnosis and classification algorithms for arrhythmia can help to improve diagnostic efficiency. In this paper, we design an automatic arrhythmia classification model based on a Convolutional Neural Network (CNN) and an Encoder-Decoder model. The model uses Long Short-Term Memory (LSTM) to consider the influence of time series features on classification results. It is trained and tested on the MIT-BIH arrhythmia database. In addition, Generative Adversarial Networks (GAN) are adopted as a data equalization method to solve the data imbalance problem. The simulation results show that for inter-patient arrhythmia classification, the hybrid model combining CNN and the Encoder-Decoder model has the best classification accuracy, reaching 94.05%. In particular, it performs better on the classification of supraventricular ectopic beats (class S) and fusion beats (class F).
Keywords: electroencephalography; convolutional neural network; long short-term memory; encoder-decoder model; generative adversarial network
Multi-scale object detection by top-down and bottom-up feature pyramid network (Cited by: 13)
9
Authors: ZHAO Baojun, ZHAO Boya, TANG Linbo, WANG Wenzheng, WU Chen. 《Journal of Systems Engineering and Electronics》 SCIE EI CSCD, 2019, No. 1, pp. 1-12 (12 pages)
With advances in object detection technology, especially deep neural networks, many related tasks, such as medical applications and industrial automation, have achieved great success. However, the detection of objects with multiple aspect ratios and scales is still a key problem. This paper proposes a top-down and bottom-up feature pyramid network (TDBU-FPN), which combines multi-scale feature representation and anchor generation at multiple aspect ratios. First, in order to build the multi-scale feature map, this paper puts a number of fully convolutional layers after the backbone. Second, to link neighboring feature maps, top-down and bottom-up flows are adopted to introduce context information via the top-down flow and supplement suboriginal information via the bottom-up flow. The top-down flow refers to the deconvolution procedure, and the bottom-up flow refers to the pooling procedure. Third, the problem of adapting to different object aspect ratios is tackled via many anchor shapes with different aspect ratios on each multi-scale feature map. The proposed method is evaluated on the pattern analysis, statistical modeling and computational learning visual object classes (PASCAL VOC) dataset and reaches an accuracy of 79%, which is a 1.8% improvement, with a detection speed of 23 fps.
Keywords: convolutional neural network (CNN); feature pyramid network (FPN); object detection; deconvolution
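The third step in the abstract above, anchor shapes with different aspect ratios on each feature map, can be illustrated with a short sketch. This is a generic anchor-shape generator under stated assumptions, not TDBU-FPN's implementation: the function name, the height/width ratio convention, and the example sizes are all hypothetical.

```python
import math

def anchor_shapes(base_size, aspect_ratios):
    """Return (width, height) pairs that all cover an area of base_size**2.

    Each ratio is interpreted as height / width (an assumed convention).
    """
    shapes = []
    for ratio in aspect_ratios:
        width = base_size / math.sqrt(ratio)
        height = base_size * math.sqrt(ratio)
        shapes.append((width, height))
    return shapes

# One set of anchors per pyramid level; coarser levels get larger anchors.
level_sizes = [32, 64, 128]     # hypothetical base sizes
ratios = [0.5, 1.0, 2.0]        # wide, square, tall
anchors_per_level = {size: anchor_shapes(size, ratios) for size in level_sizes}
```

Keeping the area fixed while varying the ratio means each feature-map level still targets one object scale, and only the box shape changes.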
Dual Attention Based Feature Pyramid Network (Cited by: 4)
10
Authors: Huijun Xing, Shuai Wang, Dezhi Zheng, Xiaotong Zhao. 《China Communications》 SCIE CSCD, 2020, No. 8, pp. 242-252 (11 pages)
Object detection is an essential part of research on scenarios such as autonomous driving and pedestrian detection. Among the many types of target objects, the identification of small-scale objects faces significant challenges. We introduce a new feature pyramid framework called the Dual Attention based Feature Pyramid Network (DAFPN), which is designed to address the difficulty of multi-scale object recognition. In DAFPN, the attention mechanism is introduced when calculating the top-down pathway and the lateral pathway, where spatial attention and channel attention participate, respectively, so that the pyramidal feature maps are generated with enhanced spatial and channel interdependencies, which bring more semantic information to the feature pyramid. The experiments use the COCO data set, which contains a considerable number of small-scale objects. The results verify the improved performance of DAFPN compared with the original Feature Pyramid Network (FPN), specifically for small-scale identification. The proposed DAFPN is promising for object detection in an era full of intelligent machines that need to detect multi-scale objects.
Keywords: object detection; convolutional neural networks; feature pyramid
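The top-down and lateral pathways mentioned in the abstract above come from the original FPN design, in which each coarser map is upsampled and added to the next finer one. The sketch below shows only that merging step on 1-D lists; it is an illustration under simplifying assumptions, not DAFPN. The learned lateral and smoothing convolutions of a real FPN, and the spatial and channel attention that DAFPN adds, are omitted, and all names are hypothetical.

```python
def upsample_nearest(feature, factor=2):
    """Repeat every value `factor` times (nearest-neighbor upsampling)."""
    return [value for value in feature for _ in range(factor)]

def top_down_merge(bottom_up_features):
    """Merge features from coarse to fine.

    bottom_up_features is ordered finest first; each level is assumed to be
    exactly half the length of the previous one.
    """
    merged = [list(bottom_up_features[-1])]  # coarsest level passes through
    for lateral in reversed(bottom_up_features[:-1]):
        upsampled = upsample_nearest(merged[0])
        merged.insert(0, [l + u for l, u in zip(lateral, upsampled)])
    return merged

# Toy backbone outputs at three resolutions (hypothetical values).
c3, c4, c5 = [1.0] * 8, [2.0] * 4, [3.0] * 2
p3, p4, p5 = top_down_merge([c3, c4, c5])
```

After merging, the finest level `p3` carries contributions from every coarser level, which is the mechanism FPN-style networks rely on to give high-resolution maps stronger semantics.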
Bidirectional parallel multi-branch convolution feature pyramid network for target detection in aerial images of swarm UAVs (Cited by: 3)
11
Authors: Lei Fu, Wen-bin Gu, Wei Li, Liang Chen, Yong-bao Ai, Hua-lei Wang. 《Defence Technology(防务技术)》 SCIE EI CAS CSCD, 2021, No. 4, pp. 1531-1541 (11 pages)
In this paper, based on a bidirectional parallel multi-branch feature pyramid network (BPMFPN), a novel one-stage object detector called BPMFPN Det is proposed for real-time detection of ground multi-scale targets by swarm unmanned aerial vehicles (UAVs). First, the bidirectional parallel multi-branch convolution modules are used to construct the feature pyramid to enhance the feature expression abilities of feature layers at different scales. Next, the feature pyramid is integrated into the single-stage object detection framework to ensure real-time performance. To validate the effectiveness of the proposed algorithm, experiments are conducted on four datasets. On the PASCAL VOC dataset, the proposed algorithm achieves a mean average precision (mAP) of 85.4 on the VOC 2007 test set. On the detection in optical remote sensing (DIOR) dataset, it achieves 73.9 mAP. On the vehicle detection in aerial imagery (VEDAI) dataset, the detection accuracy for small land vehicle (slv) targets reaches 97.4 mAP. On the unmanned aerial vehicle detection and tracking (UAVDT) dataset, the proposed BPMFPN Det achieves an mAP of 48.75. Compared with previous state-of-the-art methods, the results obtained by the proposed algorithm are more competitive. The experimental results demonstrate that the proposed algorithm can effectively solve the problem of real-time detection of ground multi-scale targets in aerial images of swarm UAVs.
Keywords: aerial images; object detection; feature pyramid networks; multi-scale feature fusion; swarm UAVs
Neighborhood fusion-based hierarchical parallel feature pyramid network for object detection (Cited by: 3)
12
Authors: Mo Lingfei, Hu Shuming. 《Journal of Southeast University (English Edition)》 EI CAS, 2020, No. 3, pp. 252-263 (12 pages)
In order to improve the detection accuracy of small objects, a neighborhood fusion-based hierarchical parallel feature pyramid network (NFPN) is proposed. Unlike the layer-by-layer structure adopted in the feature pyramid network (FPN) and the deconvolutional single shot detector (DSSD), where the bottom layer of the feature pyramid network relies on the top layer, NFPN builds the feature pyramid network with no connections between the upper and lower layers. That is, it only fuses shallow features on similar scales. NFPN is highly portable and can be embedded in many models to further boost performance. Extensive experiments on the PASCAL VOC 2007, 2012, and COCO datasets demonstrate that the NFPN-based SSD without intricate tricks can exceed the DSSD model in terms of detection accuracy and inference speed, especially for small objects, e.g., 4% to 5% higher mAP (mean average precision) than SSD, and 2% to 3% higher mAP than DSSD. On the VOC 2007 test set, the NFPN-based SSD with 300×300 input reaches 79.4% mAP at 34.6 frames/s, and the mAP rises to 82.9% with the multi-scale testing strategy.
Keywords: computer vision; deep convolutional neural network; object detection; hierarchical parallel feature pyramid network; multi-scale feature fusion
Better Visual Image Super-Resolution with Laplacian Pyramid of Generative Adversarial Networks (Cited by: 2)
13
Authors: Ming Zhao, Xinhong Liu, Xin Yao, Kun He. 《Computers, Materials & Continua》 SCIE EI, 2020, No. 9, pp. 1601-1614 (14 pages)
Although there has been a great breakthrough in the accuracy and speed of super-resolution (SR) reconstruction of a single image using convolutional neural networks, an important problem remains unresolved: how to restore finer texture details during image super-resolution reconstruction? This paper proposes an Enhanced Laplacian Pyramid Generative Adversarial Network (ELSRGAN), based on the Laplacian pyramid, to capture the high-frequency details of the image. By combining Laplacian pyramids and generative adversarial networks, super-resolution images can be reconstructed progressively, making model applications more flexible. To solve the vanishing gradient problem, we introduce the Residual-in-Residual Dense Block (RRDB) as the basic network unit. The network benefits from dense connections, which capture more visual features and yield better reconstruction, and the BN layers are removed to increase calculation speed and reduce computational complexity. In addition, a content loss driven by perceptual similarity is used instead of a content loss driven by spatial similarity, thereby enhancing the visual effect of the super-resolution image and making it more consistent with human visual perception. Extensive qualitative and quantitative evaluation on the baseline datasets shows that the proposed algorithm has a higher mean-sort-score (MSS) than any state-of-the-art method and better visual perception.
Keywords: single image super-resolution; generative adversarial networks; Laplacian pyramid
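The Laplacian pyramid that the abstract above builds on stores, at each scale, the detail lost when a signal is downsampled, so the original can be rebuilt progressively from coarse to fine. The following minimal 1-D sketch shows that decomposition and its exact inversion; it is not ELSRGAN, it uses pair averaging rather than Gaussian filtering, and all names are hypothetical.

```python
def downsample(signal):
    """Halve the length by averaging neighboring pairs (a crude low-pass)."""
    return [(signal[i] + signal[i + 1]) / 2.0 for i in range(0, len(signal) - 1, 2)]

def upsample(signal):
    """Double the length by repeating each value."""
    return [value for value in signal for _ in range(2)]

def build_laplacian_pyramid(signal, levels):
    """Return (detail_bands, coarse_residual); bands are ordered fine to coarse.

    The signal length is assumed to be divisible by 2**levels.
    """
    bands, current = [], list(signal)
    for _ in range(levels):
        coarse = downsample(current)
        predicted = upsample(coarse)
        bands.append([c - p for c, p in zip(current, predicted)])  # high-frequency detail
        current = coarse
    return bands, current

def reconstruct(bands, residual):
    """Invert the pyramid: upsample, then add back each detail band."""
    current = list(residual)
    for band in reversed(bands):
        current = [p + d for p, d in zip(upsample(current), band)]
    return current

signal = [1.0, 3.0, 2.0, 6.0, 5.0, 5.0, 0.0, 4.0]
bands, residual = build_laplacian_pyramid(signal, levels=2)
restored = reconstruct(bands, residual)
```

In a pyramid-based super-resolution network, the detail bands are what the model has to predict at each scale, while the coarse residual corresponds to the low-resolution input.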
An Improved Data-Driven Topology Optimization Method Using Feature Pyramid Networks with Physical Constraints (Cited by: 1)
14
Authors: Jiaxiang Luo, Yu Li, Weien Zhou, Zhiqiang Gong, Zeyu Zhang, Wen Yao. 《Computer Modeling in Engineering & Sciences》 SCIE EI, 2021, No. 9, pp. 823-848 (26 pages)
Deep learning for topology optimization has been extensively studied in recent years to reduce the cost of calculation. However, the loss function of such methods is mainly based on pixel-wise errors from the image perspective, which cannot embed the physical knowledge of topology optimization. Therefore, this paper presents an improved deep learning model to alleviate this difficulty. The feature pyramid network (FPN), a kind of deep learning model, is trained to learn the inherent physical law of topology optimization itself, with a loss function composed of pixel-wise errors and physical constraints. Since the calculation of physical constraints requires finite element analysis (FEA) with high computational cost, a strategy of adjusting the time at which physical constraints are added is proposed to balance the training cost and the training effect. Then, two classical topology optimization problems are investigated to verify the effectiveness of the proposed method. The results show that the developed model, trained on a small number of samples, can quickly obtain the optimized structure without any iteration, with not only high pixel-wise accuracy but also good physical performance.
Keywords: topology optimization; deep learning; feature pyramid networks; finite element analysis; physical constraints
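Image decomposition on the basis of an inverse pyramid with 3-layer neural networks (Cited by: 1)
15
Authors: Valeriy Victorovich Cherkashyn, HE Dong-chen, Roumen Kirilov Kountchev. 《通讯和计算机(中英文版)》, 2009, No. 11, pp. 21-29 (9 pages)
Keywords: pyramid decomposition; image decomposition; neural network; basis; compression efficiency; information technology; adaptive method; image processing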
A Sentence Retrieval Generation Network Guided Video Captioning
16
Authors: Ou Ye, Mimi Wang, Zhenhua Yu, Yan Fu, Shun Yi, Jun Deng. 《Computers, Materials & Continua》 SCIE EI, 2023, No. 6, pp. 5675-5696 (22 pages)
Currently, video captioning models based on an encoder-decoder mainly rely on a single video input source. The contents of video captioning are limited since few studies have employed external corpus information to guide the generation of video captions, which is not conducive to the accurate description and understanding of video content. To address this issue, a novel video captioning method guided by a sentence retrieval generation network (ED-SRG) is proposed in this paper. First, a ResNeXt network model, an efficient convolutional network for online video understanding (ECO) model, and a long short-term memory (LSTM) network model are integrated to construct an encoder-decoder, which is utilized to extract the 2D features, 3D features, and object features of video data, respectively. These features are decoded to generate textual sentences that conform to the video content for sentence retrieval. Then, a sentence-transformer network model is employed to retrieve sentences in an external corpus that are semantically similar to the above textual sentences. The candidate sentences are screened out through similarity measurement. Finally, a novel GPT-2 network model is constructed based on the GPT-2 network structure. The model introduces a designed random selector to randomly select predicted words with a high probability in the corpus, which is used to guide and generate textual sentences that are more in line with human natural language expressions. The proposed method is compared experimentally with several existing works. The results show that the indicators BLEU-4, CIDEr, ROUGE_L, and METEOR are improved by 3.1%, 1.3%, 0.3%, and 1.5% on the public dataset MSVD and by 1.3%, 0.5%, 0.2%, and 1.9% on the public dataset MSR-VTT, respectively. The proposed method can thus generate video captions with richer semantics than several state-of-the-art approaches.
Keywords: video captioning; encoder-decoder; sentence retrieval; external corpus; RS GPT-2 network model
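Intelligent identification of oceanic eddies in remote sensing data via Dual-Pyramid UNet
17
Authors: Nan Zhao, Baoxiang Huang, Xinmin Zhang, Linyao Ge, Ge Chen. 《Atmospheric and Oceanic Science Letters》 CSCD, 2023, No. 4, pp. 29-36 (8 pages)
Oceanic eddies are an important component of the ocean and are crucial to the transport of ocean energy and matter. Detecting and characterizing oceanic eddies has significant research value for fields such as marine meteorology, ocean acoustics, and marine biology. Based on the UNet architecture, this paper combines a pyramid split attention (PSA) module and atrous spatial pyramid pooling (ASPP) to construct the Dual-Pyramid UNet model, which identifies oceanic eddies from sea level anomaly and sea surface temperature data. Experiments are conducted in two eddy-active regions, the North Atlantic and the South Atlantic, and several evaluation metrics are used to assess the identification results and demonstrate the model's strong performance.
Keywords: oceanic eddy identification; deep learning; pyramid split attention; atrous spatial pyramid pooling; U-shaped network architecture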
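Lightweight Detection of Missing Insulators Based on Improved YOLOv5s (Cited by: 3)
18
Authors: 池小波, 张伟杰, 贾新春, 续泽晋. 《测试技术学报》, 2024, No. 1, pp. 19-26 (8 pages)
To address the high computational complexity of existing missing-insulator detection models and the difficulty of detecting small targets, a lightweight detection model based on an improved YOLOv5s is proposed. First, the C3 modules in the backbone network are removed to reduce the number of model parameters. Second, a convolutional block attention mechanism is introduced into the multi-scale feature fusion network to improve feature extraction against complex backgrounds. In addition, a weighted bidirectional feature pyramid network structure performs bidirectional, cross-scale weighted fusion of features, improving detection under occlusion and interference from similar targets. Finally, the SIoU loss function is adopted to improve convergence speed and detection accuracy. Experimental results show that the proposed model achieves an average precision of 96.8% with 2.8 GFLOPs, whereas the original YOLOv5s requires 16.3 GFLOPs at an average precision of 97.4%. Compared with the original model, the proposed model is robust to small targets, occluded targets, and blurred scenes, and greatly reduces computation while maintaining comparable detection accuracy.
Keywords: insulator detection; YOLOv5s model; convolutional block attention mechanism; weighted bidirectional feature pyramid network; lightweight network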
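The weighted cross-scale fusion named in the abstract above is commonly implemented as the "fast normalized fusion" introduced with BiFPN: each input feature gets a non-negative learnable weight, and the weights are normalized by their sum. The sketch below shows that formula on plain lists. Whether the listed paper uses exactly this form is an assumption, and the function name and example values are hypothetical.

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse same-sized features as sum(w_i * f_i) / (eps + sum(w_i)).

    Weights are clipped at zero (ReLU) so that the result stays a
    non-negative blend of the inputs.
    """
    clipped = [max(weight, 0.0) for weight in weights]
    total = sum(clipped) + eps
    length = len(features[0])
    return [
        sum(weight * feature[i] for weight, feature in zip(clipped, features)) / total
        for i in range(length)
    ]

# Two feature maps already resized to the same resolution (hypothetical values).
shallow = [1.0, 1.0, 1.0, 1.0]
deep = [3.0, 3.0, 3.0, 3.0]
balanced = fast_normalized_fusion([shallow, deep], weights=[1.0, 1.0])
shallow_only = fast_normalized_fusion([shallow, deep], weights=[1.0, -5.0])
```

In a trained network the weights are learned per fusion node, letting the model decide how much each scale contributes instead of summing scales equally.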
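Nighttime Citrus Detection Method Combining an Active Light Source and an Improved YOLOv5s Model (Cited by: 2)
19
Authors: 熊俊涛, 霍钊威, 黄启寅, 陈浩然, 杨振刚, 黄煜华, 苏颖苗. 《华南农业大学学报》 CAS CSCD PKU Core, 2024, No. 1, pp. 97-107 (11 pages)
[Objective] To solve the difficulty of accurately recognizing occluded and small citrus fruit at night, and to enable all-weather intelligent operation of picking robots. [Method] A nighttime citrus recognition method combining an active light source is proposed. First, nighttime citrus images with different color characteristics under active light sources are analyzed to select the best light color, and images are then collected. Next, a nighttime citrus detection model, BI-YOLOv5s, is proposed. The model uses a bidirectional feature pyramid network (Bi-FPN) for multi-scale cross connections and weighted feature fusion, improving recognition of occluded and small fruit; introduces a coordinate attention (CA) module to further strengthen the extraction of target position information; and adopts a C3TR module incorporating the Transformer structure to better extract global information while reducing computation. [Result] On the test set, the proposed BI-YOLOv5s model achieves a precision, recall, and average precision of 93.4%, 92.2%, and 97.1%, respectively, which are 3.2, 1.5, and 2.3 percentage points higher than those of the YOLOv5s model. Under the selected light color, the model's recognition accuracy for nighttime citrus is 95.3%, which is 10.4 percentage points higher than under white light. [Conclusion] The proposed method recognizes occluded and small-target citrus at night with high accuracy and can provide technical support for precise visual recognition in intelligent nighttime fruit and vegetable picking.
Keywords: citrus; nighttime detection; active light source; bidirectional feature pyramid network; YOLOv5s; HSV color space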
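Detection Model for Pediatric Intussusception Features Based on YOLOv5s and Ultrasound Images (Cited by: 1)
20
Authors: 陈星, 俞凯, 袁贞明, 黄坚, 李哲明. 《杭州师范大学学报(自然科学版)》 CAS, 2024, No. 1, pp. 10-19 (10 pages)
To help doctors quickly locate the lesion features of intussusception in pediatric abdominal ultrasound and to enable rapid quality inspection of post-diagnosis intussusception ultrasound data, this paper applies object detection algorithms to detect the "concentric circle" sign of intussusception in pediatric abdominal ultrasound images. First, a pediatric intussusception detection model based on YOLOv5s is explored; it outperforms Faster RCNN in precision, recall, F1 score, mAP@0.5, FPS, and parameter count when detecting the "concentric circle" sign. Further, to detect "concentric circle" signs that are hard to observe with the naked eye, a bidirectional feature pyramid network is used and an attention mechanism is added to the YOLOv5s network, forming a "concentric circle" sign detection model based on the YOLOv5s_BiFPN_SE framework. The precision, recall, F1 score, and mAP@0.5 of this model reach 91.33%, 90.73%, 91.03%, and 88.77%, respectively, outperforming YOLOv5s.
Keywords: object detection; intussusception; ultrasound images; "concentric circle" sign; bidirectional feature pyramid network; attention mechanism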
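The F1 score reported in the abstract above is the harmonic mean of precision and recall, so it can be cross-checked against the two values it is derived from. A minimal sketch follows; the function name is hypothetical, and the inputs are the precision and recall quoted in the abstract.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    return 2.0 * precision * recall / (precision + recall)

# Precision and recall (in percent) quoted for the YOLOv5s_BiFPN_SE model.
f1 = f1_score(91.33, 90.73)
print(round(f1, 2))  # 91.03, matching the reported F1 score
```

The same check does not necessarily hold for scores that are averaged per image or per class before being reported, so a mismatch elsewhere is not by itself evidence of an error.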