CMMCAN: Lightweight Feature Extraction and Matching Network for Endoscopic Images Based on Adaptive Attention
1
Authors: Nannan Chong, Fan Yang. Computers, Materials & Continua (SCIE, EI), 2024, No. 8, pp. 2761-2783 (23 pages)
In minimally invasive surgery, endoscopes or laparoscopes equipped with miniature cameras and tools are used to enter the human body for therapeutic purposes through small incisions or natural cavities. However, in clinical operating environments, endoscopic images often suffer from challenges such as low texture, uneven illumination, and non-rigid structures, which affect feature observation and extraction. This can severely impact surgical navigation or clinical diagnosis due to missing feature points in endoscopic images, leading to treatment and postoperative recovery issues for patients. To address these challenges, this paper introduces, for the first time, a Cross-Channel Multi-Modal Adaptive Spatial Feature Fusion (ASFF) module based on the lightweight architecture of EfficientViT. Additionally, a novel lightweight feature extraction and matching network based on an attention mechanism is proposed. This network dynamically adjusts attention weights for cross-modal information from grayscale images and optical flow images through a dual-branch Siamese network. It extracts static and dynamic information features ranging from low-level to high-level, and from local to global, ensuring robust feature extraction across different widths, noise levels, and blur scenarios. Global and local matching are performed through a multi-level cascaded attention mechanism, with cross-channel attention introduced to simultaneously extract low-level and high-level features. Extensive ablation experiments and comparative studies are conducted on the HyperKvasir, EAD, M2caiSeg, CVC-ClinicDB, and UCL synthetic datasets. Experimental results demonstrate that the proposed network improves upon the baseline EfficientViT-B3 model by 75.4% in accuracy (Acc), while also enhancing runtime performance and storage efficiency. When compared with the complex DenseDescriptor feature extraction network, the difference in Acc is less than 7.22%, and IoU calculation results on specific datasets outperform complex dense models. Furthermore, this method increases the F1 score by 33.2% and accelerates runtime by 70.2%. It is noteworthy that the speed of CMMCAN surpasses that of comparative lightweight models, with feature extraction and matching performance comparable to existing complex models but with faster speed and higher cost-effectiveness.
Keywords: feature extraction and matching; lightweighted network; medical images; endoscopic; attention
Attention Guided Multi Scale Feature Fusion Network for Automatic Prostate Segmentation
2
Authors: Yuchun Li, Mengxing Huang, Yu Zhang, Zhiming Bai. Computers, Materials & Continua (SCIE, EI), 2024, No. 2, pp. 1649-1668 (20 pages)
The precise and automatic segmentation of prostate magnetic resonance imaging (MRI) images is vital for assisting doctors in diagnosing prostate diseases. In recent years, many advanced methods have been applied to prostate segmentation, but due to the variability caused by prostate diseases, automatic segmentation of the prostate presents significant challenges. In this paper, we propose an attention-guided multi-scale feature fusion network (AGMSF-Net) to segment prostate MRI images. We propose an attention mechanism for extracting multi-scale features, and introduce a 3D transformer module to enhance global feature representation by adding it during the transition phase from encoder to decoder. In the decoder stage, a feature fusion module is proposed to obtain global context information. We evaluate our model on MRI images of the prostate acquired from a local hospital. The relative volume difference (RVD) and dice similarity coefficient (DSC) between the results of automatic prostate segmentation and ground truth were 1.21% and 93.68%, respectively. To quantitatively evaluate prostate volume on MRI, which is of clinical significance, we propose a unique AGMSF-Net. The essential performance evaluation and validation experiments have demonstrated the effectiveness of our method in automatic prostate segmentation.
Keywords: prostate segmentation; multi-scale attention; 3D Transformer; feature fusion; MRI
Attention Guided Food Recognition via Multi-Stage Local Feature Fusion
3
Authors: Gonghui Deng, Dunzhi Wu, Weizhen Chen. Computers, Materials & Continua (SCIE, EI), 2024, No. 8, pp. 1985-2003 (19 pages)
The task of food image recognition, a nuanced subset of fine-grained image recognition, grapples with substantial intra-class variation and minimal inter-class differences. These challenges are compounded by the irregular and multi-scale nature of food images. Addressing these complexities, our study introduces an advanced model that leverages multiple attention mechanisms and multi-stage local fusion, grounded in the ConvNeXt architecture. Our model employs hybrid attention (HA) mechanisms to pinpoint critical discriminative regions within images, substantially mitigating the influence of background noise. Furthermore, it introduces a multi-stage local fusion (MSLF) module, fostering long-distance dependencies between feature maps at varying stages. This approach facilitates the assimilation of complementary features across scales, significantly bolstering the model's capacity for feature extraction. In addition, we constructed a dataset named Roushi60, which consists of 60 different categories of common meat dishes. Empirical evaluation on the ETH Food-101, ChineseFoodNet, and Roushi60 datasets reveals that our model achieves recognition accuracies of 91.12%, 82.86%, and 92.50%, respectively. These figures not only mark an improvement of 1.04%, 3.42%, and 1.36% over the foundational ConvNeXt network but also surpass the performance of most contemporary food image recognition methods. Such advancements underscore the efficacy of our proposed model in navigating the intricate landscape of food image recognition, setting a new benchmark for the field.
Keywords: fine-grained image recognition; food image recognition; attention mechanism; local feature fusion
Two-Layer Attention Feature Pyramid Network for Small Object Detection
4
Authors: Sheng Xiang, Junhao Ma, Qunli Shang, Xianbao Wang, Defu Chen. Computer Modeling in Engineering & Sciences (SCIE, EI), 2024, No. 10, pp. 713-731 (19 pages)
Effective small object detection is crucial in various applications including urban intelligent transportation and pedestrian detection. However, small objects are difficult to detect accurately because they contain less information. Many current methods, particularly those based on Feature Pyramid Network (FPN), address this challenge by leveraging multi-scale feature fusion. However, existing FPN-based methods often suffer from inadequate feature fusion due to varying resolutions across different layers, leading to suboptimal small object detection. To address this problem, we propose the Two-layer Attention Feature Pyramid Network (TA-FPN), featuring two key modules: the Two-layer Attention Module (TAM) and the Small Object Detail Enhancement Module (SODEM). TAM uses the attention module to make the network more focused on the semantic information of the object and fuse it to the lower layer, so that each layer contains similar semantic information, alleviating the problem of small object information being submerged due to semantic gaps between different layers. At the same time, SODEM is introduced to strengthen the local features of the object, suppress background noise, enhance the information details of the small object, and fuse the enhanced features to other feature layers to ensure that each layer is rich in small object information, improving small object detection accuracy. Our extensive experiments on challenging datasets such as Microsoft Common Objects in Context (MSCOCO) and Pattern Analysis Statistical Modelling and Computational Learning, Visual Object Classes (PASCAL VOC) demonstrate the validity of the proposed method. Experimental results show a significant improvement in small object detection accuracy compared to state-of-the-art detectors.
Keywords: small object detection; two-layer attention module; small object detail enhancement module; feature pyramid network
Unsupervised multi-modal image translation based on the squeeze-and-excitation mechanism and feature attention module
5
Authors: 胡振涛, HU Chonghao, YANG Haoran, SHUAI Weiwei. High Technology Letters (EI, CAS), 2024, No. 1, pp. 23-30 (8 pages)
The unsupervised multi-modal image translation is an emerging domain of computer vision whose goal is to transform an image from the source domain into many diverse styles in the target domain. However, the multi-generator mechanism is employed among the advanced approaches available to model different domain mappings, which results in inefficient training of neural networks and pattern collapse, leading to inefficient generation of image diversity. To address this issue, this paper introduces a multi-modal unsupervised image translation framework that uses a generator to perform multi-modal image translation. Specifically, firstly, the domain code is introduced in this paper to explicitly control the different generation tasks. Secondly, this paper brings in the squeeze-and-excitation (SE) mechanism and feature attention (FA) module. Finally, the model integrates multiple optimization objectives to ensure efficient multi-modal translation. This paper performs qualitative and quantitative experiments on multiple non-paired benchmark image translation datasets while demonstrating the benefits of the proposed method over existing technologies. Overall, experimental results have shown that the proposed method is versatile and scalable.
Keywords: multi-modal image translation; generative adversarial network (GAN); squeeze-and-excitation (SE) mechanism; feature attention (FA) module
Residual Feature Attentional Fusion Network for Lightweight Chest CT Image Super-Resolution (Cited by: 1)
6
Authors: Kun Yang, Lei Zhao, Xianghui Wang, Mingyang Zhang, Linyan Xue, Shuang Liu, Kun Liu. Computers, Materials & Continua (SCIE, EI), 2023, No. 6, pp. 5159-5176 (18 pages)
The diagnosis of COVID-19 requires chest computed tomography (CT). High-resolution CT images can provide more diagnostic information to help doctors better diagnose the disease, so it is of clinical importance to study super-resolution (SR) algorithms applied to CT images to improve the resolution of CT images. However, most of the existing SR algorithms are studied based on natural images, which are not suitable for medical images; and most of these algorithms improve the reconstruction quality by increasing the network depth, which is not suitable for machines with limited resources. To alleviate these issues, we propose a residual feature attentional fusion network for lightweight chest CT image super-resolution (RFAFN). Specifically, we design a contextual feature extraction block (CFEB) that can extract CT image features more efficiently and accurately than ordinary residual blocks. In addition, we propose a feature-weighted cascading strategy (FWCS) based on attentional feature fusion blocks (AFFB) to utilize the high-frequency detail information extracted by CFEB as much as possible via selectively fusing adjacent level feature information. Finally, we suggest a global hierarchical feature fusion strategy (GHFFS), which can utilize the hierarchical features more effectively than dense concatenation by progressively aggregating the feature information at various levels. Numerous experiments show that our method performs better than most of the state-of-the-art (SOTA) methods on the COVID-19 chest CT dataset. In detail, the peak signal-to-noise ratio (PSNR) is 0.11 dB and 0.47 dB higher on CTtest1 and CTtest2 at ×3 SR compared to the suboptimal method, but the number of parameters and multi-adds are reduced by 22K and 0.43G, respectively. Our method can better recover chest CT image quality with fewer computational resources and effectively assist in COVID-19 diagnosis.
Keywords: super-resolution; COVID-19 chest CT; lightweight network; contextual feature extraction; attentional feature fusion
Bridge Crack Segmentation Method Based on Parallel Attention Mechanism and Multi-Scale Features Fusion (Cited by: 1)
7
Authors: Jianwei Yuan, Xinli Song, Huaijian Pu, Zhixiong Zheng, Ziyang Niu. Computers, Materials & Continua (SCIE, EI), 2023, No. 3, pp. 6485-6503 (19 pages)
Regular inspection of bridge cracks is crucial to bridge maintenance and repair. The traditional manual crack detection methods are time-consuming, dangerous and subjective. At the same time, for the existing mainstream vision-based automatic crack detection algorithms, it is challenging to detect fine cracks and balance the detection accuracy and speed. Therefore, this paper proposes a new bridge crack segmentation method based on parallel attention mechanism and multi-scale features fusion on top of the DeeplabV3+ network framework. First, the improved lightweight MobileNetv2 network and dilated separable convolution are integrated into the original DeeplabV3+ network to improve the original backbone network Xception and atrous spatial pyramid pooling (ASPP) module, respectively, dramatically reducing the number of parameters in the network and accelerating the training and prediction speed of the model. Moreover, we introduce the parallel attention mechanism into the encoding and decoding stages. Attention to the crack regions is enhanced along both the channel and spatial dimensions, significantly suppressing the interference of various noises. Finally, we further improve the detection performance of the model for fine cracks by introducing a multi-scale features fusion module. Our research results are validated on the self-made dataset. The experiments show that our method is more accurate than other methods. Its intersection over union (IoU) and F1-score (F1) are increased to 77.96% and 87.57%, respectively. In addition, the number of parameters is only 4.10M, which is much smaller than the original network; also, the frames per second (FPS) is increased to 15 frames/s. The results prove that the proposed method fits well the requirements of rapid and accurate detection of bridge cracks and is superior to other methods.
Keywords: crack detection; DeeplabV3+; parallel attention mechanism; feature fusion
MVCE-Net: Multi-View Region Feature and Caption Enhancement Co-Attention Network for Visual Question Answering
8
Authors: Feng Yan, Wushouer Silamu, Yanbing Li. Computers, Materials & Continua (SCIE, EI), 2023, No. 7, pp. 65-80 (16 pages)
Visual question answering (VQA) requires a deep understanding of images and their corresponding textual questions to answer questions about images more accurately. However, existing models tend to ignore the implicit knowledge in the images and focus only on the visual information in the images, which limits the understanding depth of the image content. The images contain more than just visual objects: some images contain textual information about the scene, and slightly more complex images contain relationships between individual visual objects. Firstly, this paper proposes a model using image description for feature enhancement. This model encodes images and their descriptions separately based on the question-guided co-attention mechanism. This mechanism increases the feature representation of the model, enhancing the model's ability for reasoning. In addition, this paper improves the bottom-up attention model by obtaining two image region features. After obtaining the two visual features and the spatial position information corresponding to each feature, concatenating the two features as the final image feature can better represent an image. Finally, the obtained spatial position information is processed to enable the model to perceive the size and relative position of each object in the image. Our best single model delivers a 74.16% overall accuracy on the VQA 2.0 dataset; it even outperforms some multi-modal pre-training models with fewer images and a shorter time.
Keywords: bottom-up attention; spatial position relationship; region feature; self-attention
Human Visual Attention Mechanism-Inspired Point-and-Line Stereo Visual Odometry for Environments with Uneven Distributed Features
9
Authors: Chang Wang, Jianhua Zhang, Yan Zhao, Youjie Zhou, Jincheng Jiang. Chinese Journal of Mechanical Engineering (SCIE, EI, CAS, CSCD), 2023, No. 3, pp. 191-204 (14 pages)
Visual odometry is critical in visual simultaneous localization and mapping for robot navigation. However, the pose estimation performance of most current visual odometry algorithms degrades in scenes with unevenly distributed features because dense features occupy excessive weight. Herein, a new human visual attention mechanism for point-and-line stereo visual odometry, which is called point-line-weight-mechanism visual odometry (PLWM-VO), is proposed to describe scene features in a global and balanced manner. A weight-adaptive model based on region partition and region growth is generated for the human visual attention mechanism, where sufficient attention is assigned to position-distinctive objects (sparse features in the environment). Furthermore, the sum of absolute differences algorithm is used to improve the accuracy of initialization for line features. Compared with the state-of-the-art method (ORB-VO), PLWM-VO shows a 36.79% reduction in the absolute trajectory error on the KITTI and EuRoC datasets. Although the time consumption of PLWM-VO is higher than that of ORB-VO, online test results indicate that PLWM-VO satisfies the real-time demand. The proposed algorithm not only significantly promotes the environmental adaptability of visual odometry, but also quantitatively demonstrates the superiority of the human visual attention mechanism.
Keywords: visual odometry; human visual attention mechanism; environmental adaptability; uneven distributed features
Series Arc Fault Detection Method Based on CNN-BiLSTM-Attention Optimized by an Improved Dung Beetle Algorithm
10
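Authors: 李海波. 《电器与能效管理技术》, 2024, No. 8, pp. 57-68 (12 pages)
To address insufficient feature extraction and low detection accuracy for fault arcs, a series arc fault detection method is proposed in which a multi-feature-fusion improved dung beetle optimizer (IDBO) optimizes a convolutional neural network (CNN) and a bidirectional long short-term memory (BiLSTM) network combined with an attention mechanism. Time-domain, frequency-domain, and time-frequency-domain features of the current, together with features of the signal's autoregressive parameter model, are extracted on an experimental platform. Kernel principal component analysis (KPCA) is used to reduce and fuse the features, and the resulting feature vectors serve as the input of CNN-BiLSTM-Attention. Cubic chaotic mapping, a spiral search strategy, a dynamic weight coefficient, and a Gaussian-Cauchy mutation strategy are introduced to improve the dung beetle algorithm, and the improved algorithm is used to optimize the hyperparameters of CNN-BiLSTM-Attention for series arc fault diagnosis. The results show that the proposed method achieves a fault arc detection accuracy of 97.92% and can efficiently identify series arc faults.
Keywords: arc fault; improved dung beetle algorithm; multi-feature fusion; CNN-BiLSTM-Attention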
Network Intrusion Detection Method Based on Attention-BiTCN (Cited by: 4)
11
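Authors: 孙红哲, 王坚, 王鹏, 安雨龙. 《信息网络安全》 (CSCD, PKU Core), 2024, No. 2, pp. 309-318 (10 pages)
To address the low multi-class classification accuracy in network intrusion detection, and given that network traffic data have temporal characteristics, this paper proposes a network intrusion detection model based on an attention mechanism and a bidirectional temporal convolutional network (BiTCN). First, the model preprocesses the dataset with one-hot encoding and normalization, which handles the strong discreteness and inconsistent scales of network traffic data. Second, the preprocessed data are converted into bidirectional sequences by a bidirectional sliding-window method and fed synchronously into the Attention-BiTCN model. Then, bidirectional temporal features are extracted and fused additively to obtain fused features with enhanced temporal information. Finally, a Softmax function is applied to the fused features to detect and identify multiple attack behaviors. The proposed model was validated on the NSL-KDD and UNSW-NB15 datasets, reaching multi-class accuracies of 99.70% and 84.07%, respectively. It outperforms traditional network intrusion detection algorithms and significantly improves detection performance over other deep learning models.
Keywords: intrusion detection; attention mechanism; BiTCN; bidirectional sliding-window method; fused features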
Point Cloud Classification Using Content-Based Transformer via Clustering in Feature Space (Cited by: 2)
12
Authors: Yahui Liu, Bin Tian, Yisheng Lv, Lingxi Li, Fei-Yue Wang. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2024, No. 1, pp. 231-239 (9 pages)
Recently, there have been some attempts to apply Transformers to 3D point cloud classification. In order to reduce computations, most existing methods focus on local spatial attention, but ignore their content and fail to establish relationships between distant but relevant points. To overcome the limitation of local spatial attention, we propose a point content-based Transformer architecture, called PointConT for short. It exploits the locality of points in the feature space (content-based), which clusters the sampled points with similar features into the same class and computes the self-attention within each class, thus enabling an effective trade-off between capturing long-range dependencies and computational complexity. We further introduce an inception feature aggregator for point cloud classification, which uses parallel structures to aggregate high-frequency and low-frequency information in each branch separately. Extensive experiments show that our PointConT model achieves a remarkable performance on point cloud shape classification. In particular, our method exhibits 90.3% Top-1 accuracy on the hardest setting of ScanObjectNN. Source code of this paper is available at https://github.com/yahuiliu99/PointConT.
Keywords: content-based Transformer; deep learning; feature aggregator; local attention; point cloud classification
Power Transformer Fault Diagnosis Method Based on Audio Dynamic Feature Recombination and TCN-Attention (Cited by: 2)
13
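Authors: 叶李敏, 李敬兆. 《兰州文理学院学报(自然科学版)》, 2024, No. 1, pp. 82-88 (7 pages)
Power transformers are among the most important electrical equipment in a power system, and online fault diagnosis of transformers is a key measure for reducing power system operation and maintenance costs and improving system stability. Based on audio information recorded while a power transformer is operating, dynamic feature recombination of the audio signal and a TCN-Attention model are proposed to accurately identify typical transformer faults. First, the SRA, RMS, kurtosis, and margin features of the transformer audio signal are analyzed. Then, the features are fused by weighting according to their correlation with transformer faults, their robustness, and their temporal monotonicity, yielding a comprehensive feature of the transformer audio signal. Finally, a TCN-Attention model is designed to analyze the audio features for fault diagnosis, with an attention mechanism that enhances the important information in the audio features to improve fault recognition accuracy. Audio signals collected from a transformer in three states (normal operation, winding fault, and core fault) form the dataset used to validate the proposed method. Experimental results show that the method achieves a fault diagnosis accuracy above 90% from transformer audio signals, realizing intelligent transformer fault diagnosis, which is of great significance for ensuring stable power system operation.
Keywords: power transformer; audio signal; fault diagnosis; dynamic feature recombination; TCN-Attention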
AFBNet: A Lightweight Adaptive Feature Fusion Module for Super-Resolution Algorithms
14
Authors: Lirong Yin, Lei Wang, Siyu Lu, Ruiyang Wang, Haitao Ren, Ahmed AlSanad, Salman A. AlQahtani, Zhengtong Yin, Xiaolu Li, Wenfeng Zheng. Computer Modeling in Engineering & Sciences (SCIE, EI), 2024, No. 9, pp. 2315-2347 (33 pages)
At present, super-resolution algorithms are employed to tackle the challenge of low image resolution, but it is difficult to extract differentiated feature details based on various inputs, resulting in poor generalization ability. Given this situation, this study first analyzes the features of some feature extraction modules of the current super-resolution algorithm and then proposes an adaptive feature fusion block (AFB) for feature extraction. This module mainly comprises dynamic convolution, attention mechanism, and pixel-based gating mechanism. Combined with dynamic convolution with scale information, the network can extract more differentiated feature information. The introduction of a channel spatial attention mechanism combined with multi-feature fusion further enables the network to retain more important feature information. Dynamic convolution and pixel-based gating mechanisms enhance the module's adaptability. Finally, a comparative experiment of a super-resolution algorithm based on the AFB module is designed to substantiate the efficiency of the AFB module. The results revealed that the network combined with the AFB module has stronger generalization ability and expression ability.
Keywords: super-resolution; feature extraction; dynamic convolution; attention mechanism; gate control
Behaviour recognition based on the integration of multigranular motion features in the Internet of Things
15
Authors: Lizong Zhang, Yiming Wang, Ke Yan, Yi Su, Nawaf Alharbe, Shuxin Feng. Digital Communications and Networks (SCIE, CSCD), 2024, No. 3, pp. 666-675 (10 pages)
With the adoption of cutting-edge communication technologies such as 5G/6G systems and the extensive development of devices, crowdsensing systems in the Internet of Things (IoT) are now conducting complicated video analysis tasks such as behaviour recognition. These applications have dramatically increased the diversity of IoT systems. Specifically, behaviour recognition in videos usually requires a combinatorial analysis of the spatial information about objects and information about their dynamic actions in the temporal dimension. Behaviour recognition may even rely more on the modeling of temporal information containing short-range and long-range motions, in contrast to computer vision tasks involving images that focus on understanding spatial information. However, current solutions fail to jointly and comprehensively analyse short-range motions between adjacent frames and long-range temporal aggregations at large scales in videos. In this paper, we propose a novel behaviour recognition method based on the integration of multigranular (IMG) motion features, which can provide support for deploying video analysis in multimedia IoT crowdsensing systems. In particular, we achieve reliable motion information modeling by integrating a channel attention-based short-term motion feature enhancement module (CSEM) and a cascaded long-term motion feature integration module (CLIM). We evaluate our model on several action recognition benchmarks, such as HMDB51, Something-Something and UCF101. The experimental results demonstrate that our approach outperforms the previous state-of-the-art methods, which confirms its effectiveness and efficiency.
Keywords: behaviour recognition; motion features; attention mechanism; Internet of Things; crowdsensing
An Attention-Based Approach to Enhance the Detection and Classification of Android Malware
16
Authors: Abdallah Ghourabi. Computers, Materials & Continua (SCIE, EI), 2024, No. 8, pp. 2743-2760 (18 pages)
The dominance of Android in the global mobile market and the open development characteristics of this platform have resulted in a significant increase in malware. These malicious applications have become a serious concern to the security of Android systems. To address this problem, researchers have proposed several machine-learning models to detect and classify Android malware based on analyzing features extracted from Android samples. However, most existing studies have focused on the classification task and overlooked the feature selection process, which is crucial to reduce the training time and maintain or improve the classification results. The current paper proposes a new Android malware detection and classification approach that identifies the most important features to improve classification performance and reduce training time. The proposed approach consists of two main steps. First, a feature selection method based on the Attention mechanism is used to select the most important features. Then, an optimized Light Gradient Boosting Machine (LightGBM) classifier is applied to classify the Android samples and identify the malware. The feature selection method proposed in this paper integrates an Attention layer into a multilayer perceptron neural network. The role of the Attention layer is to compute the weighted values of each feature based on its importance for the classification process. Experimental evaluation of the approach has shown that combining the Attention-based technique with an optimized classification algorithm for Android malware detection has improved the accuracy from 98.64% to 98.71% while reducing the training time from 80 to 28 s.
Keywords: Android malware; malware detection; feature selection; attention mechanism; LightGBM; mobile security
Learning Epipolar Line Window Attention for Stereo Image Super-Resolution Reconstruction
17
Authors: Xue Li, Hongying Zhang, Zixun Ye, Xiaoru. Computers, Materials & Continua (SCIE, EI), 2024, No. 2, pp. 2847-2864 (18 pages)
Transformer-based stereo image super-resolution reconstruction (Stereo SR) methods have significantly improved image quality. However, existing methods have deficiencies in paying attention to detailed features and do not consider the offset of pixels along the epipolar lines in complementary views when integrating stereo information. To address these challenges, this paper introduces a novel epipolar line window attention stereo image super-resolution network (EWASSR). For detail feature restoration, we design a feature extractor based on Transformer and convolutional neural network (CNN), which consists of (shifted) window-based self-attention ((S)W-MSA) and feature distillation and enhancement blocks (FDEB). This combination effectively solves the problem of global image perception and local feature attention and captures more discriminative high-frequency features of the image. Furthermore, to address the problem of offset of complementary pixels in stereo images, we propose an epipolar line window attention (EWA) mechanism, which divides windows along the epipolar direction to promote efficient matching of shifted pixels, even in pixel smooth areas. More accurate pixel matching can be achieved using adjacent pixels in the window as a reference. Extensive experiments demonstrate that our EWASSR can reconstruct more realistic detailed features. Comparative quantitative results show that, for 2× SR on the Middlebury and Flickr1024 datasets, EWASSR increases the peak signal-to-noise ratio (PSNR) by 0.37 dB and 0.34 dB, respectively, compared with the recent network.
Keywords: Stereo SR; epipolar line window attention; feature distillation
Joint Entity-Relation Extraction Model Integrating MacBERT and Talking-Heads Attention
18
作者 王春亮 姚洁仪 李昭 《现代电子技术》 北大核心 2024年第5期127-131,共5页
针对现有的医学文本关系抽取任务模型在训练过程中存在语义理解能力不足,可能导致关系抽取的效果不尽人意的问题,文中提出一种融合MacBERT和Talking⁃Heads Attention的实体关系联合抽取模型。该模型首先利用MacBERT语言模型来获取动态... 针对现有的医学文本关系抽取任务模型在训练过程中存在语义理解能力不足,可能导致关系抽取的效果不尽人意的问题,文中提出一种融合MacBERT和Talking⁃Heads Attention的实体关系联合抽取模型。该模型首先利用MacBERT语言模型来获取动态字向量表达,MacBERT作为改进的BERT模型,能够减少预训练和微调阶段之间的差异,从而提高模型的泛化能力;然后,将这些动态字向量表达输入到双向门控循环单元(BiGRU)中,以便提取文本的上下文特征。BiGRU是一种改进的循环神经网络(RNN),具有更好的长期依赖捕获能力。在获取文本上下文特征之后,使用Talking⁃Heads Attention来获取全局特征。Talking⁃Heads Attention是一种自注意力机制,可以捕获文本中不同位置之间的关系,从而提高关系抽取的准确性。实验结果表明,与实体关系联合抽取模型GRTE相比,该模型F1值提升1%,precision值提升0.4%,recall值提升1.5%。 展开更多
Keywords: MacBERT; BiGRU; relation extraction; medical text; Talking-Heads attention; deep learning; global features; neural network
Multi-Scale Mixed Attention Tea Shoot Instance Segmentation Model
19
Authors: Dongmei Chen, Peipei Cao, Lijie Yan, Huidong Chen, Jia Lin, Xin Li, Lin Yuan, Kaihua Wu 《Phyton-International Journal of Experimental Botany》 SCIE 2024, No. 2, pp. 261-275 (15 pages)
Tea leaf picking is a crucial stage in tea production that directly influences the quality and value of the tea. Traditional tea-picking machines may compromise the quality of the tea leaves. High-quality teas are often handpicked and need more delicate operations in intelligent picking machines. Compared with traditional image processing techniques, deep learning models have stronger feature extraction capabilities and better generalization, and are more suitable for practical tea shoot harvesting. However, current research mostly focuses on shoot detection and cannot directly accomplish end-to-end shoot segmentation tasks. We propose a tea shoot instance segmentation model based on multi-scale mixed attention (Mask2FusionNet) using a dataset from the tea garden in Hangzhou. We further analyzed the characteristics of the tea shoot dataset, where the proportion of small to medium-sized targets is 89.9%. Our algorithm is compared with several mainstream object segmentation algorithms, and the results demonstrate that our model achieves an accuracy of 82% in recognizing the tea shoots, showing a better performance compared to other models. Through ablation experiments, we found that ResNet50, PointRend strategy, and the Feature Pyramid Network (FPN) architecture can improve performance by 1.6%, 1.4%, and 2.4%, respectively. These experiments demonstrated that our proposed multi-scale and point selection strategy optimizes the feature extraction capability for overlapping small targets. The results indicate that the proposed Mask2FusionNet model can perform the shoot segmentation in unstructured environments, realizing the individual distinction of tea shoots and complete extraction of the shoot edge contours with a segmentation accuracy of 82.0%. The research results can provide algorithmic support for the segmentation and intelligent harvesting of premium tea shoots at different scales.
Keywords: Tea shoots; attention mechanism; multi-scale feature extraction; instance segmentation; deep learning
Few-shot image recognition based on multi-scale features prototypical network
20
Authors: LIU Jiatong, DUAN Yong 《High Technology Letters》 EI CAS 2024, No. 3, pp. 280-289 (10 pages)
In order to improve the model's capability in expressing features during few-shot learning, a multi-scale features prototypical network (MS-PN) algorithm is proposed. The metric learning algorithm is employed to extract image features and project them into a feature space, thus evaluating the similarity between samples based on their relative distances within the metric space. To sufficiently extract feature information from limited sample data and mitigate the impact of constrained data volume, a multi-scale feature extraction network is presented to capture data features at various scales during the process of image feature extraction. Additionally, the position of the prototype is fine-tuned by assigning weights to data points to mitigate the influence of outliers on the experiment. The loss function integrates contrastive loss and label-smoothing to bring similar data points closer and separate dissimilar data points within the metric space. Experimental evaluations are conducted on the small-sample datasets mini-ImageNet and CUB200-2011. The method in this paper can achieve higher classification accuracy. Specifically, in the 5-way 1-shot experiment, classification accuracy reaches 50.13% and 66.79% respectively on these two datasets. Moreover, in the 5-way 5-shot experiment, accuracies of 66.79% and 85.91% are observed, respectively.
Keywords: few-shot learning; multi-scale feature; prototypical network; channel attention; label-smoothing
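The prototype idea in the abstract above can be illustrated with a small sketch: each class prototype is a weighted average of its support embeddings, and a query is assigned to the nearest prototype. The abstract does not say how the weights are computed, so the inverse-distance weighting below is purely an assumption, as are the function names `weighted_prototype` and `classify` and the use of plain Euclidean distance on already-extracted feature vectors. This is not the MS-PN implementation.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def weighted_prototype(support):
    """Class prototype as a weighted mean of support embeddings.

    Assumed weighting (not from the paper): each sample's weight is
    1 / (1 + distance to the plain mean), so outliers count for less.
    """
    center = mean_vector(support)
    weights = [1.0 / (1.0 + euclidean(v, center)) for v in support]
    total = sum(weights)
    return [
        sum(w * v[i] for w, v in zip(weights, support)) / total
        for i in range(len(center))
    ]


def classify(query, prototypes):
    """Return the label whose prototype is closest to the query embedding."""
    return min(prototypes, key=lambda label: euclidean(query, prototypes[label]))


# Usage: a toy 2-way episode with 2-D embeddings (made-up numbers).
prototypes = {
    "a": weighted_prototype([[0.0, 0.0], [0.0, 2.0]]),
    "b": weighted_prototype([[5.0, 5.0], [7.0, 5.0]]),
}
label = classify([1.0, 1.0], prototypes)
```

With this weighting, a support sample far from the rest of its class pulls the prototype less than it would under a plain mean, which is the outlier-mitigation effect the abstract describes in general terms.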