Journal Articles
13 articles found
1. Multi-Layer Feature Extraction with Deformable Convolution for Fabric Defect Detection
Authors: Jielin Jiang, Chao Cui, Xiaolong Xu, Yan Cui. Intelligent Automation & Soft Computing, 2024, Issue 4, pp. 725-744 (20 pages).
In the textile industry, the presence of defects on the surface of fabric is an essential factor in determining fabric quality. Therefore, identifying fabric defects forms a crucial part of the fabric production process. Traditional fabric defect detection algorithms can only detect specific materials and specific fabric defect types; in addition, their detection efficiency is low and their detection results are relatively poor. Deep learning-based methods have many advantages in the field of fabric defect detection; however, such methods are less effective in identifying multi-scale fabric defects and defects with complex shapes. Therefore, we propose an effective algorithm, namely multi-layer feature extraction combined with deformable convolution (MFDC), for fabric defect detection. In MFDC, multi-layer feature extraction is used to fuse the underlying location features with high-level classification features through a horizontally connected top-down architecture to improve the detection of multi-scale fabric defects. On this basis, a deformable convolution is added to address the algorithm's weak detection of irregularly shaped fabric defects. In this approach, RoIAlign and Cascade R-CNN are integrated to enhance the adaptability of the algorithm to materials with complex patterned backgrounds. The experimental results show that the MFDC algorithm achieves good detection results for both multi-scale fabric defects and defects with complex shapes, at the expense of a small increase in detection time.
Keywords: fabric defect detection; multi-layer features; deformable convolution
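The sketch below (not the authors' code; channel sizes, layer names, and the offset head are illustrative assumptions) shows how an FPN-style top-down fusion of a low-level and a high-level feature map can be followed by a deformable convolution whose sampling offsets are predicted from the fused map, the two ingredients the MFDC abstract combines.

```python
# Top-down feature fusion followed by deformable convolution -- a hedged sketch.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.ops import DeformConv2d

class TopDownDeformBlock(nn.Module):
    def __init__(self, low_ch=256, high_ch=512, out_ch=256, k=3):
        super().__init__()
        self.lateral = nn.Conv2d(low_ch, out_ch, 1)                # 1x1 lateral connection
        self.reduce = nn.Conv2d(high_ch, out_ch, 1)                # align high-level channels
        self.offset = nn.Conv2d(out_ch, 2 * k * k, 3, padding=1)   # predict sampling offsets
        self.deform = DeformConv2d(out_ch, out_ch, k, padding=k // 2)

    def forward(self, low_feat, high_feat):
        # top-down: upsample the high-level feature and add the lateral low-level feature
        top_down = F.interpolate(self.reduce(high_feat), size=low_feat.shape[-2:],
                                 mode="nearest")
        fused = self.lateral(low_feat) + top_down
        # deformable convolution adapts the sampling grid to irregular defect shapes
        return self.deform(fused, self.offset(fused))

if __name__ == "__main__":
    low = torch.randn(1, 256, 64, 64)    # shallow, high-resolution feature map
    high = torch.randn(1, 512, 32, 32)   # deep, low-resolution feature map
    print(TopDownDeformBlock()(low, high).shape)  # torch.Size([1, 256, 64, 64])
```

The offset branch lets the 3×3 kernel sample off the regular grid, which is what gives deformable convolution its tolerance to irregularly shaped defects.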
2. Detection Algorithm of Laboratory Personnel Irregularities Based on Improved YOLOv7
Authors: Yongliang Yang, Linghua Xu, Maolin Luo, Xiao Wang, Min Cao. Computers, Materials & Continua (SCIE, EI), 2024, Issue 2, pp. 2741-2765 (25 pages).
Due to the complex environment of university laboratories, with dense personnel flow, irregular personnel behavior can easily create safety risks. Monitoring with mainstream detection algorithms suffers from low detection accuracy and slow speed. Therefore, the current management of personnel behavior mainly relies on institutional constraints, education and training, and on-site supervision, which is time-consuming and ineffective. Given this situation, this paper proposes an improved You Only Look Once version 7 (YOLOv7) to quickly detect irregular behaviors of laboratory personnel while ensuring high detection accuracy. First, to better capture the shape features of the target, deformable convolutional networks (DCN) are used in the backbone of the model to replace traditional convolution, improving detection accuracy and speed. Second, to enhance the extraction of important features and suppress useless ones, this paper proposes a new convolutional block attention module_efficient channel attention (CBAM_E), embedded in the neck network to improve the model's ability to extract features from complex scenes. Finally, to reduce the influence of the angle factor and improve bounding box regression accuracy, this paper proposes a new α-SCYLLA intersection over union (α-SIoU) to replace the complete intersection over union (CIoU), which improves regression accuracy while increasing convergence speed. Comparison experiments on public and homemade datasets show that the improved algorithm outperforms the original algorithm in all evaluation indexes, with an increase of 2.92% in precision, 4.14% in recall, 0.0356 in the weighted harmonic mean, and 3.60% in the mAP@0.5 value, together with a reduction in the number of parameters and complexity. Compared with mainstream algorithms, the improved algorithm has higher detection accuracy, faster convergence, and better practical recognition results, indicating the effectiveness of the improvements and their potential for practical application in laboratory scenarios.
Keywords: university laboratory; personnel behavior; YOLOv7; deformable convolutional networks; attention module; intersection over union
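The abstract does not give the internals of CBAM_E, so the following is only a hedged sketch of the efficient-channel-attention (ECA) idea it appears to build on; the class name ECABlock and its hyperparameters are assumptions.

```python
# ECA-style channel attention: global pooling + 1D conv across channels -- a sketch.
import math
import torch
import torch.nn as nn

class ECABlock(nn.Module):
    def __init__(self, channels, gamma=2, b=1):
        super().__init__()
        # kernel size adapts to the channel count, as in the original ECA formulation
        t = int(abs((math.log2(channels) + b) / gamma))
        k = t if t % 2 else t + 1
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        # squeeze spatial dims, then model cross-channel interaction with a 1D conv
        y = self.pool(x)                               # (N, C, 1, 1)
        y = self.conv(y.squeeze(-1).transpose(1, 2))   # (N, 1, C)
        y = self.sigmoid(y.transpose(1, 2).unsqueeze(-1))
        return x * y                                   # reweight channels

if __name__ == "__main__":
    feat = torch.randn(2, 512, 20, 20)
    print(ECABlock(512)(feat).shape)  # torch.Size([2, 512, 20, 20])
```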
3. A Deformable Network with Attention Mechanism for Retinal Vessel Segmentation
Authors: Xiaolong Zhu, Wenjian Li, Weihang Zhang, Dongwei Li, Huiqi Li. Journal of Beijing Institute of Technology (EI, CAS), 2024, Issue 3, pp. 186-193 (8 pages).
The intensive application of deep learning in medical image processing has facilitated the advancement of automatic retinal vessel segmentation research. To overcome the limitation that traditional U-shaped vessel segmentation networks fail to extract features from fundus images sufficiently, we propose a novel network (DSeU-net) based on deformable convolution and a squeeze-excitation residual module. The deformable convolution is utilized to dynamically adjust the receptive field for retinal vessel feature extraction, and the squeeze-excitation residual module is used to scale the weights of the low-level features so that the network learns the complex relationships among the different feature layers efficiently. We validate DSeU-net on three public retinal vessel segmentation datasets, DRIVE, CHASEDB1, and STARE, and the experimental results demonstrate the satisfactory segmentation performance of the network.
Keywords: retinal vessel segmentation; deformable convolution; attention mechanism; deep learning
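A minimal sketch, under assumed channel sizes, of a squeeze-and-excitation residual block of the kind DSeU-net uses to reweight low-level features; it is not the authors' exact module.

```python
# Squeeze-and-excitation residual block -- a hedged sketch with assumed sizes.
import torch
import torch.nn as nn

class SEResidualBlock(nn.Module):
    def __init__(self, channels=64, reduction=16):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
        )
        # squeeze: global average pooling; excitation: bottleneck MLP + sigmoid
        self.se = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.body(x)
        out = out * self.se(out)       # scale each channel by its learned weight
        return self.relu(out + x)      # residual connection

if __name__ == "__main__":
    print(SEResidualBlock()(torch.randn(1, 64, 48, 48)).shape)  # (1, 64, 48, 48)
```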
4. Pre-training transformer with dual-branch context content module for table detection in document images
Authors: Yongzhi LI, Pengle ZHANG, Meng SUN, Jin HUANG, Ruhan HE. Virtual Reality & Intelligent Hardware (EI), 2024, Issue 5, pp. 408-420 (13 pages).
Background: Document images such as statistical reports and scientific journals are widely used in information technology. Accurate detection of table areas in document images is an essential prerequisite for tasks such as information extraction. However, because of the diversity in the shapes and sizes of tables, existing table detection methods adapted from general object detection algorithms have not yet achieved satisfactory results. Incorrect detection results might lead to the loss of critical information. Methods: We therefore propose a novel end-to-end trainable deep network combined with a self-supervised pre-training transformer for feature extraction to minimize incorrect detections. To better deal with table areas of different shapes and sizes, we add a dual-branch context content attention module (DCCAM) to high-dimensional features to extract context content information, thereby enhancing the network's ability to learn shape features. For feature fusion at different scales, we replace the original 3×3 convolution with a multilayer residual module, which contains enhanced gradient flow information to improve feature representation and extraction capability. Results: We evaluated our method on public document datasets and compared it with previous methods; it achieves state-of-the-art results in terms of evaluation metrics such as recall and F1-score. Code: https://github.com/YongZ-Lee/TD-DCCAM.
Keywords: table detection; document image analysis; transformer; dilated convolution; deformable convolution; feature fusion
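The paper's multilayer residual module is not specified in the abstract; the sketch below only illustrates the general idea of replacing a plain 3×3 convolution with a residual fusion of dilated 3×3 branches. Dilation rates and channel counts are assumptions.

```python
# Residual fusion of multi-dilation 3x3 branches -- a hedged sketch.
import torch
import torch.nn as nn

class ResidualDilatedFusion(nn.Module):
    def __init__(self, channels=256, dilations=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(channels, channels, 3, padding=d, dilation=d),
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
            )
            for d in dilations
        )
        self.project = nn.Conv2d(channels * len(dilations), channels, 1)

    def forward(self, x):
        # concatenate the multi-dilation responses, project back, keep a residual path
        out = torch.cat([b(x) for b in self.branches], dim=1)
        return x + self.project(out)

if __name__ == "__main__":
    print(ResidualDilatedFusion()(torch.randn(1, 256, 32, 32)).shape)  # (1, 256, 32, 32)
```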
5. A Remote Sensing Image Semantic Segmentation Method by Combining Deformable Convolution with Conditional Random Fields (cited 12 times)
Authors: Zongcheng ZUO, Wen ZHANG, Dongying ZHANG. Journal of Geodesy and Geoinformation Science, 2020, Issue 3, pp. 39-49 (11 pages).
Currently, deep convolutional neural networks have made great progress in the field of semantic segmentation. Because of the fixed geometry of convolution kernels, standard convolutional neural networks have a limited ability to model geometric transformations. Therefore, a deformable convolution is introduced to enhance the adaptability of convolutional networks to spatial transformation. In addition, because of the pooling layers in the network architecture, deep convolutional neural networks cannot adequately segment local objects at the output layer. To overcome this shortcoming, the rough prediction results of the network output layer are processed by a fully connected conditional random field to improve the image segmentation. The proposed method can easily be trained end-to-end using standard backpropagation algorithms. Finally, the proposed method is tested on the ISPRS dataset. The results show that the proposed method can effectively overcome the influence of the complex structure of the segmentation objects and obtains state-of-the-art accuracy on the ISPRS Vaihingen 2D semantic labeling dataset.
Keywords: high-resolution remote sensing image; semantic segmentation; deformable convolution network; conditional random fields
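Refining a network's soft segmentation map with a fully connected CRF is commonly done with the pydensecrf package; the sketch below assumes that package (not the authors' implementation) and uses illustrative kernel parameters.

```python
# Fully connected CRF refinement of a softmax segmentation map -- a hedged sketch.
import numpy as np
import pydensecrf.densecrf as dcrf
from pydensecrf.utils import unary_from_softmax

def crf_refine(image, probs, iters=5):
    """image: HxWx3 uint8 RGB; probs: (n_classes, H, W) softmax output."""
    n_classes, h, w = probs.shape
    d = dcrf.DenseCRF2D(w, h, n_classes)
    d.setUnaryEnergy(unary_from_softmax(probs))          # -log(p) unary potentials
    d.addPairwiseGaussian(sxy=3, compat=3)               # smoothness kernel
    d.addPairwiseBilateral(sxy=60, srgb=10,              # appearance kernel
                           rgbim=np.ascontiguousarray(image), compat=5)
    q = d.inference(iters)
    return np.argmax(np.array(q), axis=0).reshape(h, w)  # refined label map

if __name__ == "__main__":
    img = np.random.randint(0, 255, (128, 128, 3), dtype=np.uint8)
    scores = np.random.rand(6, 128, 128).astype(np.float32)
    probs = scores / scores.sum(axis=0, keepdims=True)
    print(crf_refine(img, probs).shape)  # (128, 128)
```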
6. Research on Facial Fatigue Detection of Drivers with Multi-feature Fusion (cited 1 time)
Authors: YE Yuxuan, ZHOU Xianchun, WANG Wenyan, YANG Chuanbin, ZOU Qingyu. Instrumentation, 2023, Issue 1, pp. 23-31 (9 pages).
To address the shortcomings of current fatigue detection methods, such as low accuracy or poor real-time performance, a fatigue detection method based on multi-feature fusion is proposed. First, the HOG face detection algorithm and the KCF target tracking algorithm are integrated, and a deformable convolutional neural network is introduced to identify the state of the extracted eyes and mouth; detected faces are tracked quickly so that continuous and stable target faces can be extracted more efficiently. Then a head pose algorithm is introduced to detect the driver's head in real time and obtain the driver's head state information. Finally, a multi-feature fusion fatigue detection method is proposed based on the states of the eyes, mouth, and head. According to the experimental results, the proposed method can detect the driver's fatigue state in real time with high accuracy and good robustness compared with current fatigue detection algorithms.
Keywords: HOG; face posture detection; deformable convolution; multi-feature fusion; fatigue detection
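A minimal sketch, with assumed thresholds and weights, of the final multi-feature fusion step: eye, mouth, and head-pose states collected over a sliding window are combined into a single fatigue score. The detection and tracking front end (HOG, KCF, deformable CNN) is omitted.

```python
# Multi-feature fatigue decision from per-frame eye/mouth/head states -- a sketch.
from dataclasses import dataclass

@dataclass
class FrameState:
    eyes_closed: bool      # from the eye-state classifier
    yawning: bool          # from the mouth-state classifier
    head_down: bool        # from the head-pose estimator

def fatigue_score(history, w_eye=0.5, w_mouth=0.3, w_head=0.2):
    """history: recent FrameState samples over a sliding window."""
    n = len(history)
    perclos = sum(s.eyes_closed for s in history) / n     # fraction of closed-eye frames
    yawn_rate = sum(s.yawning for s in history) / n
    nod_rate = sum(s.head_down for s in history) / n
    return w_eye * perclos + w_mouth * yawn_rate + w_head * nod_rate

def is_fatigued(history, threshold=0.4):
    return fatigue_score(history) >= threshold

if __name__ == "__main__":
    window = [FrameState(i % 3 == 0, i % 7 == 0, i % 5 == 0) for i in range(90)]
    print(round(fatigue_score(window), 3), is_fatigued(window))
```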
7. HSCA-Net: A Hybrid Spatial-Channel Attention Network in Multiscale Feature Pyramid for Document Layout Analysis (cited 1 time)
Authors: Honghong Zhang, Canhui Xu, Cao Shi, Henyue Bi, Yuteng Li, Sami Mian. Journal of Artificial Intelligence and Technology, 2023, Issue 1, pp. 10-17 (8 pages).
Document images often contain various page components and complex logical structures, which make the document layout analysis task challenging. In most deep learning-based document layout analysis methods, convolutional neural networks (CNNs) are adopted as the feature extraction networks. In this paper, a hybrid spatial-channel attention network (HSCA-Net) is proposed to improve feature extraction capability by introducing an attention mechanism to explore more salient properties within document pages. The HSCA-Net consists of a spatial attention module (SAM), a channel attention module (CAM), and a designed lateral attention connection. CAM adaptively adjusts channel feature responses by emphasizing selective information, which depends on the contribution of the features of each channel. SAM guides CNNs to focus on informative content and capture global context information among page objects. The lateral attention connection incorporates SAM and CAM into the multiscale feature pyramid network and thus retains the original feature information. The effectiveness and adaptability of HSCA-Net are evaluated through multiple experiments on publicly available datasets such as PubLayNet, ICDAR-POD, and Article Regions. Experimental results demonstrate that HSCA-Net achieves state-of-the-art performance on the document layout analysis task.
Keywords: layout analysis; attention mechanism; deep learning; deformable convolution
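A hedged sketch of paired channel and spatial attention modules in the familiar CBAM style, to illustrate the CAM/SAM idea; the paper's exact lateral attention connection is not reproduced, and the module hyperparameters are assumptions.

```python
# CBAM-style channel and spatial attention modules -- a hedged sketch.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
        )

    def forward(self, x):
        avg = self.mlp(torch.mean(x, dim=(2, 3), keepdim=True))
        mx = self.mlp(torch.amax(x, dim=(2, 3), keepdim=True))
        return x * torch.sigmoid(avg + mx)   # per-channel reweighting

class SpatialAttention(nn.Module):
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        # pool across channels, then learn where to focus spatially
        avg = torch.mean(x, dim=1, keepdim=True)
        mx, _ = torch.max(x, dim=1, keepdim=True)
        return x * torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))

if __name__ == "__main__":
    feat = torch.randn(1, 256, 32, 32)
    out = SpatialAttention()(ChannelAttention(256)(feat))
    print(out.shape)  # torch.Size([1, 256, 32, 32])
```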
8. DT-Net: Joint Dual-Input Transformer and CNN for Retinal Vessel Segmentation
Authors: Wenran Jia, Simin Ma, Peng Geng, Yan Sun. Computers, Materials & Continua (SCIE, EI), 2023, Issue 9, pp. 3393-3411 (19 pages).
Retinal vessel segmentation in fundus images plays an essential role in the screening, diagnosis, and treatment of many diseases. Acquired fundus images generally suffer from uneven illumination, high noise, and complex structure, which makes vessel segmentation very challenging. Previous methods of retinal vascular segmentation mainly use convolutional neural networks built on the U-Net architecture, and they have many limitations and shortcomings, such as the loss of microvascular details at the ends of vessels. We address the limitations of convolution by introducing the transformer into retinal vessel segmentation. Therefore, we propose a hybrid method for retinal vessel segmentation based on modulated deformable convolution and the transformer, named DT-Net. First, multi-scale image features are extracted by deformable convolution and multi-head self-attention (MHSA). Second, image information is recovered and vessel morphology is refined by the proposed transformer decoder block. Finally, local prediction results are obtained by the side output layer. The accuracy of vessel segmentation is improved by a hybrid loss function. Experimental results show that our method obtains good segmentation performance in terms of specificity (SP), sensitivity (SE), accuracy (ACC), area under the curve (AUC), and F1-score on three publicly available fundus datasets: DRIVE, STARE, and CHASE_DB1.
Keywords: retinal vessel segmentation; deformable convolution; multi-scale; transformer; hybrid loss function
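The abstract mentions a hybrid loss but not its exact form; the sketch below assumes a common BCE-plus-Dice combination with an illustrative weighting.

```python
# Hybrid BCE + Dice segmentation loss -- a hedged sketch, not the paper's exact loss.
import torch
import torch.nn as nn

class HybridBCEDiceLoss(nn.Module):
    def __init__(self, bce_weight=0.5, eps=1e-6):
        super().__init__()
        self.bce = nn.BCEWithLogitsLoss()
        self.bce_weight = bce_weight
        self.eps = eps

    def forward(self, logits, target):
        bce = self.bce(logits, target)
        prob = torch.sigmoid(logits)
        inter = (prob * target).sum(dim=(1, 2, 3))
        union = prob.sum(dim=(1, 2, 3)) + target.sum(dim=(1, 2, 3))
        dice = 1 - ((2 * inter + self.eps) / (union + self.eps)).mean()
        return self.bce_weight * bce + (1 - self.bce_weight) * dice

if __name__ == "__main__":
    logits = torch.randn(2, 1, 64, 64)
    target = torch.randint(0, 2, (2, 1, 64, 64)).float()
    print(HybridBCEDiceLoss()(logits, target).item())
```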
9. RealFuVSR: Feature enhanced real-world video super-resolution
Authors: Zhi LI, Xiongwen PANG, Yiyue JIANG, Yujie WANG. Virtual Reality & Intelligent Hardware (EI), 2023, Issue 6, pp. 523-537 (15 pages).
Background: Recurrent recovery is a common approach to video super-resolution (VSR) that models the correlation between frames via hidden states. However, applying this structure in real-world scenarios can lead to unsatisfactory artifacts. We found that in real-world VSR training, the use of unknown and complex degradation can better simulate the degradation process of the real world. Methods: Based on this, we propose the RealFuVSR model, which simulates real-world degradation and mitigates the artifacts caused by VSR. Specifically, we propose a multiscale feature extraction (MSF) module that extracts and fuses features from multiple scales, thereby facilitating the elimination of hidden-state artifacts. To improve the accuracy of the hidden-state alignment information, RealFuVSR uses an advanced optical-flow-guided deformable convolution. Moreover, a cascaded residual upsampling module is used to eliminate the noise caused by the upsampling process. Results: Experiments demonstrate that the RealFuVSR model not only recovers high-quality videos but also outperforms the state-of-the-art RealBasicVSR and RealESRGAN models.
Keywords: video super-resolution; deformable convolution; cascade residual upsampling; second-order degradation; multi-scale feature extraction
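A hedged sketch of a cascaded residual upsampling tail built from two ×2 PixelShuffle stages with residual refinement, illustrating the upsampling idea; the authors' exact module and channel widths are not known and are assumed here.

```python
# Cascaded residual upsampling (two x2 PixelShuffle stages) -- a hedged sketch.
import torch
import torch.nn as nn

class ResidualUpsampleStage(nn.Module):
    def __init__(self, channels=64):
        super().__init__()
        self.up = nn.Sequential(
            nn.Conv2d(channels, channels * 4, 3, padding=1),
            nn.PixelShuffle(2),                       # x2 spatial upsampling
        )
        self.refine = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.LeakyReLU(0.1, inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )

    def forward(self, x):
        x = self.up(x)
        return x + self.refine(x)                     # residual cleanup after upsampling

class CascadedUpsampler(nn.Module):
    def __init__(self, channels=64, out_channels=3, stages=2):
        super().__init__()
        self.stages = nn.Sequential(*[ResidualUpsampleStage(channels) for _ in range(stages)])
        self.to_rgb = nn.Conv2d(channels, out_channels, 3, padding=1)

    def forward(self, x):
        return self.to_rgb(self.stages(x))            # overall x4 super-resolution

if __name__ == "__main__":
    print(CascadedUpsampler()(torch.randn(1, 64, 32, 32)).shape)  # (1, 3, 128, 128)
```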
10. Method to Appraise Dangerous Class of Building Masonry Component Based on DC-YOLO Model
Authors: Hongrui Zhang, Wenxue Wei, Xinguang Xiao, Song Yang, Wanlu Shao. Computers, Materials & Continua (SCIE, EI), 2020, Issue 4, pp. 457-468 (12 pages).
The DC-YOLO model was designed to improve the efficiency of appraising the dangerous class of buildings and to avoid manual intervention, thereby making the appraisal results more objective. It is an automated method based on deep learning and target detection algorithms for appraising the dangerous class of building masonry components. Specifically, it (1) adopts K-means clustering to obtain the quantity and size of the prior boxes; (2) expands the grid size to improve the identification of small targets; and (3) introduces deformable convolution to adapt to the irregular shape of masonry component cracks. The experimental results show that, compared with the conventional method, the DC-YOLO model has better recognition rates for various targets to different extents and achieves good precision, recall, and F1 values, which indicates good performance in classifying dangerous classes of building masonry components.
Keywords: deep learning; masonry component; appraisal of dangerous class; deformable convolution
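A minimal sketch of K-means anchor clustering with an IoU-based distance, the standard way YOLO-style prior boxes are derived from labelled box sizes; the random box data stands in for the masonry-crack annotations.

```python
# K-means clustering of box widths/heights with an IoU distance -- a sketch.
import numpy as np

def iou_wh(boxes, clusters):
    """IoU between (N, 2) box sizes and (K, 2) cluster sizes, centres aligned."""
    inter = np.minimum(boxes[:, None, 0], clusters[None, :, 0]) * \
            np.minimum(boxes[:, None, 1], clusters[None, :, 1])
    area_b = (boxes[:, 0] * boxes[:, 1])[:, None]
    area_c = (clusters[:, 0] * clusters[:, 1])[None, :]
    return inter / (area_b + area_c - inter)

def kmeans_anchors(boxes, k=9, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    clusters = boxes[rng.choice(len(boxes), k, replace=False)]
    for _ in range(iters):
        assign = np.argmax(iou_wh(boxes, clusters), axis=1)   # max IoU = min (1 - IoU) distance
        new = np.array([boxes[assign == i].mean(axis=0) if np.any(assign == i)
                        else clusters[i] for i in range(k)])
        if np.allclose(new, clusters):
            break
        clusters = new
    return clusters[np.argsort(clusters.prod(axis=1))]        # sort anchors by area

if __name__ == "__main__":
    widths_heights = np.abs(np.random.randn(500, 2)) * 60 + 20
    print(kmeans_anchors(widths_heights, k=9).round(1))
```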
11. DSD-MatchingNet: Deformable sparse-to-dense feature matching for learning accurate correspondences
Authors: Yicheng ZHAO, Han ZHANG, Ping LU, Ping LI, Enhua WU, Bin SHENG. Virtual Reality & Intelligent Hardware, 2022, Issue 5, pp. 432-443 (12 pages).
Background: Exploring correspondences across multiview images is the basis of various computer vision tasks. However, most existing methods have limited accuracy under challenging conditions. Method: To learn more robust and accurate correspondences, we propose DSD-MatchingNet for local feature matching in this study. First, we develop a deformable feature extraction module to obtain multilevel feature maps, which harvest contextual information from dynamic receptive fields. The dynamic receptive fields provided by the deformable convolution network ensure that our method obtains dense and robust correspondences. Second, we utilize sparse-to-dense matching with symmetry of correspondence to implement accurate pixel-level matching, which enables our method to produce more accurate correspondences. Result: Experiments show that our proposed DSD-MatchingNet achieves better performance on the image matching benchmark as well as on the visual localization benchmark. Specifically, our method achieved 91.3% mean matching accuracy on the HPatches dataset and 99.3% visual localization recall on the Aachen Day-Night dataset.
Keywords: image matching; deformable convolution network; sparse-to-dense matching
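A hedged sketch of symmetric (mutual nearest-neighbour) descriptor matching, the kind of correspondence symmetry the abstract refers to; the descriptors here are random stand-ins and the function is illustrative, not the paper's matcher.

```python
# Mutual nearest-neighbour matching of two descriptor sets -- a sketch.
import torch

def mutual_nn_match(desc_a, desc_b):
    """desc_a: (N, D), desc_b: (M, D) L2-normalised descriptors."""
    sim = desc_a @ desc_b.t()                      # cosine similarity matrix
    nn_ab = sim.argmax(dim=1)                      # best match in B for each A
    nn_ba = sim.argmax(dim=0)                      # best match in A for each B
    idx_a = torch.arange(desc_a.shape[0])
    keep = nn_ba[nn_ab] == idx_a                   # keep only mutual agreements
    return torch.stack([idx_a[keep], nn_ab[keep]], dim=1)

if __name__ == "__main__":
    a = torch.nn.functional.normalize(torch.randn(512, 128), dim=1)
    b = torch.nn.functional.normalize(torch.randn(480, 128), dim=1)
    print(mutual_nn_match(a, b).shape)             # (num_matches, 2)
```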
12. Space-time video super-resolution using long-term temporal feature aggregation
Authors: Kuanhao Chen, Zijie Yue, Miaojing Shi. Autonomous Intelligent Systems (EI), 2023, Issue 1, pp. 75-83 (9 pages).
Space-time video super-resolution (STVSR) serves to reconstruct high-resolution, high-frame-rate videos from their low-resolution, low-frame-rate counterparts. Recent approaches utilize end-to-end deep learning models to achieve STVSR. They first interpolate intermediate frame features between given frames, then perform local and global refinement over the feature sequence, and finally increase the spatial resolution of these features. However, in the most important feature interpolation phase, they only capture spatial-temporal information from the most adjacent frame features, neglecting to model long-term spatial-temporal correlations between multiple neighbouring frames to restore variable-speed object movements and maintain long-term motion continuity. In this paper, we propose a novel long-term temporal feature aggregation network (LTFA-Net) for STVSR. Specifically, we design a long-term mixture of experts (LTMoE) module for feature interpolation. LTMoE contains multiple experts to extract mutual and complementary spatial-temporal information from multiple consecutive adjacent frame features, which are then combined with different weights, obtained from several gating nets, to produce the interpolation results. Next, we perform local and global feature refinement using the locally-temporal feature comparison (LFC) module and a bidirectional deformable ConvLSTM layer, respectively. Experimental results on two standard benchmarks, Adobe240 and GoPro, indicate the effectiveness and superiority of our approach over the state of the art.
Keywords: space-time video super-resolution; mixture of experts; deformable convolutional layer; long-term temporal feature aggregation
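A minimal sketch of a gated mixture-of-experts feature combiner, illustrating the LTMoE idea of weighting several expert features with learned gate scores; the number of experts, their form, and the gating design are assumptions.

```python
# Gated mixture-of-experts feature combination -- a hedged sketch.
import torch
import torch.nn as nn

class FeatureMoE(nn.Module):
    def __init__(self, channels=64, num_experts=4):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=1) for _ in range(num_experts)
        )
        # gate predicts one weight per expert from the globally pooled input
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(channels, num_experts),
            nn.Softmax(dim=1),
        )

    def forward(self, x):
        weights = self.gate(x)                                   # (N, E)
        outs = torch.stack([e(x) for e in self.experts], dim=1)  # (N, E, C, H, W)
        return (weights[:, :, None, None, None] * outs).sum(dim=1)

if __name__ == "__main__":
    print(FeatureMoE()(torch.randn(2, 64, 32, 32)).shape)  # (2, 64, 32, 32)
```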
13. Research on MRI Brain Tumor Image Segmentation Based on Dual-Branch Feature Fusion (cited 2 times)
Authors: 熊炜, 周蕾, 乐玲, 张开, 李利荣. 《光电子·激光》 (Journal of Optoelectronics·Laser; CAS, CSCD, PKU Core), 2022, Issue 4, pp. 383-392 (10 pages).
To address the misidentification of brain tumor regions in magnetic resonance imaging (MRI) and the loss of spatial information in segmentation networks, an MRI brain tumor image segmentation method based on dual-branch feature fusion is proposed. First, a re-parameterization visual geometry group and attention model (RVAM) in the main branch extracts the network's contextual information; then a deformable convolution and pyramid pooling model (DCPM) in the auxiliary branch captures rich spatial information; a feature fusion module subsequently fuses the feature information of the two branches. Finally, an attention model is introduced to strengthen the weight of the segmentation target during decoding in the upsampling process. The proposed method was experimentally validated on the Kaggle_3m and BraTS2019 datasets, and the results show that it achieves good brain tumor segmentation performance; on Kaggle_3m, the Dice similarity coefficient and Jaccard index reach 91.45% and 85.19%, respectively.
Keywords: magnetic resonance imaging (MRI) brain tumor image segmentation; dual-branch feature fusion; re-parameterization visual geometry group and attention model (RVAM); deformable convolution and pyramid pooling model (DCPM)
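A hedged sketch of a pyramid pooling module of the kind the DCPM branch combines with deformable convolution; pooling scales and channel counts are assumptions, not the paper's settings.

```python
# Pyramid pooling module: pool at several grid sizes, upsample, fuse -- a sketch.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PyramidPooling(nn.Module):
    def __init__(self, in_ch=512, out_ch=512, scales=(1, 2, 3, 6)):
        super().__init__()
        branch_ch = in_ch // len(scales)
        self.branches = nn.ModuleList(
            nn.Sequential(nn.AdaptiveAvgPool2d(s), nn.Conv2d(in_ch, branch_ch, 1))
            for s in scales
        )
        self.project = nn.Conv2d(in_ch + branch_ch * len(scales), out_ch, 3, padding=1)

    def forward(self, x):
        h, w = x.shape[-2:]
        # pool at several grid sizes, upsample back, and fuse with the input
        pooled = [F.interpolate(b(x), size=(h, w), mode="bilinear", align_corners=False)
                  for b in self.branches]
        return self.project(torch.cat([x, *pooled], dim=1))

if __name__ == "__main__":
    print(PyramidPooling()(torch.randn(1, 512, 32, 32)).shape)  # (1, 512, 32, 32)
```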