A Lightweight Convolutional Neural Network with Hierarchical Multi-Scale Feature Fusion for Image Classification
1
Authors: Adama Dembele, Ronald Waweru Mwangi, Ananda Omutokoh Kube. Journal of Computer and Communications, 2024, Issue 2, pp. 173-200 (28 pages)
Convolutional neural networks (CNNs) are widely used in image classification tasks, but their increasing model size and computation make them challenging to implement on embedded systems with constrained hardware resources. To address this issue, the MobileNetV1 network was developed, which employs depthwise convolution to reduce network complexity. MobileNetV1 employs a stride of 2 in several convolutional layers to decrease the spatial resolution of feature maps, thereby lowering computational costs. However, this stride setting can lead to a loss of spatial information, particularly affecting the detection and representation of smaller objects or finer details in images. To maintain the trade-off between complexity and model performance, a lightweight convolutional neural network with hierarchical multi-scale feature fusion based on the MobileNetV1 network is proposed. The network consists of two main subnetworks. The first subnetwork uses a depthwise dilated separable convolution (DDSC) layer to learn imaging features with fewer parameters, which results in a lightweight and computationally inexpensive network. Furthermore, the depthwise dilated convolution in the DDSC layer effectively expands the field of view of the filters, allowing them to incorporate a larger context. The second subnetwork is a hierarchical multi-scale feature fusion (HMFF) module that uses a parallel multi-resolution branch architecture to process the input feature map and extract multi-scale feature information from the input image. Experimental results on the CIFAR-10, Malaria, and KvasirV1 datasets demonstrate that the proposed method is efficient, reducing the network parameters and computational cost by 65.02% and 39.78%, respectively, while maintaining the network performance compared to the MobileNetV1 baseline.
Keywords: MobileNet; image classification; lightweight convolutional neural network; depthwise dilated separable convolution; hierarchical multi-scale feature fusion
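The parameter saving and the wider field of view described for the DDSC layer can be illustrated with simple arithmetic. The sketch below is not the authors' implementation: it only counts weights for a generic standard convolution versus a generic depthwise separable one, and computes the span of a dilated kernel. The function names and the layer sizes (3x3 kernel, 64 input channels, 128 output channels) are assumptions chosen for illustration.

```python
def standard_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    return k * k * c_in + c_in * c_out


def effective_kernel_size(k: int, dilation: int) -> int:
    """Span covered by a k-tap kernel whose taps are `dilation` pixels apart."""
    return k + (k - 1) * (dilation - 1)


if __name__ == "__main__":
    # Hypothetical layer sizes, not taken from the paper.
    k, c_in, c_out = 3, 64, 128
    print(standard_conv_params(k, c_in, c_out))        # 73728
    print(depthwise_separable_params(k, c_in, c_out))  # 8768
    print(effective_kernel_size(k, dilation=2))        # 5
```

With these illustrative sizes the separable layer uses roughly 12% of the weights of the standard layer, and a dilation of 2 lets a 3-tap kernel cover 5 pixels without adding weights.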
Multi-Scale Convolutional Gated Recurrent Unit Networks for Tool Wear Prediction in Smart Manufacturing (Cited by: 2)
2
Authors: Weixin Xu, Huihui Miao, Zhibin Zhao, Jinxin Liu, Chuang Sun, Ruqiang Yan. Chinese Journal of Mechanical Engineering (SCIE, EI, CAS, CSCD), 2021, Issue 3, pp. 130-145 (16 pages)
As an integrated application of modern information technologies and artificial intelligence, Prognostic and Health Management (PHM) is important for machine health monitoring. Prediction of tool wear is one of the typical applications of PHM technology in modern manufacturing systems and industry. In this paper, a multi-scale Convolutional Gated Recurrent Unit network (MCGRU) is proposed to process raw sensory data for tool wear prediction. At the bottom of MCGRU, six parallel and independent branches with different kernel sizes are designed to form a multi-scale convolutional neural network, which improves adaptability to features of different time scales. These features of different scales extracted from raw data are then fed into a Deep Gated Recurrent Unit network to capture long-term dependencies and learn significant representations. At the top of the MCGRU, a fully connected layer and a regression layer are built for cutting tool wear prediction. Two case studies are performed to verify the capability and effectiveness of the proposed MCGRU network, and the results show that MCGRU outperforms several state-of-the-art baseline models.
Keywords: tool wear prediction; multi-scale; convolutional neural networks; gated recurrent unit
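The idea of parallel branches with different kernel sizes can be sketched on a 1-D signal. This is a minimal illustration, not the MCGRU architecture: the averaging kernels stand in for learned weights, the three kernel sizes are hypothetical (the abstract mentions six branches without listing their sizes), and zero padding is an assumption made so that all branches produce outputs of equal length.

```python
from typing import List, Sequence


def conv1d_same(signal: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """Slide an odd-length kernel over a zero-padded signal, keeping the length."""
    k = len(kernel)
    if k % 2 == 0:
        raise ValueError("kernel length must be odd for symmetric padding")
    pad = k // 2
    padded = [0.0] * pad + [float(v) for v in signal] + [0.0] * pad
    return [
        sum(padded[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal))
    ]


def multi_scale_features(
    signal: Sequence[float], kernel_sizes: Sequence[int] = (1, 3, 5)
) -> List[List[float]]:
    """Run one branch per kernel size and stack the results per time step."""
    branches = []
    for k in kernel_sizes:
        kernel = [1.0 / k] * k  # placeholder for learned weights
        branches.append(conv1d_same(signal, kernel))
    # Transpose so each time step carries one value from every branch.
    return [list(step) for step in zip(*branches)]


if __name__ == "__main__":
    for step in multi_scale_features([0, 0, 3, 0, 0]):
        print(step)
```

Each time step ends up with one feature per scale, which is the form a recurrent layer such as a GRU could then consume.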
Pedestrian attribute classification with multi-scale and multi-label convolutional neural networks
3
Authors: 朱建清, Zeng Huanqiang, Zhang Yuzhao, Zheng Lixin, Cai Canhui. High Technology Letters (EI, CAS), 2018, Issue 1, pp. 53-61 (9 pages)
Pedestrian attribute classification from a pedestrian image captured in surveillance scenarios is challenging due to diverse clothing appearances, varied poses, and different camera views. A multi-scale and multi-label convolutional neural network (MSMLCNN) is proposed to predict multiple pedestrian attributes simultaneously. The pedestrian attribute classification problem is first transformed into a multi-label problem comprising multiple binary attributes that need to be classified. Then, the multi-label problem is solved by fully connecting all binary attributes to multi-scale features with logistic regression functions. Moreover, the multi-scale features are obtained by concatenating the feature maps produced by multiple pooling layers of the MSMLCNN at different scales. Extensive experimental results show that the proposed MSMLCNN outperforms state-of-the-art pedestrian attribute classification methods by a large margin.
Keywords: pedestrian attribute classification; multi-scale features; multi-label classification; convolutional neural network (CNN)
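Treating each attribute as its own binary logistic regression over a shared feature vector can be sketched as follows. This is only an illustration of the multi-label formulation, not the MSMLCNN itself: the feature vector, weights, biases, and the 0.5 decision threshold are all hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    """Logistic function mapping a score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_attributes(
    features: Sequence[float],
    weights: Sequence[Sequence[float]],
    biases: Sequence[float],
    threshold: float = 0.5,
) -> Tuple[List[float], List[bool]]:
    """Score every binary attribute independently from the same features."""
    probs = []
    for w, b in zip(weights, biases):
        z = sum(wi * xi for wi, xi in zip(w, features)) + b
        probs.append(sigmoid(z))
    labels = [p >= threshold for p in probs]
    return probs, labels


if __name__ == "__main__":
    # Two made-up attributes scored from a two-dimensional feature vector.
    probs, labels = predict_attributes(
        features=[1.0, 2.0],
        weights=[[1.0, 0.0], [0.0, -1.0]],
        biases=[0.0, 0.0],
    )
    print(probs, labels)
```

Because each attribute has its own sigmoid, several attributes can be predicted as present at once, unlike a softmax over mutually exclusive classes.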
Chinese named entity recognition with multi-network fusion of multi-scale lexical information
4
Authors: Yan Guo, Hong-Chen Liu, Fu-Jiang Liu, Wei-Hua Lin, Quan-Sen Shao, Jun-Shun Su. Journal of Electronic Science and Technology (EI, CAS, CSCD), 2024, Issue 4, pp. 53-80 (28 pages)
Named entity recognition (NER) is an important part of knowledge extraction and one of the main tasks in constructing knowledge graphs. In current Chinese named entity recognition (CNER) tasks, the BERT-BiLSTM-CRF model is widely used and often yields notable results. However, recognizing each entity with high accuracy remains challenging. Many entities do not appear as single words but as part of complex phrases, making it difficult to achieve accurate recognition using word embedding information alone because the intricate lexical structure often impacts the performance. To address this issue, we propose an improved Bidirectional Encoder Representations from Transformers (BERT) character word conditional random field (CRF) (BCWC) model. It incorporates a pre-trained word embedding model using the skip-gram with negative sampling (SGNS) method, alongside traditional BERT embeddings. By comparing datasets with different word segmentation tools, we obtain enhanced word embedding features for segmented data. These features are then processed using multi-scale convolution and iterated dilated convolutional neural networks (IDCNNs) with varying expansion rates to capture features at multiple scales and extract diverse contextual information. Additionally, a multi-attention mechanism is employed to fuse word and character embeddings. Finally, CRFs are applied to learn sequence constraints and optimize entity label annotations. A series of experiments conducted on three public datasets demonstrates that the proposed method outperforms recent advanced baselines. BCWC is capable of addressing the challenge of recognizing complex entities by combining character-level and word-level embedding information, thereby improving the accuracy of CNER. Such a model has potential for applications that need more precise knowledge extraction, such as knowledge graph construction and information retrieval, particularly in domain-specific natural language processing tasks that require high entity recognition precision.
Keywords: bi-directional long short-term memory (BiLSTM); Chinese named entity recognition (CNER); iterated dilated convolutional neural network (IDCNN); multi-network integration; multi-scale lexical features
Image Denoising Using Dual Convolutional Neural Network with Skip Connection
5
Authors: Mengnan Lü, Xianchun Zhou, Zhiting Du, Yuze Chen, Binxin Tang. Instrumentation, 2024, Issue 3, pp. 74-85 (12 pages)
In recent years, deep convolutional neural networks have shown superior performance in image denoising. However, deep network structures often come with a large number of model parameters, leading to high training costs and long inference times, limiting their practical application in denoising tasks. This paper proposes a new dual convolutional denoising network with skip connections (DECDNet), which achieves an ideal balance between denoising effect and network complexity. The proposed DECDNet consists of a noise estimation network, a multi-scale feature extraction network, a dual convolutional neural network, and dual attention mechanisms. The noise estimation network is used to estimate the noise level map, and the multi-scale feature extraction network is combined with it to improve the model's flexibility in obtaining image features. The dual convolutional neural network branch design includes interactive connections between convolution and dilated convolution, with the lower branch consisting of dilated convolution layers, and both branches using skip connections. Experiments show that, compared with other models, the proposed DECDNet achieves superior PSNR and SSIM values at all compared noise levels, especially at higher noise levels, showing robustness to images with higher noise levels. It also demonstrates better visual effects, maintaining a balance between denoising and detail preservation.
Keywords: image denoising; convolutional neural network; skip connections; multi-scale feature extraction network; noise estimation network
A multi-scale convolutional auto-encoder and its application in fault diagnosis of rolling bearings (Cited by: 10)
6
Authors: Ding Yunhao, Jia Minping. Journal of Southeast University (English Edition) (EI, CAS), 2019, Issue 4, pp. 417-423 (7 pages)
Aiming at the difficulty of fault identification caused by manual extraction of fault features of rotating machinery, a one-dimensional multi-scale convolutional auto-encoder fault diagnosis model is proposed, based on the standard convolutional auto-encoder. In this model, parallel convolutional and deconvolutional kernels of different scales are used to extract features from the input signal and reconstruct the input signal; then the feature map extracted by the multi-scale convolutional kernels is used as the input of the classifier; and finally the parameters of the whole model are fine-tuned using labeled data. Experiments on one set of simulation fault data and two sets of rolling bearing fault data are conducted to validate the proposed method. The results show that the model can achieve 99.75%, 99.3%, and 100% diagnostic accuracy, respectively. In addition, the diagnostic accuracy and reconstruction error of the one-dimensional multi-scale convolutional auto-encoder are compared with traditional machine learning, convolutional neural networks, and a traditional convolutional auto-encoder. The final results show that the proposed model has a better recognition effect for rolling bearing fault data.
Keywords: fault diagnosis; deep learning; convolutional auto-encoder; multi-scale convolutional kernel; feature extraction
MSSTNet: Multi-scale facial videos pulse extraction network based on separable spatiotemporal convolution and dimension separable attention
7
Authors: Changchen ZHAO, Hongsheng WANG, Yuanjing FENG. Virtual Reality & Intelligent Hardware, 2023, Issue 2, pp. 124-141 (18 pages)
Background: The use of remote photoplethysmography (rPPG) to estimate blood volume pulse in a noncontact manner has been an active research topic in recent years. Existing methods are primarily based on a single-scale region of interest (ROI). However, some noise signals that are not easily separated in a single-scale space can be easily separated in a multi-scale space. Also, existing spatiotemporal networks mainly focus on local spatiotemporal information and do not emphasize temporal information, which is crucial in pulse extraction problems, resulting in insufficient spatiotemporal feature modelling. Methods: Here, we propose a multi-scale facial video pulse extraction network based on separable spatiotemporal convolution (SSTC) and dimension separable attention (DSAT). First, to solve the problem of a single-scale ROI, we constructed a multi-scale feature space for initial signal separation. Second, SSTC and DSAT were designed for efficient spatiotemporal correlation modeling, which increased the information interaction between the long-span time and space dimensions; this placed more emphasis on temporal features. Results: The signal-to-noise ratio (SNR) of the proposed network reached 9.58 dB on the PURE dataset and 6.77 dB on the UBFC-rPPG dataset, outperforming state-of-the-art algorithms. Conclusions: The results showed that fusing multi-scale signals yielded better results than methods based on only single-scale signals. The proposed SSTC and dimension-separable attention mechanism will contribute to more accurate pulse signal extraction.
Keywords: remote photoplethysmography; heart rate; separable spatiotemporal convolution; dimension separable attention; multi-scale; neural network
Occluded Gait Emotion Recognition Based on Multi-Scale Suppression Graph Convolutional Network
8
Authors: Yuxiang Zou, Ning He, Jiwu Sun, Xunrui Huang, Wenhua Wang. Computers, Materials & Continua (SCIE, EI), 2025, Issue 1, pp. 1255-1276 (22 pages)
In recent years, gait-based emotion recognition has been widely applied in the field of computer vision. However, existing gait emotion recognition methods typically rely on complete human skeleton data, and their accuracy significantly declines when the data is occluded. To enhance the accuracy of gait emotion recognition under occlusion, this paper proposes a Multi-scale Suppression Graph Convolutional Network (MS-GCN). The MS-GCN consists of three main components: a Joint Interpolation Module (JI Module), a Multi-scale Temporal Convolution Network (MS-TCN), and a Suppression Graph Convolutional Network (SGCN). The JI Module completes the spatially occluded skeletal joints using the K-Nearest Neighbors (KNN) interpolation method. The MS-TCN employs convolutional kernels of various sizes to comprehensively capture the emotional information embedded in the gait, compensating for the temporal occlusion of gait information. The SGCN extracts more non-prominent human gait features by suppressing the extraction of key body part features, thereby reducing the negative impact of occlusion on emotion recognition results. The proposed method is evaluated on two comprehensive datasets: Emotion-Gait, containing 4227 real gaits from sources like BML, ICT-Pollick, and ELMD, and 1000 synthetic gaits generated using STEP-Gen technology; and ELMB, consisting of 3924 gaits, with 1835 labeled with emotions such as "Happy," "Sad," "Angry," and "Neutral." On the standard datasets Emotion-Gait and ELMB, the proposed method achieved accuracies of 0.900 and 0.896, respectively, attaining performance comparable to other state-of-the-art methods. Furthermore, on occlusion datasets, the proposed method significantly mitigates the performance degradation caused by occlusion, and its accuracy is significantly higher than that of other methods.
Keywords: KNN interpolation; multi-scale temporal convolution; suppression graph convolutional network; gait emotion recognition; human skeleton
Lightweight Image Super-Resolution via Weighted Multi-Scale Residual Network (Cited by: 8)
9
Authors: Long Sun, Zhenbing Liu, Xiyan Sun, Licheng Liu, Rushi Lan, Xiaonan Luo. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2021, Issue 7, pp. 1271-1280 (10 pages)
The tradeoff between efficiency and model size of the convolutional neural network (CNN) is an essential issue for applications of CNN-based algorithms to diverse real-world tasks. Although deep learning-based methods have achieved significant improvements in image super-resolution (SR), current CNN-based techniques mostly involve massive parameter counts and high computational complexity, limiting their practical applications. In this paper, we present a fast and lightweight framework, named weighted multi-scale residual network (WMRN), for a better tradeoff between SR performance and computational efficiency. With the modified residual structure, depthwise separable convolutions (DS Convs) are employed to improve the efficiency of convolutional operations. Furthermore, several weighted multi-scale residual blocks (WMRBs) are stacked to enhance the multi-scale representation capability. In the reconstruction subnetwork, a group of Conv layers is introduced to filter feature maps and reconstruct the final high-quality image. Extensive experiments were conducted to evaluate the proposed model, and the comparative results with several state-of-the-art algorithms demonstrate the effectiveness of WMRN.
Keywords: convolutional neural network (CNN); lightweight framework; multi-scale; super-resolution
Forest fire smoke recognition based on convolutional neural network (Cited by: 3)
10
Authors: Xiaofang Sun, Liping Sun, Yinglai Huang. Journal of Forestry Research (SCIE, CAS, CSCD), 2021, Issue 5, pp. 1921-1927 (7 pages)
Traditional fire smoke detection methods mostly rely on manual algorithm extraction and sensor detection; however, these methods are slow and expensive to achieve discrimination. We proposed an improved convolutional neural network (CNN) to achieve fast analysis. The improved CNN can be used to liberate manpower. The network does not require complicated manual feature extraction to identify forest fire smoke. First, to alleviate the computational pressure and speed up discrimination, kernel principal component analysis was performed on the experimental data set. To improve the robustness of the CNN and to avoid overfitting, optimization strategies were applied in multi-convolution kernels and batch normalization to improve loss functions. The experimental analysis shows that the CNN proposed in this study can automatically learn the feature information of smoke images in the early stages of fire with a high recognition rate. As a result, the improved CNN enriches the theory of smoke discrimination in the early stages of a forest fire.
Keywords: forest fire smoke; convolutional neural network; image classification; kernel principal component analysis
Defect Detection Algorithm of Patterned Fabrics Based on Convolutional Neural Network (Cited by: 1)
11
Authors: XU Yang, FEI Libin, YU Zhiqi, SHENG Xiaowei. Journal of Donghua University (English Edition) (CAS), 2021, Issue 1, pp. 36-42 (7 pages)
The background pattern of patterned fabrics is complex and strongly interferes with the extraction of defect features. Traditional machine vision algorithms rely on hand-designed features, which are greatly affected by background patterns and struggle to extract flaw features effectively. Therefore, a convolutional neural network (CNN) with automatic feature extraction is proposed. On the basis of the two-stage detection model Faster R-CNN, ResNet-50 is used as the backbone network. The problem of flaws with extreme aspect ratios is addressed by improving the initialization algorithm of the prior frame aspect ratio, and an improved multi-scale model is designed to improve the detection of small defects. Cascade R-CNN is introduced to improve the accuracy of defect detection, the online hard example mining (OHEM) algorithm is used to strengthen the learning of hard samples and reduce the interference of complex backgrounds, and focal loss is used as the loss function to reduce the impact of sample imbalance. To verify the effectiveness of the improved algorithm, a defect detection comparison experiment was set up. The experimental results show that the accuracy of the proposed defect detection algorithm for patterned fabrics can reach 95.7%, and that it can accurately locate defects and meet the actual needs of the factory.
Keywords: patterned fabrics; defect detection; convolutional neural network (CNN); multi-scale model; cascade network
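How focal loss reduces the influence of easy, over-represented samples can be seen from its binary form, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t). The sketch below is a generic illustration, not the configuration used in the paper: the values gamma = 2.0 and alpha = 0.25 are commonly used defaults and are assumptions here, as is the function name.

```python
import math


def binary_focal_loss(p: float, y: int, gamma: float = 2.0, alpha: float = 0.25) -> float:
    """Focal loss for one sample; p is the predicted probability of class 1."""
    p_t = p if y == 1 else 1.0 - p              # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    # (1 - p_t) ** gamma shrinks the loss of well-classified samples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    easy = binary_focal_loss(0.9, 1)  # confident and correct
    hard = binary_focal_loss(0.1, 1)  # confident and wrong
    print(easy, hard)
```

Setting gamma to 0 and alpha to 1 recovers plain cross-entropy for positive samples, which makes the down-weighting effect of gamma easy to compare.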
A Multi-Scale Network with the Encoder-Decoder Structure for CMR Segmentation (Cited by: 1)
12
Authors: Chaoyang Xia, Jing Peng, Zongqing Ma, Xiaojie Li. Journal of Information Hiding and Privacy Protection, 2019, Issue 3, pp. 109-117 (9 pages)
Cardiomyopathy is one of the most serious public health threats. Precise structural and functional cardiac measurement is an essential step for clinical diagnosis and follow-up treatment planning. Cardiologists are often required to draw endocardial and epicardial contours of the left ventricle (LV) manually during routine clinical diagnosis or treatment planning. This task is time-consuming and error-prone. Therefore, it is necessary to develop a fully automated end-to-end semantic segmentation method for cardiac magnetic resonance (CMR) imaging datasets. However, due to the low image quality and the deformation caused by heartbeat, there is no effective tool for the fully automated end-to-end cardiac segmentation task. In this work, we propose a multi-scale segmentation network (MSSN) for left ventricle segmentation. It can effectively learn myocardium and blood pool structure representations from 2D short-axis CMR image slices in a multi-scale way. Specifically, our method employs both parallel and serial dilated convolution layers with different dilation rates to capture multi-scale semantic features. Moreover, we design graduated up-sampling layers with subpixel layers as the decoder to reconstruct lost spatial information and produce accurate segmentation masks. We validated our method using 164 T1 Mapping CMR images and showed that it outperforms advanced convolutional neural network (CNN) models. In validation, we achieved a Dice Similarity Coefficient (DSC) of 78.96%.
Keywords: cardiac magnetic resonance imaging; multi-scale; semantic segmentation; convolutional neural networks
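The Dice Similarity Coefficient reported above is conventionally defined as twice the overlap between the predicted and reference masks divided by the sum of their sizes. The sketch below shows that definition on small binary masks. The function name, the toy masks, and the choice to return 1.0 when both masks are empty are assumptions for illustration, not details from the paper.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[Sequence[int]], target: Sequence[Sequence[int]]) -> float:
    """Dice = 2 * |pred AND target| / (|pred| + |target|) for binary masks."""
    flat_pred = [v for row in pred for v in row]
    flat_target = [v for row in target for v in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(flat_pred, flat_target) if p and t)
    total = sum(1 for p in flat_pred if p) + sum(1 for t in flat_target if t)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total


if __name__ == "__main__":
    pred = [[1, 1], [0, 0]]
    target = [[1, 0], [0, 0]]
    print(dice_coefficient(pred, target))  # 2 * 1 / (2 + 1)
```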
A Low-Power 12-Bit SAR ADC for Analog Convolutional Kernel of Mixed-Signal CNN Accelerator
13
Authors: Jungyeon Lee, Malik Summair Asghar, HyungWon Kim. Computers, Materials & Continua (SCIE, EI), 2023, Issue 5, pp. 4357-4375 (19 pages)
As deep learning techniques such as Convolutional Neural Networks (CNNs) are widely adopted, the complexity of CNNs is rapidly increasing due to the growing demand for CNN accelerator system-on-chip (SoC). Although conventional CNN accelerators can reduce the computational time of learning and inference tasks, they tend to occupy large chip areas due to many multiply-and-accumulate (MAC) operators when implemented in complex digital circuits, incurring excessive power consumption. To overcome these drawbacks, this work implements an analog convolutional filter consisting of an analog multiply-and-accumulate arithmetic circuit along with an analog-to-digital converter (ADC). This paper introduces the architecture of an analog convolutional kernel composed of low-power ultra-small circuits for neural network accelerator chips. The ADC is an essential component of the analog convolutional kernel, used to convert the analog convolutional result to digital values to be stored in memory. This work presents the implementation of a low-power and area-efficient 12-bit Successive Approximation Register (SAR) ADC. Unlike most other SAR ADCs, which have a differential structure, the proposed ADC employs a single-ended capacitor array to support the preceding single-ended max-pooling circuit with minimal power consumption. The SAR ADC implementation also introduces a unique circuit that reduces kick-back noise to increase performance. It was implemented in a test chip using a 55 nm CMOS process. It demonstrates that the proposed ADC reduces kick-back noise by 40% and consequently improves the ADC's resolution by about 10% while providing a near rail-to-rail dynamic range with significantly lower power consumption than conventional ADCs. The ADC test chip shows a chip size of 4600 μm² with a power consumption of 6.6 μW while providing a signal-to-noise-and-distortion ratio (SNDR) of 68.45 dB, corresponding to an effective number of bits (ENOB) of 11.07 bits.
Keywords: convolution neural networks; split-capacitor-based digital-to-analog converter (DAC); SAR analog-to-digital converter; artificial intelligence; system-on-chip; analog convolutional kernel
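The link between the reported SNDR and ENOB follows the standard relation for an ideal converter driven by a full-scale sine wave, ENOB = (SNDR - 1.76 dB) / 6.02 dB. The short sketch below applies it to the SNDR value quoted in the abstract. The function name is hypothetical, and the abstract does not state that this exact formula was used.

```python
def enob_from_sndr(sndr_db: float) -> float:
    """Effective number of bits from SNDR in dB (ideal full-scale sine assumption)."""
    return (sndr_db - 1.76) / 6.02


if __name__ == "__main__":
    # SNDR value quoted in the abstract above.
    print(round(enob_from_sndr(68.45), 3))
```

The result is within about 0.01 bit of the 11.07 bits stated in the abstract; the small difference is consistent with rounding or truncation.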
A novel multi-resolution network for the open-circuit faults diagnosis of automatic ramming drive system (Cited by: 1)
14
Authors: Liuxuan Wei, Linfang Qian, Manyi Wang, Minghao Tong, Yilin Jiang, Ming Li. Defence Technology(防务技术) (SCIE, EI, CAS, CSCD), 2024, Issue 4, pp. 225-237 (13 pages)
The open-circuit fault is one of the most common faults of the automatic ramming drive system (ARDS), and it can be categorized into the open-phase faults of the Permanent Magnet Synchronous Motor (PMSM) and the open-circuit faults of the Voltage Source Inverter (VSI). The stator current serves as a common indicator for detecting open-circuit faults. Because the stator current changes identically under open-phase faults in the PMSM and under failures of double switches within the same leg of the VSI, this paper utilizes the zero-sequence voltage component as an additional diagnostic criterion to differentiate them. Considering the variable conditions and substantial noise of the ARDS, a novel Multi-resolution Network (Mr Net) is proposed, which can extract multi-resolution perceptual information and enhance robustness to noise. Meanwhile, a feature weighted layer is introduced to allocate higher weights to characteristics situated near the feature frequency. Both simulation and experiment results validate that the proposed fault diagnosis method can diagnose 25 types of open-circuit faults and achieve more than 98.28% diagnostic accuracy. In addition, the experiment results also demonstrate that Mr Net can diagnose the fault types accurately under the interference of noise signals (Laplace noise and Gaussian noise).
Keywords: fault diagnosis; deep learning; multi-scale convolution; open-circuit; convolutional neural network
Clothing Parsing Based on Multi-Scale Fusion and Improved Self-Attention Mechanism
15
Authors: 陈诺, 王绍宇, 陆然, 李文萱, 覃志东, 石秀金. Journal of Donghua University (English Edition) (CAS), 2023, Issue 6, pp. 661-666 (6 pages)
Due to the lack of long-range association and spatial location information, fine details and accurate boundaries of complex clothing images cannot always be obtained by using the existing deep learning-based methods. This paper presents a convolutional structure with multi-scale fusion to optimize the step of clothing feature extraction and a self-attention module to capture long-range association information. The structure enables the self-attention mechanism to directly participate in the process of information exchange through the down-scaling projection operation of the multi-scale framework. In addition, the improved self-attention module introduces the extraction of 2-dimensional relative position information to make up for its limited ability to extract spatial position features from clothing images. The experimental results based on the colorful fashion parsing dataset (CFPD) show that the proposed network structure achieves 53.68% mean intersection over union (mIoU) and has better performance on the clothing parsing task.
Keywords: clothing parsing; convolutional neural network; multi-scale fusion; self-attention mechanism; vision Transformer
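Mean intersection over union averages, across classes, the ratio between the pixels where prediction and reference agree on a class and the pixels where either assigns that class. The sketch below computes it on flattened label maps. It is an illustration only: the function name and toy labels are hypothetical, and skipping classes absent from both maps is one common convention that may differ from the evaluation protocol used in the paper.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Average per-class IoU over classes that appear in pred or target."""
    if len(pred) != len(target):
        raise ValueError("label maps must have the same length")
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # assumed convention: ignore classes absent from both maps
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    pred = [0, 0, 1, 1]
    target = [0, 1, 1, 1]
    # Class 0: 1 / 2, class 1: 2 / 3, so the mean is 7 / 12.
    print(mean_iou(pred, target, num_classes=2))
```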
A Lightweight Network with Dual Encoder and Cross Feature Fusion for Cement Pavement Crack Detection
16
作者 Zhong Qu Guoqing Mu Bin Yuan 《Computer Modeling in Engineering & Sciences》 SCIE EI 2024年第7期255-273,共19页
Automatic crack detection of cement pavement benefits chiefly from the rapid development of deep learning, with convolutional neural networks (CNNs) playing an important role in this field. However, as the performance of crack detection in cement pavement improves, the depth and width of the network structure increase significantly, which requires more computing power and storage space. This limitation hampers the practical deployment of crack detection models on various platforms, particularly portable devices such as small mobile devices. To solve these problems, we propose a dual-encoder-based network architecture that focuses on extracting more comprehensive fracture feature information and combines cross-fusion modules and coordinated attention mechanisms for more efficient feature fusion. First, we use small channel convolution to construct a shallow feature extraction module (SFEM) that extracts low-level feature information of cracks in cement pavement images, in order to obtain more crack information from the shallow features of the images. In addition, we construct a large kernel atrous convolution (LKAC) module to enhance crack information; it incorporates a coordination attention mechanism for filtering non-crack information, and large kernel atrous convolutions with different kernels, which use different receptive fields to extract more detailed edge and context information. Finally, the three-stage feature map outputs from the shallow feature extraction module are cross-fused with the two-stage feature map outputs from the large kernel atrous convolution module, and the shallow features and detailed edge features are fully fused to obtain the final crack prediction map. We evaluate our method on three public crack datasets: DeepCrack, CFD, and Crack500. Experimental results on the DeepCrack dataset demonstrate the effectiveness of our proposed method compared to state-of-the-art crack detection methods: it achieves Precision (P) 87.2%, Recall (R) 87.7%, and F-score (F1) 87.4%. Thanks to our lightweight crack detection model, the parameter count of the model in real-world detection scenarios has been reduced to less than 2M. This advancement also provides technical support for detection in portable scenarios.
Keywords: shallow feature extraction module; large kernel atrous convolution; dual encoder; lightweight network; crack detection
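The F-score reported in the abstract above can be cross-checked from the reported precision and recall. The following is a minimal sketch, assuming the F-score is the standard harmonic mean of precision and recall (the abstract does not state the formula); the function name `f_score` is hypothetical.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Precision and recall reported for the DeepCrack dataset in the abstract.
f1 = f_score(0.872, 0.877)
print(f"F1 = {f1:.1%}")  # F1 = 87.4%
```

Under that assumption the result agrees with the 87.4% stated in the abstract.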
CASL-YOLO: An Improved Fall Detection Algorithm Based on YOLOv8
17
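Authors: Xu Huiying (徐慧英), Zhao Rui (赵蕊), Zhu Xinzhong (朱信忠), Huang Xiao (黄晓). 《Journal of Zhejiang Normal University (Natural Sciences)》 (CAS), 2025, No. 1, pp. 36-44 (9 pages)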
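Falls are extremely harmful to the elderly and are the leading cause of disability and injury-related death among people over 65 in China. However, current mainstream fall detection techniques are strongly affected by the environment: detection accuracy is low in complex scenes such as object occlusion and illumination changes, and the models have high parameter counts and computational costs, which keeps costs high and hinders deployment in real-life scenarios. To address these problems, CASL-YOLO, a lightweight fall detection algorithm for complex environments based on an improved YOLOv8 model, is proposed. First, the model introduces a space-to-depth convolution (SPD-Conv) module in place of the conventional convolution module; by convolving every feature map, it retains all information in the channel dimension, which improves performance on low-resolution images and small objects. Second, an attention mechanism based on position information is introduced to capture cross-channel, direction-aware, and position-aware information, so that human targets are located and recognized more accurately. Finally, a large selective kernel (LSKNet) is introduced into the feature extraction module to adjust the receptive field dynamically, which handles the complex environmental information in fall detection scenes effectively and improves the network's perception ability and detection accuracy. Experimental results show that on the public Human Fall dataset, CASL-YOLO reaches an mAP@0.5 of 96.8%, outperforming the YOLOv8n baseline, with a parameter size of only 3.4 MiB and a computational cost of 11.7×10^9. Compared with other detection algorithms, CASL-YOLO achieves higher accuracy and performance with only a small increase in parameters and computation, while meeting the deployment requirements of real scenarios.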
Keywords: fall detection; YOLOv8; attention mechanism; space-to-depth convolution; large selective kernel
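The space-to-depth step behind the SPD-Conv module mentioned in the abstract above can be illustrated with a short sketch. This is not the authors' implementation: it assumes a single (C, H, W) feature map and a scale of 2, and it shows only the rearrangement, which moves spatial detail into channels instead of discarding it. In the usual SPD-Conv formulation this step is followed by a stride-1 convolution, which is omitted here. The function name `space_to_depth` is hypothetical.

```python
import numpy as np


def space_to_depth(x: np.ndarray, scale: int = 2) -> np.ndarray:
    """Rearrange a (C, H, W) map into (C * scale**2, H // scale, W // scale)."""
    c, h, w = x.shape
    if h % scale or w % scale:
        raise ValueError("H and W must be divisible by scale")
    # Split each spatial axis into (block index, offset inside the block).
    x = x.reshape(c, h // scale, scale, w // scale, scale)
    # Move the two offset axes in front of the channel axis, then merge them
    # with the channels so every pixel survives as a channel entry.
    x = x.transpose(2, 4, 0, 1, 3)
    return x.reshape(c * scale * scale, h // scale, w // scale)


feature_map = np.arange(2 * 4 * 4, dtype=np.float32).reshape(2, 4, 4)
out = space_to_depth(feature_map)
print(out.shape)  # (8, 2, 2)
```

Unlike a stride-2 convolution or pooling, the rearrangement halves the spatial resolution without dropping any values, which is consistent with the abstract's claim about retaining information in the channel dimension.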
Rolling Bearing Life Prediction Based on ASFF-AAKR and CNN-BILSTM
18
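Authors: Zhang Yongchao (张永超), Liu Songshou (刘嵩寿), Chen Yuxi (陈昱锡), Yang Haikun (杨海昆), Chen Qingguang (陈庆光). 《Science Technology and Engineering》 (PKU Core), 2025, No. 2, pp. 567-573 (7 pages)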
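To address the low accuracy of rolling bearing life prediction and the difficulty of constructing health indicators, a bearing remaining useful life prediction model is proposed that combines adaptively spatial feature fusion (ASFF) and auto associative kernel regression (AAKR) with convolutional neural networks (CNN) and a bi-directional long short-term memory network (BILSTM). First, multi-dimensional features are extracted in the time, frequency, and time-frequency domains, and sensitive features are screened by monotonicity and trend. Second, ASFF-AAKR fuses the sensitive features to construct a health indicator. Finally, the health indicator is fed into the CNN and BILSTM to predict the life of the rolling bearing. The results show that the constructed life prediction model outperforms other models, with lower error and higher life prediction accuracy.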
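Keywords: rolling bearing; adaptive feature fusion; auto associative kernel regression; convolutional neural network; bi-directional long short-term memory network; remaining useful life prediction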
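The feature-screening step in the abstract above (selecting sensitive features by monotonicity and trend) can be sketched as follows. The abstract does not give the formulas, so this sketch assumes two definitions commonly used in prognostics: monotonicity as the normalized difference between the counts of positive and negative increments, and trend as the absolute Pearson correlation between the feature and time. The function names and sample series are hypothetical.

```python
import math


def monotonicity(series):
    """|#positive steps - #negative steps| / (n - 1); 1.0 means strictly monotonic."""
    diffs = [b - a for a, b in zip(series, series[1:])]
    pos = sum(d > 0 for d in diffs)
    neg = sum(d < 0 for d in diffs)
    return abs(pos - neg) / len(diffs)


def trendability(series):
    """Absolute Pearson correlation between the series and its time index."""
    n = len(series)
    mean_t = (n - 1) / 2
    mean_x = sum(series) / n
    cov = sum((t - mean_t) * (x - mean_x) for t, x in enumerate(series))
    var_t = sum((t - mean_t) ** 2 for t in range(n))
    var_x = sum((x - mean_x) ** 2 for x in series)
    if var_x == 0:
        return 0.0  # a constant feature carries no trend
    return abs(cov) / math.sqrt(var_t * var_x)


rising = [1.0, 1.2, 1.5, 1.9, 2.4]  # degradation-like feature
noisy = [1.0, 0.8, 1.1, 0.9, 1.0]   # feature with no clear trend
print(monotonicity(rising), monotonicity(noisy))  # 1.0 0.0
```

A feature would be kept as sensitive when both scores exceed thresholds chosen for the dataset; the thresholds are a design choice that the abstract does not specify.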
Land cover classification from remote sensing images based on multi-scale fully convolutional network (cited by: 7)
19
Authors: Rui Li, Shunyi Zheng, Chenxi Duan, Libo Wang, Ce Zhang. 《Geo-Spatial Information Science》 (SCIE, EI, CSCD), 2022, No. 2, pp. 278-294 (17 pages)
Although the Convolutional Neural Network (CNN) has shown great potential for land cover classification, the frequently used single-scale convolution kernel limits the scope of information extraction. Therefore, we propose a Multi-Scale Fully Convolutional Network (MSFCN) with a multi-scale convolutional kernel as well as a Channel Attention Block (CAB) and a Global Pooling Module (GPM) in this paper to exploit discriminative representations from two-dimensional (2D) satellite images. Meanwhile, to explore the ability of the proposed MSFCN on spatio-temporal images, we extend our MSFCN to three dimensions using a three-dimensional (3D) CNN, which can harness each land cover category's time series interaction from the reshaped spatio-temporal remote sensing images. To verify the effectiveness of the proposed MSFCN, we conduct experiments on two spatial datasets and two spatio-temporal datasets. The proposed MSFCN achieves 60.366% on the WHDLD dataset and 75.127% on the GID dataset in terms of the mIoU index, while the figures for the two spatio-temporal datasets are 87.753% and 77.156%. Extensive comparative experiments and ablation studies demonstrate the effectiveness of the proposed MSFCN.
Keywords: spatio-temporal remote sensing images; multi-scale fully convolutional network; land cover classification
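The mIoU index used to report results in the abstract above can be computed from a confusion matrix. The following is a minimal sketch, assuming rows are ground-truth classes and columns are predicted classes; the two-class matrix is made-up illustration data, not taken from the paper, and the function name `mean_iou` is hypothetical.

```python
import numpy as np


def mean_iou(confusion) -> float:
    """Mean intersection-over-union across the classes that occur in the matrix."""
    confusion = np.asarray(confusion, dtype=np.float64)
    intersection = np.diag(confusion)  # correctly labelled pixels per class
    # Union = ground-truth pixels + predicted pixels - pixels counted twice.
    union = confusion.sum(axis=1) + confusion.sum(axis=0) - intersection
    valid = union > 0  # skip classes absent from both truth and prediction
    return float(np.mean(intersection[valid] / union[valid]))


# Hypothetical 2-class confusion matrix (pixel counts).
confusion = [[50, 10],
             [5, 35]]
print(round(mean_iou(confusion), 4))  # 0.7346
```

Here the per-class IoU values are 50/65 and 35/50, and the metric is their average.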
Identification of tomato leaf diseases using convolutional neural network with multi-scale and feature reuse
20
Authors: Peng Li, Nan Zhong, Wei Dong, Meng Zhang, Dantong Yang. 《International Journal of Agricultural and Biological Engineering》 (SCIE), 2023, No. 6, pp. 226-235 (10 pages)
Various diseases seriously affect the quality and yield of tomatoes. Fast and accurate identification of disease types is of great significance for the development of smart agriculture. Many Convolutional Neural Network (CNN) models have been applied to the identification of tomato leaf diseases and have achieved good results. However, some of these come at the cost of long computation times and large storage requirements. This study proposes a lightweight CNN model named MFRCNN, which is built on a multi-scale and feature reuse structure rather than by simply stacking convolution layers. To examine model performance, two types of tomato leaf disease datasets were collected. One is a laboratory-based dataset covering one healthy class and nine disease classes, and the other is a field-based dataset covering five kinds of diseases. The proposed MFRCNN and several popular CNN models (AlexNet, SqueezeNet, VGG16, ResNet18, and GoogLeNet) were then tested on the two datasets. The results show that, compared with these traditional models, the MFRCNN achieved the best performance, with accuracies of 99.01% and 98.75% on the laboratory and field datasets, respectively. The MFRCNN not only had the highest accuracy but also required relatively little computing time and few training parameters. In terms of storage, the MFRCNN model needs only 2.7 MB of space. This work therefore provides a novel solution for plant disease diagnosis, which is of great importance for the development of plant disease diagnosis systems on low-performance terminals.
Keywords: tomato diseases; convolutional neural network; confusion matrix; multi-scale; feature reuse