917 articles found
Remaining Useful Life Prediction of Rail Based on Improved Pulse Separable Convolution Enhanced Transformer Encoder
1
Authors: Zhongmei Wang, Min Li, Jing He, Jianhua Liu, Lin Jia 《Journal of Transportation Technologies》 2024, Issue 2, pp. 137-160 (24 pages)
Accurate prediction of Remaining Useful Life (RUL) is critical in rail prognostics and health management to prevent casualties and economic loss. However, traditional neural networks struggle to capture long-term dependencies when modeling long time series of rail damage, because multi-channel data from multiple sensors are coupled. This paper proposes a novel RUL prediction model with an enhanced pulse separable convolution to address this issue. First, a coding module based on the improved pulse separable convolutional network is established to model the relationships within the data effectively. To enhance the network, an alternate gradient back-propagation method is implemented, and an efficient channel attention (ECA) mechanism is developed to better emphasize useful pulse characteristics. Second, an optimized Transformer encoder is designed to serve as the backbone of the model. It can efficiently capture the relationships within and among the data at each time step of a long, full-life-cycle time series. More importantly, the Transformer encoder is improved by integrating pulse maximum pooling to retain more pulse timing characteristics. Finally, the predicted RUL value is produced from the features of the preceding layer, giving an end-to-end solution. The empirical findings validate the effectiveness of the proposed approach in forecasting rail RUL, where it outperforms several existing data-driven prognostic techniques. The proposed method also generalizes well on the PHM2012 bearing dataset.
Keywords: Equipment Health Prognostics; Remaining Useful Life Prediction; Pulse separable convolution; Attention Mechanism; Transformer Encoder
Coal/Gangue Volume Estimation with Convolutional Neural Network and Separation Based on Predicted Volume and Weight
2
Authors: Zenglun Guan, Murad S. Alfarzaeai, Eryi Hu, Taqiaden Alshmeri, Wang Peng 《Computers, Materials & Continua》 SCIE EI 2024, Issue 4, pp. 279-306 (28 pages)
In the coal mining industry, the gangue separation phase poses a key challenge due to the high visual similarity between coal and gangue. Recently, separation methods have become more intelligent and efficient, using new technologies and applying different features for recognition. One such method exploits the difference in substance density, leading to excellent coal/gangue recognition. Therefore, this study uses density differences to distinguish coal from gangue by performing volume prediction on the samples. The training samples record 3-side images as input, with volume and weight as the ground truth for classification. The prediction process relies on a convolutional neural network (CGVP-CNN) model that receives a 3-side image as input and extracts the features needed to estimate the volume. Classification was compared across ten classifiers, namely K-Nearest Neighbors (KNN), Linear Support Vector Machines (Linear SVM), Radial Basis Function (RBF) SVM, Gaussian Process, Decision Tree, Random Forest, Multi-Layer Perceptron (MLP), Adaptive Boosting (AdaBoost), Naive Bayes, and Quadratic Discriminant Analysis (QDA). After several experiments on testing and training data, the classification accuracies were 100%, 92%, 95%, 96%, 100%, 100%, 100%, 96%, 81%, and 92%, respectively. The test reveals the best timing with KNN, which maintained an accuracy of 100%. Because assessing generalization to new data is essential to ensure the efficiency of the model, generalization was measured with a cross-validation experiment. The dataset was split by volume value to test generalization not only on new images of the same volume but also on volumes outside the trained range. The predicted volume values were then passed to the group of classifiers, whose reported accuracies were 100%, 100%, 100%, 98%, 88%, 87%, 100%, 87%, 97%, and 100%, respectively. Although high classification accuracy is the main motive, this work also achieves a remarkable reduction in data preprocessing time compared with related works: the CGVP-CNN model reduced the preprocessing time of previous works to 0.017 s while maintaining high classification accuracy using the estimated volume.
Keywords: COAL; coal gangue; convolutional neural network; CNN; object classification; volume estimation; separation system
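The density idea in the abstract above can be illustrated with a small Python sketch. It is not the paper's pipeline: the abstract feeds the predicted volume and the weight to ten trained classifiers, whereas this sketch swaps in a single density threshold. The function names and the 1.8 g/cm^3 threshold are hypothetical, chosen only on the assumption that gangue is denser than coal.

```python
def density(weight_g: float, volume_cm3: float) -> float:
    """Return density in g/cm^3 from a measured weight and an estimated volume."""
    if volume_cm3 <= 0:
        raise ValueError("volume must be positive")
    return weight_g / volume_cm3


def classify_by_density(weight_g: float,
                        predicted_volume_cm3: float,
                        threshold_g_per_cm3: float = 1.8) -> str:
    """Label a sample 'gangue' if its density reaches the threshold, else 'coal'.

    The threshold is an illustrative assumption, not a value from the paper.
    In the paper, predicted_volume_cm3 would come from the CGVP-CNN model.
    """
    if density(weight_g, predicted_volume_cm3) >= threshold_g_per_cm3:
        return "gangue"
    return "coal"


# Two made-up samples with the same predicted volume but different weights.
print(classify_by_density(140.0, 100.0))
print(classify_by_density(250.0, 100.0))
```

A real system would replace the fixed threshold with a classifier trained on (volume, weight) pairs, as the abstract describes.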
MSSTNet: Multi-scale facial videos pulse extraction network based on separable spatiotemporal convolution and dimension separable attention
3
Authors: Changchen ZHAO, Hongsheng WANG, Yuanjing FENG 《Virtual Reality & Intelligent Hardware》 2023, Issue 2, pp. 124-141 (18 pages)
Background: The use of remote photoplethysmography (rPPG) to estimate blood volume pulse in a noncontact manner has been an active research topic in recent years. Existing methods are primarily based on a single-scale region of interest (ROI). However, some noise signals that are not easily separated in a single-scale space can be easily separated in a multi-scale space. In addition, existing spatiotemporal networks mainly focus on local spatiotemporal information and do not emphasize temporal information, which is crucial in pulse extraction problems, resulting in insufficient spatiotemporal feature modelling. Methods: We propose a multi-scale facial video pulse extraction network based on separable spatiotemporal convolution (SSTC) and dimension separable attention (DSAT). First, to solve the problem of a single-scale ROI, we constructed a multi-scale feature space for initial signal separation. Second, SSTC and DSAT were designed for efficient spatiotemporal correlation modeling, which increased the information interaction between the long-span time and space dimensions and placed more emphasis on temporal features. Results: The signal-to-noise ratio (SNR) of the proposed network reached 9.58 dB on the PURE dataset and 6.77 dB on the UBFC-rPPG dataset, outperforming state-of-the-art algorithms. Conclusions: Fusing multi-scale signals yielded better results than methods based on only single-scale signals. The proposed SSTC and dimension-separable attention mechanism will contribute to more accurate pulse signal extraction.
Keywords: Remote photoplethysmography; Heart rate; separable spatiotemporal convolution; Dimension separable attention; MULTI-SCALE; Neural network
SepFE: Separable Fusion Enhanced Network for Retinal Vessel Segmentation (Cited by: 2)
4
Authors: Yun Wu, Ge Jiao, Jiahao Liu 《Computer Modeling in Engineering & Sciences》 SCIE EI 2023, Issue 9, pp. 2465-2485 (21 pages)
The accurate and automatic segmentation of retinal vessels from fundus images is critical for the early diagnosis and prevention of many eye diseases, such as diabetic retinopathy (DR). Existing retinal vessel segmentation approaches based on convolutional neural networks (CNNs) have achieved remarkable effectiveness. Here, we present a retinal vessel segmentation model with low complexity and high performance based on U-Net, one of the most popular architectures. Given the strong results of depth-wise separable convolution, we introduce it to replace the standard convolutional layer. The complexity of the proposed model is reduced by decreasing the number of parameters and calculations required. To ensure performance while removing redundant parameters, we integrate the pre-trained MobileNet V2 into the encoder. Then, a feature fusion residual module (FFRM) is designed to exploit complementary strengths by enhancing the effective fusion between adjacent levels, which alleviates the extraneous clutter introduced by direct fusion. Finally, we provide detailed comparisons between the proposed SepFE and U-Net on three mainstream retinal image datasets (DRIVE, STARE, and CHASEDB1). The results show that SepFE has only 3% of the parameters and 8% of the FLOPs of U-Net, while obtaining better segmentation performance. The superiority of SepFE is further demonstrated through comparisons with other advanced methods.
Keywords: Retinal vessel segmentation; U-Net; depth-wise separable convolution; feature fusion
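Why replacing standard convolutions with depth-wise separable ones shrinks a model, as the abstract above claims, can be seen by counting weights. The Python sketch below uses the standard textbook counts (bias terms ignored); the function names and the 3x3, 64-to-128-channel example layer are illustrative assumptions and are not taken from SepFE.

```python
def standard_conv_params(kernel_size: int, c_in: int, c_out: int) -> int:
    """Weights in a standard 2-D convolution (no bias): k * k * C_in * C_out."""
    return kernel_size * kernel_size * c_in * c_out


def depthwise_separable_params(kernel_size: int, c_in: int, c_out: int) -> int:
    """Weights in a depthwise k x k conv plus a pointwise 1 x 1 conv (no bias)."""
    depthwise = kernel_size * kernel_size * c_in  # one k x k filter per input channel
    pointwise = c_in * c_out                      # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer: 3 x 3 kernel, 64 input channels, 128 output channels.
std = standard_conv_params(3, 64, 128)
sep = depthwise_separable_params(3, 64, 128)
print(std, sep, sep / std)
```

The ratio equals 1/C_out + 1/k^2, so for 3x3 kernels the saving approaches a factor of nine as the number of output channels grows.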
A Lightweight Convolutional Neural Network with Hierarchical Multi-Scale Feature Fusion for Image Classification
5
Authors: Adama Dembele, Ronald Waweru Mwangi, Ananda Omutokoh Kube 《Journal of Computer and Communications》 2024, Issue 2, pp. 173-200 (28 pages)
Convolutional neural networks (CNNs) are widely used in image classification tasks, but their increasing model size and computation make them challenging to implement on embedded systems with constrained hardware resources. To address this issue, the MobileNetV1 network was developed, which employs depthwise convolution to reduce network complexity. MobileNetV1 employs a stride of 2 in several convolutional layers to decrease the spatial resolution of feature maps, thereby lowering computational costs. However, this stride setting can lead to a loss of spatial information, particularly affecting the detection and representation of smaller objects or finer details in images. To balance complexity against model performance, a lightweight convolutional neural network with hierarchical multi-scale feature fusion based on the MobileNetV1 network is proposed. The network consists of two main subnetworks. The first subnetwork uses a depthwise dilated separable convolution (DDSC) layer to learn imaging features with fewer parameters, which results in a lightweight and computationally inexpensive network. Furthermore, the depthwise dilated convolution in the DDSC layer effectively expands the field of view of the filters, allowing them to incorporate a larger context. The second subnetwork is a hierarchical multi-scale feature fusion (HMFF) module that uses a parallel multi-resolution branch architecture to process the input feature map and extract multi-scale feature information from the input image.
Experimental results on the CIFAR-10, Malaria, and KvasirV1 datasets demonstrate that the proposed method is efficient, reducing the network parameters and computational cost by 65.02% and 39.78%, respectively, while maintaining network performance compared with the MobileNetV1 baseline.
Keywords: MobileNet; Image Classification; Lightweight convolutional Neural Network; Depthwise Dilated separable convolution; Hierarchical Multi-Scale Feature Fusion
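The claim above that dilation "expands the field of view of filters" without adding parameters follows from the usual effective-kernel-size formula. The Python sketch below illustrates that formula; the function names and the example dilation rates are assumptions for illustration and are not taken from the DDSC layer in the paper.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def stacked_receptive_field(layers) -> int:
    """Receptive field of stacked stride-1 convolutions.

    layers is a list of (kernel_size, dilation) pairs; each layer widens
    the receptive field by (effective kernel size - 1).
    """
    field = 1
    for kernel_size, dilation in layers:
        field += effective_kernel_size(kernel_size, dilation) - 1
    return field


# A 3 x 3 kernel always has 9 weights per channel, whatever the dilation,
# but the span it covers grows with the dilation rate.
for d in (1, 2, 4):
    print(d, effective_kernel_size(3, d))
print(stacked_receptive_field([(3, 1), (3, 2), (3, 4)]))
```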
Validation Research on the Application of Depthwise Separable Convolutional AI Facial Expression Recognition in Non-pharmacological Treatment of BPSD
6
Author: Xiangyu Liu 《Journal of Clinical and Nursing Research》 2021, Issue 4, pp. 31-37 (7 pages)
Among the most obvious clinical manifestations of dementia, or of the Behavioral and Psychological Symptoms of Dementia (BPSD), are a lack of emotional expression, an increased frequency of negative emotions, and the impermanence of emotions. Observing the reduction of BPSD in dementia through emotions can be considered effective and is widely used in the field of non-pharmacological therapy. This article verifies whether an image recognition artificial intelligence (AI) system can correctly reflect the emotional performance of elderly people with dementia, through a questionnaire survey of three professional elderly-care nursing staff. ANOVA (sig. = 0.50) is used to determine that the judgments given by the nursing staff show no obvious deviation, and then Kendall's test (0.722**) and Spearman's test (0.863**) are used to verify that the emotion recognition system and the nursing staff agree in their judgments of severity. This implies the usability of the tool. It can also be expected to find further application in research on emotion detection in elderly people with BPSD.
Keywords: depth-wise separable convolution; EMOTION; BPSD; DEMENTIA; Nursing
A WEIGHTED GENERAL DISCRETE FOURIER TRANSFORM FOR THE FREQUENCY-DOMAIN BLIND SOURCE SEPARATION OF CONVOLUTIVE MIXTURES (Cited by: 1)
7
Authors: Wang Chao, Fang Yong, Feng Jiuchao 《Journal of Electronics(China)》 2008, Issue 6, pp. 830-833 (4 pages)
This letter deals with the frequency-domain Blind Source Separation of Convolutive Mixtures (CMBSS). From the frequency representation of the "overlap and save" method, a Weighted General Discrete Fourier Transform (WGDFT) is derived to replace the traditional Discrete Fourier Transform (DFT). The mixing matrix on each frequency bin can be estimated more precisely from WGDFT coefficients than from DFT coefficients, which improves separation performance. Simulation results verify the validity of WGDFT for frequency-domain blind source separation of convolutive mixtures.
Keywords: Blind Source separation of convolutive Mixtures (CMBSS); Frequency representation of overlap and save; Weighted General Discrete Fourier Transform (WGDFT)
Maximum Likelihood Blind Separation of Convolutively Mixed Discrete Sources
8
Authors: 辜方林, 张杭, 朱德生 《China Communications》 SCIE CSCD 2013, Issue 6, pp. 60-67 (8 pages)
In this paper, a Maximum Likelihood (ML) approach, implemented by the Expectation-Maximization (EM) algorithm, is proposed for the blind separation of convolutively mixed discrete sources. To carry out the expectation step of the EM algorithm with a lower computational load, an algorithm named Iterative Maximum Likelihood (IML) is proposed to calculate the likelihood and recover the source signals. An important feature of the ML approach is its robust performance in noisy environments, obtained by treating the covariance matrix of the additive Gaussian noise as a parameter. Another striking feature is that it can separate more sources than sensors by exploiting the finite alphabet property of the sources. Simulation results show that the proposed ML approach works well in both determined and underdetermined mixtures. Furthermore, the performance of the proposed ML algorithm is close to the performance obtained with perfect knowledge of the channel filters.
Keywords: Blind Source separation; convolutive mixture; EM; Finite Alphabet
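For readers unfamiliar with the term, a convolutive mixture is usually modeled as each sensor observing a sum of filtered sources, x_i[n] = sum_j sum_k h_ij[k] s_j[n - k]. The Python sketch below builds such a mixture from finite-alphabet (+1/-1) sources. It shows only the forward mixing model, not the ML/EM separation method of the paper; the filter taps, source values, and function names are made up for illustration.

```python
def fir_filter(h, s):
    """Full linear convolution of filter taps h with signal s."""
    out = [0.0] * (len(h) + len(s) - 1)
    for k, h_k in enumerate(h):
        for n, s_n in enumerate(s):
            out[k + n] += h_k * s_n
    return out


def convolutive_mix(filters, sources):
    """Mix sources through FIR channels.

    filters[i][j] holds the taps of the channel from source j to sensor i.
    Returns one mixture per sensor, zero-padded to a common length.
    """
    length = max(len(filters[i][j]) + len(sources[j]) - 1
                 for i in range(len(filters))
                 for j in range(len(sources)))
    mixtures = []
    for sensor_filters in filters:
        x_i = [0.0] * length
        for h_ij, s_j in zip(sensor_filters, sources):
            for n, value in enumerate(fir_filter(h_ij, s_j)):
                x_i[n] += value
        mixtures.append(x_i)
    return mixtures


# Two binary (+1/-1) sources and a hypothetical 2 x 2 set of short channels.
sources = [[1.0, -1.0, 1.0], [1.0, 1.0, -1.0]]
filters = [[[1.0, 0.5], [0.3]],
           [[0.2], [1.0, -0.5]]]
mixtures = convolutive_mix(filters, sources)
print(mixtures)
```

Blind separation is the inverse problem: recovering the sources from the mixtures alone, without knowing the filters.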
AN NMF ALGORITHM FOR BLIND SEPARATION OF CONVOLUTIVE MIXED SOURCE SIGNALS WITH LEAST CORRELATION CONSTRAINS
9
Authors: Zhang Ye, Fang Yong 《Journal of Electronics(China)》 2009, Issue 4, pp. 557-563 (7 pages)
Most existing algorithms for blind source separation are limited by the assumption that the sources are statistically independent. However, in many practical applications, the source signals are nonnegative and mutually statistically dependent. When the observations are nonnegative linear combinations of nonnegative sources, the correlation coefficients of the observations are larger than those of the source signals. In this letter, a novel Nonnegative Matrix Factorization (NMF) algorithm with least correlated component constraints is proposed for the blind separation of convolutive mixed sources. The algorithm relaxes the source independence assumption and has low-complexity algebraic computations. Simulation results on blind source separation, including real face image data, indicate that the sources can be successfully recovered with the algorithm.
Keywords: Nonnegative matrix factorization; convolutive blind source separation; Correlation constrain
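Research on a Parking Slot Detection Algorithm Based on Semantic Segmentation
10
Authors: 李伟东, 李冰, 朱旭浩, 李乐 《大连理工大学学报》 CAS CSCD PKU Core 2024, Issue 1, pp. 96-103 (8 pages)
As a crucial part of an automatic parking system, the parking slot detection algorithm directly determines the quality of the system through its accuracy. Current parking slot detection algorithms based on semantic segmentation have two main problems: first, the segmentation network has a large number of parameters, which makes deployment on mobile devices difficult; second, the post-processing extraction algorithm is complex, which makes real-time detection difficult. To address these two problems, a parking slot detection algorithm is designed that obtains parking slots by detecting parking slot lines. A parking-line segmentation network, UFAC-Net, is designed by combining depthwise separable convolution with asymmetric convolution, and a more concise parking-line extraction algorithm is proposed. Experimental results show that the UFAC-Net model (UFAC-Net2) achieves a mean pixel accuracy of 83.07% and a mean intersection over union of 73.05% with a model parameter size of 3.1 MB, the best segmentation accuracy so far on PSV datasets. The parking slot detection algorithm can detect parallel, perpendicular, and slanted parking slots under complex conditions; on a custom test set it achieves a precision of 99.23%, a recall of 99.12%, and a detection time of 32.2 ms per image, showing good detection performance.
Keywords: parking slot detection; semantic segmentation; depthwise separable convolution; asymmetric convolution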
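A Chicken Detection Algorithm Based on Improved SegNet (Cited by: 1)
11
Authors: 吉训生, 孙贝贝, 夏圣奎 《计算机工程与设计》 PKU Core 2024, Issue 1, pp. 102-109 (8 pages)
To detect dead chickens in chicken farms intelligently, a chicken detection algorithm based on an improved semantic segmentation model, AT-SegNet, is proposed. Building on the symmetric encoder-decoder structure of SegNet, dilated convolution is used to aggregate contextual information from different receptive fields before decoding, and a three-scale attention cascade fusion module is designed and embedded in parallel between the encoder and decoder to enrich the information available to the decoder. Multi-layer depthwise separable convolutions replace standard convolutions to extract deep semantic information, reducing computation and improving real-time performance. The state of each chicken is determined by comparing the intersection over union of the flock image segmentation results against a threshold. Experimental results show that the improved AT-SegNet raises detection accuracy by 25.17% over the original algorithm and can find dead chickens accurately and efficiently in complex flock environments.
Keywords: deep learning; chicken detection; semantic segmentation; encoder-decoder structure; attention mechanism; soft pooling; depthwise separable convolution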
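The abstract above decides a chicken's state by comparing an intersection over union (IoU) of segmentation results against a threshold. The Python sketch below shows one way such a rule could look. It rests on assumptions the abstract does not confirm: that the IoU is taken between masks of the same bird in two frames, that a high IoU means no movement, and that 0.9 is a sensible threshold. All names are hypothetical.

```python
def mask_iou(mask_a, mask_b) -> float:
    """IoU of two equally sized binary masks given as nested lists of 0/1."""
    intersection = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            intersection += 1 if (a and b) else 0
            union += 1 if (a or b) else 0
    return intersection / union if union else 0.0


def looks_motionless(mask_earlier, mask_later, iou_threshold: float = 0.9) -> bool:
    """Flag a bird as motionless when its masks in two frames overlap heavily.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    return mask_iou(mask_earlier, mask_later) >= iou_threshold


# The bird shifts one pixel to the right between the two frames.
frame_1 = [[1, 1, 0],
           [1, 1, 0]]
frame_2 = [[0, 1, 1],
           [0, 1, 1]]
print(mask_iou(frame_1, frame_2))
print(looks_motionless(frame_1, frame_1))
```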
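A Foggy-Scene Object Detection Method Based on YOLOX_s
12
Authors: 娄铮铮, 张欣, 胡世哲, 吴云鹏 《计算机科学》 CSCD PKU Core 2024, Issue 7, pp. 206-213 (8 pages)
This paper proposes a foggy-scene object detection model based on depthwise separable convolution and an attention mechanism, aiming at fast and accurate detection of objects in foggy scenes. The model consists of a dehazing module and a detection module, which are trained jointly. To ensure accuracy and real-time performance in foggy scenes, the dehazing module uses AODNet to dehaze the input image and reduce the interference of fog with the objects to be detected, and the detection module uses an improved YOLOX_s model to output the classification confidence and position coordinates of each object. To improve detection performance, depthwise separable convolution and an attention mechanism are added to YOLOX_s to strengthen feature extraction and enlarge the receptive field of the feature maps. The proposed model improves detection accuracy in foggy scenes without increasing the number of parameters or the amount of computation. Experimental results show that the proposed model performs well on both the RTTS dataset and a synthetic foggy object detection dataset, effectively improving detection accuracy in foggy scenes. Compared with the baseline model, the mean average precision (mAP@50_95) improves by 1.9% and 2.37%, respectively.
Keywords: foggy scenes; object detection; image dehazing; depthwise separable convolution; attention mechanism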
Recognition of Bird Species of Yunnan Based on Improved ResNet18
13
Authors: Wei Yang, Ivy Kim D. Machica 《Intelligent Automation & Soft Computing》 2024, Issue 5, pp. 889-905 (17 pages)
Birds play a crucial role in maintaining ecological balance, making bird recognition technology a hot research topic. Traditional recognition methods have not achieved high accuracy in bird identification. This paper proposes an improved ResNet18 model to enhance the recognition rate of local bird species in Yunnan. First, a dataset containing five species of local birds in Yunnan was established: C. amherstiae, T. caboti, Syrmaticus humiae, Polyplectron bicalcaratum, and Pucrasia macrolopha. The improved ResNet18 model was then used to identify these species. The method replaces traditional convolution with depthwise separable convolution and introduces an SE (Squeeze-and-Excitation) module to improve the model's efficiency and accuracy. Compared with the traditional ResNet18 model, the improved model excels as a wild bird classification solution, significantly reducing computational overhead and accelerating model training on low-power, lightweight hardware. Experimental analysis shows that the improved ResNet18 model achieved an accuracy of 98.57%, compared with 98.26% for the traditional Residual Network 18 layers (ResNet18) model.
Keywords: Bird species recognition; ResNet18; depth-wise separable convolutions
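A Lightweight Malicious Code Detection Architecture Based on ViT
14
Authors: 黄保华, 杨婵娟, 熊宇, 庞飔 《信息网络安全》 CSCD PKU Core 2024, Issue 9, pp. 1409-1421 (13 pages)
With the rapid development of the information society, malicious code variants are multiplying and challenging existing detection methods. To improve the accuracy and efficiency of detecting malicious code variants, this paper proposes a new hybrid architecture, FasterMalViT. The architecture improves ViT by incorporating a partial convolution structure, significantly boosting its performance in malicious code detection. To counter the parameter growth caused by introducing convolution operations, separable self-attention replaces traditional multi-head attention, effectively reducing the parameter count and the computational cost. To address the imbalanced distribution of sample classes in malicious code datasets, a class-balanced focal loss function is introduced, which guides the model to pay more attention to classes with fewer samples during training and thus improves performance on hard-to-classify classes. Experimental results on the Microsoft BIG, Malimg, and MalwareBazaar datasets show that FasterMalViT has good detection performance and generalization ability.
Keywords: malicious code; ViT; partial convolution; separable self-attention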
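A Road Object Detection Algorithm for Autonomous Driving Scenarios Based on Improved YOLOv5s (Cited by: 1)
15
Authors: 胡丹丹, 张忠婷 《智能系统学报》 CSCD PKU Core 2024, Issue 3, pp. 653-660 (8 pages)
When vehicles, pedestrians, bicycles, and other objects are detected in complex road scenes, multi-scale objects and partial occlusion easily cause missed and false detections. To address this, a road object detection algorithm for autonomous driving scenarios based on improved YOLOv5s is proposed. First, depthwise separable convolution replaces some ordinary convolutions, reducing the number of model parameters to increase detection speed. Second, RFB-s, an improvement based on the receptive field block (RFB), is introduced into the feature fusion network; by imitating human visual perception, it enlarges the effective receptive field of the feature maps and improves both the network's feature representation and the distinguishability of object features. Finally, adaptively spatial feature fusion (ASFF) is used to improve PANet's fusion of multi-scale features. Experimental results show that on the PASCAL VOC dataset the mean average precision of the proposed algorithm is 1.71 percentage points higher than that of YOLOv5s, reaching 84.01%. While meeting the real-time requirements of autonomous vehicles, it reduces false and missed detections to some extent and effectively improves detection performance in complex driving scenarios.
Keywords: YOLOv5s; autonomous driving; object detection algorithm; depthwise separable convolution; receptive field block; adaptively spatial feature fusion; PANet; multi-scale feature fusion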
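A Lightweight Detection Method for Small-Target Crack Defects
16
Authors: 贾晓芬, 江再亮, 赵佰亭 《湖南大学学报(自然科学版)》 EI CAS CSCD PKU Core 2024, Issue 6, pp. 52-62 (11 pages)
Capturing tiny cracks in a shaft wall promptly and accurately is of great significance for shaft safety. A lightweight detection model is key to advancing automatic shaft wall crack detection. Breaking with the focus of existing methods on extracting deep semantic information, and emphasizing the geometric structure information represented by shallow features, a lightweight detection model for shaft wall cracks, E-YOLOv5s, is proposed. First, ordinary convolution, depthwise separable convolution, and the ECA attention mechanism are fused to design a lightweight convolution module, ECAConv; skip connections are then introduced to build the comprehensive feature extraction unit E-C3, yielding the backbone network ECSP-Darknet53, which significantly reduces network parameters while strengthening the extraction of deep crack features. Next, a feature fusion module, ECACSP, is designed, and multiple ECAConv and ECACSP modules are combined into the slim-neck feature fusion module E-Neck, which aims to fully fuse the geometric information of small crack targets with the semantic information characterizing the degree of cracking while accelerating network inference. Experiments show that on a self-built shaft wall dataset, E-YOLOv5s improves detection accuracy by 4.0% over YOLOv5s while reducing model parameters and GFLOPs by 44.9% and 43.7%, respectively. E-YOLOv5s helps advance the application of automatic shaft wall crack detection.
Keywords: crack defects; small targets; deep learning; depthwise separable convolution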
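An Image Segmentation Method for Steel Structure Surface Corrosion Using an Improved U_Net Network
17
Authors: 陈法法, 董海飞, 何向阳, 陈保家 《电子测量与仪器学报》 CSCD PKU Core 2024, Issue 2, pp. 49-57 (9 pages)
To make the corrosion image segmentation network lightweight while eliminating interference from non-uniform feature backgrounds and from similar-feature backgrounds such as rust liquid, this paper replaces the encoder of the U_Net model with the MobilenetV3_Large network, loads MobilenetV3_Large weights pre-trained on the ImageNet dataset, replaces the ordinary convolutions in the U_Net decoder with depthwise separable residual convolutions, and adds an attention-guided (AG) module and a Dropout mechanism to the upsampling path. Experiments verify that, under interference from non-uniform feature backgrounds and similar-feature backgrounds such as rust liquid, the improved U_Net model has a clear advantage in corrosion image segmentation. Compared with the original U_Net model, the model size is reduced by 81.18%, the floating-point computation is reduced by 98.34%, and detection efficiency improves 3.27-fold, from under 6 fps to 19 fps. While the model becomes lightweight, its accuracy reaches 95.54%, which is 5.04% higher than that of the original U_Net model.
Keywords: corrosion region segmentation; MobilenetV3; U_Net; attention guidance; depthwise separable residual convolution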
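An Audio-Visual Fusion Speech Separation Method Based on Dilated Convolution and Transformer
18
Authors: 刘宏清, 谢奇洲, 赵宇, 周翊 《信号处理》 CSCD PKU Core 2024, Issue 7, pp. 1208-1217 (10 pages)
To improve speech separation, visual signals can be used as auxiliary information in addition to the mixed speech signal. This multimodal modeling approach, which fuses visual and audio signals, has been shown to improve speech separation performance effectively and opens new possibilities for the task. To better capture long-term dependencies in visual and audio features and to strengthen the network's understanding of the input context, this paper proposes a time-domain audio-visual fusion speech separation model based on one-dimensional dilated convolution and Transformer. Traditional frequency-domain audio-visual fusion speech separation is carried over to the time domain, which avoids the information loss and phase reconstruction problems caused by time-frequency transformation. The proposed network architecture contains four modules: a visual feature extraction network, which extracts lip embedding features from video frames; an audio encoder, which converts the mixed speech into a feature representation; a multimodal separation network, composed mainly of an audio subnetwork, a video subnetwork, and a Transformer network, which separates speech using the visual and audio features; and an audio decoder, which restores the separated features to clean speech. The experiments use a dataset of two-speaker mixed speech generated from the LRS2 dataset. Experimental results show that the proposed network reaches 14.0 dB and 14.3 dB on Scale-Invariant Signal-to-Noise Ratio Improvement (SISNRi) and Signal-to-Distortion Ratio Improvement (SDRi), respectively, a clear improvement over audio-only separation models and general audio-visual fusion separation models.
Keywords: speech separation; audio-visual fusion; multi-head self-attention mechanism; dilated convolution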
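The SISNRi metric reported above can be computed from waveforms alone. The Python sketch below follows the commonly used definition of scale-invariant SNR (zero-mean signals, projection of the estimate onto the target); it is an illustration, not the paper's evaluation code, and the function names, the eps constant, and the toy signals are assumptions.

```python
import math


def si_snr(estimate, target, eps: float = 1e-8) -> float:
    """Scale-invariant signal-to-noise ratio in dB for two equal-length signals."""
    mean_e = sum(estimate) / len(estimate)
    mean_t = sum(target) / len(target)
    e = [x - mean_e for x in estimate]  # remove DC offset from the estimate
    t = [x - mean_t for x in target]    # remove DC offset from the target
    dot = sum(a * b for a, b in zip(e, t))
    target_energy = sum(b * b for b in t) + eps
    # Component of the estimate that lies along the target.
    s_target = [dot / target_energy * b for b in t]
    # Everything else counts as noise.
    e_noise = [a - s for a, s in zip(e, s_target)]
    signal_power = sum(s * s for s in s_target)
    noise_power = sum(n * n for n in e_noise) + eps
    return 10.0 * math.log10(signal_power / noise_power + eps)


def si_snr_improvement(estimate, mixture, target) -> float:
    """SI-SNRi: gain of the separated estimate over the unprocessed mixture."""
    return si_snr(estimate, target) - si_snr(mixture, target)


# Toy signals: the estimate is closer to the target than the mixture is.
target = [1.0, -1.0, 1.0, -1.0]
mixture = [1.5, -0.2, 0.1, -1.4]
estimate = [0.9, -1.1, 1.2, -0.8]
print(si_snr(estimate, target))
print(si_snr_improvement(estimate, mixture, target))
```

Because the estimate is projected onto the target before the ratio is taken, rescaling the estimate leaves the score essentially unchanged, which is what "scale-invariant" refers to.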
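CSD-YOLOv8s: A Dense Small-Target Sheep Detection Model Based on UAV Images
19
Authors: 翁智, 刘海鑫, 郑志强 《智慧农业(中英文)》 CSCD 2024, Issue 4, pp. 42-52 (11 pages)
[Objective/Significance] Accurately detecting the number of grazing livestock on natural pastures is key to upgrading large-scale farms. To meet the need of large-scale farms for accurate, real-time detection of large flocks of sheep, a high-accuracy, easy-to-deploy small-target detection model, CSD-YOLOv8s (CBAM SPPFCSPC DSConv-YOLOv8s), is proposed for real-time detection of individual small-target sheep from a high-altitude UAV viewpoint. [Methods] First, a UAV was used to capture video of sheep flocks on natural grassland pastures under different backgrounds and lighting conditions; together with some downloaded public datasets, these formed the raw image data. A sheep detection dataset was generated through data cleaning and annotation. Second, to address the difficulty of detecting sheep caused by dense flocks and mutual occlusion, an SPPFCSPC (Spatial Pyramid Pooling Fast-CSPC) module with cross-stage partial connections was built on the YOLO (You Only Look Once) v8 model, improving the network's feature extraction and feature fusion and strengthening detection of small-target sheep. A Convolutional Block Attention Module (CBAM) was introduced into the Neck of the model to strengthen the network's resistance to interference along both the channel and spatial dimensions, improve its suppression of complex backgrounds, and further improve detection of dense flocks. Finally, to improve real-time performance and deployability, the standard convolutions of the Neck network were replaced with the C2f_DS (C2f-DSConv) module, a lightweight convolution with a changeable kernel, which reduced the number of model parameters and increased detection speed. [Results and Discussion] Compared with YOLO, Faster R-CNN (Faster Regions with Convolutional Neural Networks), and other classic network models, the improved CSD-YOLOv8s model achieved higher detection accuracy in the sheep detection task at comparable detection speed and model size. Precision reached 95.2%, mAP reached 93.1%, and FPS (Frames Per Second) reached 87 f/s, and the model was robust to sheep targets under different degrees of occlusion, effectively solving the severe missed and false detections caused in UAV detection tasks by small sheep targets, heavy background noise, and high density. Validation on public datasets showed that the proposed model improved detection accuracy for other kinds of objects as well; for sheep detection in particular, detection accuracy improved by 9.7%. [Conclusions] The proposed CSD-YOLOv8s detects grazing livestock on grasslands more accurately in UAV images, detects targets accurately under different degrees of aggregation and occlusion, and has good real-time performance. It provides technical support for large-scale livestock and poultry detection on farms and has broad application potential.
Keywords: sheep detection; YOLOv8; small targets; SPPFCSPC; attention mechanism; depthwise separable convolution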
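Pipeline Leak Aperture Identification Based on a Dynamic Depthwise Separable Convolutional Neural Network
20
Authors: 王秀芳, 刘源, 李月明 《中国石油大学学报(自然科学版)》 EI CAS CSCD PKU Core 2024, Issue 5, pp. 183-189 (7 pages)
To address the high structural complexity, parameter count, and computational cost that traditional models incur in pursuit of higher pipeline leak detection accuracy, a pipeline leak aperture identification method based on a dynamic depthwise separable convolutional neural network is proposed. The dynamic convolution layer passes the extracted leak signal features through channel attention weight calculation and dynamic weight fusion; the dynamic depthwise separable convolution layer provides stronger feature representation; a global average pooling layer reduces the parameters of the network model; and a fully connected layer identifies the pipeline leak aperture. The results show that the new method achieves high identification accuracy, overcomes the high resource overhead and power consumption of traditional models, shortens model training time, and speeds up leak aperture identification, so it can be used to monitor the degree of pipeline leakage in industry.
Keywords: leak aperture identification; dynamic depthwise separable convolution; lightweight network; dynamic convolution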