34,064 articles found
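1. Iris-Face Multi-Feature Fusion Recognition Based on Vision Transformer
Authors: 马滔, 陈睿, 张博 《中国新技术新产品》 2024, No. 18, pp. 8-10 (3 pages)
To improve the accuracy and robustness of biometric recognition systems, this paper studies a computer-vision-based method for iris-face multi-feature fusion recognition. The iris region is extracted from facial images and preprocessed with contrast enhancement and normalization, which makes feature extraction more consistent and improves image quality. To obtain rich deep features, a Vision Transformer model extracts features from the preprocessed iris and face images. A multi-head attention mechanism fuses the multimodal iris and face features, and a fully connected layer performs classification. Experimental results show that the method performs well and significantly improves recognition accuracy.
Keywords: computer vision; Vision Transformer; multi-feature fusion; iris recognition; face recognition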
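2. Dual-Path Vision Transformer for Computer-Aided Diagnosis of Acute Ischemic Stroke
Authors: 张桃红, 郭学强, 郑瀚, 罗继昌, 王韬, 焦力群, 唐安莹 《电子科技大学学报》 EI CAS CSCD PKU Core 2024, No. 2, pp. 307-314 (8 pages)
Acute ischemic stroke is brain dysfunction caused by an impaired blood supply to brain tissue, and digital subtraction angiography (DSA) is the gold standard for diagnosing cerebrovascular disease. Based on patients' frontal and lateral DSA images, the treatment outcome of acute ischemic stroke is graded, and DPVF, a dual-path image classification model based on Vision Transformer, is built. To speed up computer-aided diagnosis, the model follows the lightweight design ideas of EdgeViT. To keep the model lightweight while retaining high accuracy, a spatial-channel self-attention module is proposed, which helps the Transformer capture more comprehensive feature information and improves its representational power. In addition, to fuse the features of DPVF's two branches, a cross-attention module cross-fuses the two branch outputs so that the model extracts richer features and performs better. Experimental results show that DPVF reaches 98.5% accuracy on the test set, which meets practical requirements.
Keywords: acute ischemic stroke; Vision Transformer; dual-branch network; feature fusion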
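3. Wheat Disease Image Recognition Algorithm Based on Vision Transformer
Authors: 白玉鹏, 冯毅琨, 李国厚, 赵明富, 周浩宇, 侯志松 《中国农机化学报》 PKU Core 2024, No. 2, pp. 267-274 (8 pages)
Powdery mildew, Fusarium head blight, and rust are the three major diseases that reduce wheat yield. To improve the recognition accuracy of wheat disease images, a recognition algorithm based on Vision Transformer is built. First, images of the three diseases are collected by field photography, the raw images are preprocessed, and a wheat disease image recognition dataset is established. Then, a recognition algorithm is built on an improved Vision Transformer, and the effects of different transfer learning strategies and of data augmentation on recognition performance are analyzed. The experiments show that full-parameter transfer learning and data augmentation clearly improve the convergence speed and recognition accuracy of the Vision Transformer model. Finally, under the same time conditions, Vision Transformer, AlexNet, and VGG16 are compared on the same dataset. The results show that the Vision Transformer model achieves an average recognition accuracy of 96.81% on the three wheat diseases, which is 6.68% and 4.94% higher than AlexNet and VGG16, respectively.
Keywords: wheat disease; Vision Transformer; transfer learning; image recognition; data augmentation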
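4. Trouser Silhouette Recognition and Classification Based on Vision Transformer and Transfer Learning
Authors: 应欣, 张宁, 申思 《丝绸》 CAS CSCD PKU Core 2024, No. 11, pp. 77-83 (7 pages)
To address inaccurate classification in trouser silhouette recognition models, this article uses a Vision Transformer model with a self-attention mechanism to classify trouser silhouette images. To counter interference from irrelevant information such as image backgrounds, a self-attention mechanism is added to strengthen useful feature channels. To prevent overfitting caused by the small trouser sample dataset, transfer learning is used to train and validate on four silhouettes: wide-leg, flared, skinny, and harem trousers. The improved Vision Transformer model is compared with conventional CNN models to verify its effectiveness. The results show that the Vision Transformer model reaches a classification accuracy of 97.72% on the four silhouettes, higher than both ResNet-50 and MobileNetV2. The method can support image-based classification of garment silhouettes and has practical value in the apparel field.
Keywords: trouser silhouette; self-attention mechanism; Vision Transformer; transfer learning; image classification; silhouette recognition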
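5. A Survey of Vision Transformer in Fine-Grained Image Classification
Authors: 孙露露, 刘建平, 王健, 邢嘉璐, 张越, 王晨阳 《计算机工程与应用》 CSCD PKU Core 2024, No. 10, pp. 30-46 (17 pages)
Fine-grained image classification (FGIC) has long been an important problem in computer vision. Compared with conventional image classification, the challenge of FGIC is that objects of different classes are extremely similar, which makes the task harder. With the development of deep learning, the Vision Transformer (ViT) model has attracted wide interest in vision and has been introduced to FGIC. This survey introduces the challenges of FGIC and analyzes the ViT model and its characteristics. It reviews ViT-based FGIC algorithms mainly by model structure, covering feature extraction, feature relationship construction, feature attention, and feature enhancement; it summarizes each algorithm and analyzes its strengths and weaknesses. Different ViT models are then compared on the same public datasets to verify their effectiveness on FGIC. Finally, shortcomings of current research are pointed out and future directions are proposed to further explore the potential of ViT in FGIC.
Keywords: fine-grained image classification; Vision Transformer; feature extraction; feature relationship construction; feature attention; feature enhancement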
6. Collaborative positioning for swarms: A brief survey of vision, LiDAR and wireless sensors based methods (Cited by: 1)
Authors: Zeyu Li, Changhui Jiang, Xiaobo Gu, Ying Xu, Feng Zhou, Jianhui Cui 《Defence Technology(防务技术)》 SCIE EI CAS CSCD 2024, No. 3, pp. 475-493 (19 pages)
As positioning sensors, edge computation power, and communication technologies continue to develop, a moving agent can now sense its surroundings and communicate with other agents. By receiving spatial information from both its environment and other agents, an agent can use various methods and sensor types to localize itself. With its high flexibility and robustness, collaborative positioning has become a widely used method in both military and civilian applications. This paper introduces the fundamental concepts and applications of collaborative positioning, and reviews recent progress in the field based on cameras, LiDAR (Light Detection and Ranging), wireless sensors, and their integration. The paper compares the current methods with respect to their sensor type, summarizes their main paradigms, and analyzes their evaluation experiments. Finally, the paper discusses the main challenges and open issues that require further research.
Keywords: collaborative positioning; vision; LiDAR; wireless sensors; sensor fusion
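7. Alzheimer's Disease Classification Based on Vision Transformer
Authors: 许曙博, 郑英豪, 秦方博, 周超, 周劲, 陈嘉燕 《微型电脑应用》 2024, No. 8, pp. 4-7 (4 pages)
To improve the classification accuracy of magnetic resonance imaging (MRI) images for Alzheimer's disease (AD), an LC (Layer-Cut)-ViT method is proposed. The method introduces the self-attention mechanism of Vision Transformer (ViT) and cuts MRI images into layers, so that the model better captures global image information while highlighting the feature relationships between slices. In addition, registration and skull-stripping algorithms extract the brain tissue from the MRI images, which further improves model performance. Experimental results show that the proposed method classifies AD MRI images well.
Keywords: Alzheimer's disease; MRI image classification; Vision Transformer; LC-ViT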
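8. Cry Recognition in Home Settings Based on Vision Transformer and Transfer Learning
Authors: 王汝旭, 王荣燕, 曾科, 杨传德, 刘超 《智能计算机与应用》 2024, No. 6, pp. 119-126 (8 pages)
Traditional machine learning algorithms such as SVM have low accuracy, and current CNN-based cry recognition in home settings generalizes poorly across infants. To address these problems, an infant cry audio classification algorithm based on Vision Transformer and transfer learning is proposed. First, to enlarge the dataset, data preprocessing including Mel spectrogram conversion and data augmentation is applied, which also makes the model more robust. Then, transfer learning is carried out on a fine-tuned Vision Transformer model, and the LookAhead optimizer is used during training to keep adjusting the model parameters and avoid overfitting, yielding automatic classification of infant cry audio. Experimental results show that the model has higher precision and faster convergence than other deep learning models and learns more discriminative features of infant cries. It can play an important role in neonatal monitoring, hearing screening, and anomaly detection.
Keywords: Vision Transformer model; infant cry; transfer learning; Mel spectrogram; LookAhead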
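9. Crop Disease Recognition Method Based on an Improved Vision Transformer Network (Cited by: 1)
Authors: 王杨, 李迎春, 许佳炜, 王傲, 马唱, 宋世佳, 谢帆, 赵传信, 胡明 《小型微型计算机系统》 CSCD PKU Core 2024, No. 4, pp. 887-893 (7 pages)
Crop disease recognition methods based on DCNN models are highly accurate in laboratory settings but lack robustness to noise. To balance accuracy and robustness, this paper adds enhanced patch serialization and masked multi-head attention to the standard ViT model, addressing the standard ViT's lack of local inductive bias and the tendency of self-attention over visual feature sequences to focus too much on each token itself. Experimental results show that the proposed EPEMMSA-ViT model learns from scratch more efficiently than the standard ViT. When trained with pretrained weights, EPEMMSA-ViT reaches 99.63% classification accuracy on the data-augmented PlantVillage tomato subset. On the test set with added salt-and-pepper noise, its classification accuracy is 6.08%, 9.78%, 29.78%, and 12.41% higher than ResNet50, DenseNet121, MobileNet, and ConvNeXt, respectively; on the test set with added mean blur, it is 18.92%, 31.11%, 20.37%, and 19.58% higher, respectively.
Keywords: crop disease recognition; deep convolutional neural network; Vision Transformer; self-attention; local inductive bias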
10. Artificial hawk-eye camera for foveated, tetrachromatic, and dynamic vision
Authors: Wenhao Ran, Zhuoran Wang, Guozhen Shen 《Journal of Semiconductors》 EI CAS CSCD 2024, No. 9, pp. 1-3 (3 pages)
With the rapid development of drones and autonomous vehicles, miniaturized and lightweight vision sensors that can track targets are of great interest. Limited by their flat structure, conventional image sensors need a large number of lenses to achieve the corresponding functions, which increases the overall volume and weight of the system.
Keywords: AWK vision system
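11. Vehicle Re-Identification Based on Vision Transformer and Convolution Injection
Authors: 于洋, 马浩伟, 岑世欣, 李扬, 张梦泉 《河北工业大学学报》 CAS 2024, No. 4, pp. 40-50 (11 pages)
To address the low robustness of extracted features in vehicle re-identification, this paper proposes a vehicle re-identification method based on Vision Transformer. First, a target-oriented mapping module is built with an attention mechanism and combined with an auxiliary information embedding module to suppress noise introduced by different viewpoints, cameras, and irrelevant backgrounds. Second, a channel-aware module is proposed on top of the long-range modeling ability of Vision Transformer; through a parallel design the model captures features both between image patches and between image channels, building relationships between channels in addition to those between patches. Finally, exploiting the local inductive bias of convolutional neural networks, the global feature vector is fed into a convolution injection module for refinement and jointly optimized with the global features to build robust vehicle features. To verify the method, experiments were conducted on the VeRi-776, VehicleID, and VeRi-Wild datasets. The results show that the method performs well.
Keywords: vehicle re-identification; Vision Transformer; convolutional neural network; target-oriented mapping; channel awareness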
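12. Intelligent Fault Diagnosis of Permanent Magnet Synchronous Motors Based on Vision Transformer
Authors: 蒋亦悦, 卞东石, 焦世琪, 张晓飞 《微电机》 2024, No. 10, pp. 20-25 (6 pages)
To address the scarcity of fault signal data during motor operation, this paper proposes an intelligent fault diagnosis method for permanent magnet synchronous motors (PMSMs) based on Vision Transformer. The method first converts the one-dimensional time-series signals acquired by sensors into two-dimensional images using the Gram matrix (Gram) and relative position matrix (RPM) methods, and then feeds the matrix images into a ViT-B/16 network for fault diagnosis. Experiments verify that the method can identify and classify eight PMSM states, including normal, bearing fault, and demagnetization fault. Accuracy reaches 99.2% with Gram matrix images as input and 99.6% with RPM matrix images, both higher than the fault classification accuracy of convolutional networks such as AlexNet, VGG16, and ResNet, which shows that the method can effectively improve PMSM fault diagnosis accuracy.
Keywords: two-dimensional image; Vision Transformer; motor fault diagnosis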
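Note: the abstract above turns a one-dimensional signal into a two-dimensional image before classification. The listing does not say which Gram-matrix encoding the authors use, so the following is only a minimal sketch that assumes the commonly used Gramian angular summation field; the function name `gramian_angular_field` is hypothetical.

```python
import math


def gramian_angular_field(signal):
    """Encode a 1-D signal as a 2-D matrix (assumed GASF variant, not confirmed by the source)."""
    lo, hi = min(signal), max(signal)
    if hi == lo:
        raise ValueError("signal must contain at least two distinct values")
    # Rescale every sample into [-1, 1] so that arccos is defined.
    scaled = [2.0 * (v - lo) / (hi - lo) - 1.0 for v in signal]
    # Guard against tiny floating-point overshoot outside [-1, 1].
    scaled = [max(-1.0, min(1.0, s)) for s in scaled]
    # Map each sample to an angle, then build the pairwise cos(phi_i + phi_j) matrix.
    phi = [math.acos(s) for s in scaled]
    return [[math.cos(a + b) for b in phi] for a in phi]


image = gramian_angular_field([0.0, 1.0, 2.0, 3.0])
for row in image:
    print([round(v, 3) for v in row])
```

An N-sample signal yields an N×N symmetric matrix, which would then be resized to the input resolution the classifier expects.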
13. FPGA and computer-vision-based atom tracking technology for scanning probe microscopy
Authors: 俞风度, 刘利, 王肃珂, 张新彪, 雷乐, 黄远志, 马瑞松, 郇庆 《Chinese Physics B》 SCIE EI CAS CSCD 2024, No. 5, pp. 76-85 (10 pages)
Atom tracking technology enhanced with innovative algorithms has been implemented in this study, utilizing a comprehensive suite of controllers and software independently developed domestically. Leveraging an on-board field-programmable gate array (FPGA) with a core frequency of 100 MHz, our system facilitates reading and writing operations across 16 channels, performing discrete incremental proportional-integral-derivative (PID) calculations within 3.4 microseconds. Building upon this foundation, gradient and extremum algorithms are further integrated, incorporating circular and spiral scanning modes with a horizontal movement accuracy of 0.38 pm. This integration enhances the real-time performance and significantly increases the accuracy of atom tracking. Atom tracking achieves an equivalent precision of at least 142 pm on a highly oriented pyrolytic graphite (HOPG) surface under room temperature atmospheric conditions. Through applying computer vision and image processing algorithms, atom tracking can be used when scanning a large area. The techniques primarily consist of two algorithms: the region of interest (ROI)-based feature matching algorithm, which achieves 97.92% accuracy, and the feature description-based matching algorithm, with 99.99% accuracy. Both implementation approaches have been tested for scanner drift measurements, and these technologies are scalable and applicable in various domains of scanning probe microscopy with broad application prospects in the field of nanoengineering.
Keywords: atom tracking; FPGA; computer vision; drift measurement
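Note: the abstract above mentions discrete incremental PID calculations. As a rough illustration only, the textbook incremental form computes a change in output, Δu(k) = Kp·(e(k) − e(k−1)) + Ki·e(k) + Kd·(e(k) − 2e(k−1) + e(k−2)), from the last three errors. The class name and the gain values below are hypothetical, and the sketch says nothing about how the authors' FPGA implementation is organized.

```python
class IncrementalPID:
    """Textbook incremental (velocity-form) PID; gains here are illustrative assumptions."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.prev_error = 0.0       # e(k-1)
        self.prev_prev_error = 0.0  # e(k-2)

    def step(self, error):
        """Return the output increment delta_u(k) for the current error e(k)."""
        delta = (
            self.kp * (error - self.prev_error)
            + self.ki * error
            + self.kd * (error - 2.0 * self.prev_error + self.prev_prev_error)
        )
        # Shift the error history for the next call.
        self.prev_prev_error = self.prev_error
        self.prev_error = error
        return delta


pid = IncrementalPID(kp=2.0, ki=0.5, kd=1.0)
output = 0.0
for err in [1.0, 1.0, 0.5]:
    output += pid.step(err)  # the actuator command accumulates the increments
    print(output)
```

Because only increments are computed, the controller needs no running integral sum, which is one reason this form is popular on fixed-point hardware.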
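14. Butterfly Species Classification Based on an Improved Vision Transformer
Authors: 许翔, 蒲智, 鲁文蕊, 王亚波 《电脑知识与技术》 2024, No. 16, pp. 1-5 (5 pages)
Butterflies are a group with many species that look highly similar, and they play an important role as indicators of the ecological environment. Different butterfly species differ in their sensitivity to environmental change, so studying butterflies matters greatly for agronomy and biology. In recent years, rapid progress in computer vision has provided strong technical support for fast identification of butterfly species. However, the conventional Vision Transformer model has several problems: it lacks the inductive bias of convolution, extracts local information poorly, overfits easily, and trains slowly on small datasets. To address these problems, an improved butterfly classification algorithm based on Vision Transformer is proposed. It introduces the VanillaNet convolutional structure and improves how the class token is updated through a global attention mechanism. Experimental results on a 100-class butterfly dataset show that the improved Vision Transformer reaches a Top-1 accuracy of 94.87%, 28.9% higher than before the improvement. With the improved class token, Top-1 accuracy rises further to 96.64%, 30.44% higher than before the improvement. Compared with the original network, the improved model is better suited to butterfly species classification.
Keywords: butterfly classification; Vision Transformer; convolution; class token; VanillaNet; attention mechanism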
15. Development and validation of a novel questionnaire regarding vision screening among preschool teachers in Malaysia
Authors: Shazrina Ariffin, Saadah Mohamed Akhir, Sumithira Narayanasamy 《International Journal of Ophthalmology(English edition)》 SCIE CAS 2024, No. 6, pp. 1102-1109 (8 pages)
AIM: To develop and evaluate the validity and reliability of a knowledge, attitude, and practice questionnaire related to vision screening (KAP-VST) among preschool teachers in Malaysia. METHODS: The questionnaire was developed through a literature review and discussions with experts. Content and face validation were conducted by a panel of experts (n=10) and preschool teachers (n=10), respectively. A pilot study was conducted for construct validation (n=161) and test-retest reliability (n=60) of the newly developed questionnaire. RESULTS: Based on the content and face validation, 71 items were generated, and 68 items were selected after exploratory factor analysis. The content validity index for items (I-CVI) score ranged from 0.8 to 1.0, and the content validity index for scale (S-CVI)/Ave was 0.99. Internal consistency was KR-20=0.93 for knowledge, Cronbach's alpha=0.758 for attitude, and Cronbach's alpha=0.856 for practice. CONCLUSION: The KAP-VST is a valid and reliable instrument for assessing knowledge, attitude, and practice in relation to vision screening among preschool teachers in Malaysia.
Keywords: validity; reliability; preschool teachers; vision screening; questionnaire
16. Highly Efficient Back-End-of-Line Compatible Flexible Si-Based Optical Memristive Crossbar Array for Edge Neuromorphic Physiological Signal Processing and Bionic Machine Vision
Authors: Dayanand Kumar, Hanrui Li, Dhananjay D. Kumbhar, Manoj Kumar Rajbhar, Uttam Kumar Das, Abdul Momin Syed, Georgian Melinte, Nazek El-Atab 《Nano-Micro Letters》 SCIE EI CAS CSCD 2024, No. 11, pp. 323-339 (17 pages)
The emergence of the Internet-of-Things is anticipated to create a vast market for what are known as smart edge devices, opening numerous opportunities across countless domains, including personalized healthcare and advanced robotics. Leveraging 3D integration, edge devices can achieve unprecedented miniaturization while simultaneously boosting processing power and minimizing energy consumption. Here, we demonstrate a back-end-of-line compatible optoelectronic synapse with a transfer learning method on health care applications, including electroencephalogram (EEG)-based seizure prediction, electromyography (EMG)-based gesture recognition, and electrocardiogram (ECG)-based arrhythmia detection. With experiments on three biomedical datasets, we observe the classification accuracy improvement for the pretrained model with 2.93% on EEG, 4.90% on ECG, and 7.92% on EMG, respectively. The optical programming property of the device enables an ultralow power (2.8×10^(-13) J) fine-tuning process and offers solutions for patient-specific issues in edge computing scenarios. Moreover, the device exhibits impressive light-sensitive characteristics that enable a range of light-triggered synaptic functions, making it promising for neuromorphic vision application. To display the benefits of these intricate synaptic properties, a 5×5 optoelectronic synapse array is developed, effectively simulating human visual perception and memory functions. The proposed flexible optoelectronic synapse holds immense potential for advancing the fields of neuromorphic physiological signal processing and artificial visual systems in wearable applications.
Keywords: neuromorphic computing; electrophysiological signal; artificial vision system; image recognition; memristor
17. Exploring Deep Learning Methods for Computer Vision Applications across Multiple Sectors: Challenges and Future Trends
Authors: Narayanan Ganesh, Rajendran Shankar, Miroslav Mahdal, Janakiraman SenthilMurugan, Jasgurpreet Singh Chohan, Kanak Kalita 《Computer Modeling in Engineering & Sciences》 SCIE EI 2024, No. 4, pp. 103-141 (39 pages)
Computer vision (CV) was developed for computers and other systems to act or make recommendations based on visual inputs, such as digital photos, movies, and other media. Deep learning (DL) methods are more successful than traditional machine learning (ML) methods in CV. DL techniques can produce state-of-the-art results for difficult CV problems like picture categorization, object detection, and face recognition. In this review, a structured discussion on the history, methods, and applications of DL methods to CV problems is presented. The sector-wise presentation of applications in this paper may be particularly useful for researchers in niche fields who have limited or introductory knowledge of DL methods and CV. This review will provide readers with context and examples of how these techniques can be applied to specific areas. A curated list of popular datasets and a brief description of them are also included for the benefit of readers.
Keywords: neural network; machine vision; classification; object detection; deep learning
18. Vision based intelligent traffic light management system using Faster R-CNN
Authors: Syed Konain Abbas, Muhammad Usman Ghani Khan, Jia Zhu, Raheem Sarwar, Naif R. Aljohani, Ibrahim A. Hameed, Muhammad Umair Hassan 《CAAI Transactions on Intelligence Technology》 SCIE EI 2024, No. 4, pp. 932-947 (16 pages)
Transportation systems primarily depend on vehicular flow on roads. Developed countries have shifted towards automated signal control, which manages and updates signal synchronisation automatically. In contrast, traffic in underdeveloped countries is mainly governed by manual traffic light systems. These existing manual systems lead to numerous issues, wasting substantial resources such as time, energy, and fuel, as they cannot make real-time decisions. In this work, we propose an algorithm to determine traffic signal durations based on real-time vehicle density, obtained from live closed-circuit television camera feeds adjacent to traffic signals. The algorithm automates the traffic light system, making decisions based on vehicle density and employing Faster R-CNN for vehicle detection. Additionally, we have created a local dataset from live streams of Punjab Safe City cameras in collaboration with the local police authority. The proposed algorithm achieves a class accuracy of 96.6% and a vehicle detection accuracy of 95.7%. Across both day and night modes, our proposed method maintains an average precision, recall, F1 score, and vehicle detection accuracy of 0.94, 0.98, 0.96 and 0.95, respectively. Our proposed work outperforms state-of-the-art methodologies on all evaluation metrics.
Keywords: access control; artificial intelligence; computer vision; intelligent control
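Note: the precision, recall, and F1 figures quoted in the abstract above can be cross-checked with the standard definition F1 = 2PR / (P + R). The short sketch below is only a reader-side sanity check under that definition, not part of the authors' method; the function name `f1_score` is chosen here for illustration.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# Values reported in the abstract: precision 0.94, recall 0.98.
print(round(f1_score(0.94, 0.98), 2))  # → 0.96
```

The result agrees with the reported F1 score of 0.96 after rounding to two decimals.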
19. Dynamic Vision-Based Machinery Fault Diagnosis With Cross-Modality Feature Alignment
Authors: Xiang Li, Shupeng Yu, Yaguo Lei, Naipeng Li, Bin Yang 《IEEE/CAA Journal of Automatica Sinica》 SCIE EI CSCD 2024, No. 10, pp. 2068-2081 (14 pages)
Intelligent machinery fault diagnosis methods have been popularly and successfully developed in the past decades, and the vibration acceleration data collected by contact accelerometers have been widely investigated. In many industrial scenarios, contactless sensors are preferred. The event camera is an emerging bio-inspired technology for vision sensing, which asynchronously records per-pixel brightness change polarity with high temporal resolution and low latency. It offers a promising tool for contactless machine vibration sensing and fault diagnosis. However, the dynamic vision-based methods suffer from variations of practical factors such as camera position, machine operating condition, etc. Furthermore, as a new sensing technology, the labeled dynamic vision data are limited, which generally cannot cover a wide range of machine fault modes. Aiming at these challenges, a novel dynamic vision-based machinery fault diagnosis method is proposed in this paper. It is motivated to explore the abundant vibration acceleration data for enhancing the dynamic vision-based model performance. A cross-modality feature alignment method is thus proposed with deep adversarial neural networks to achieve fault diagnosis knowledge transfer. An event erasing method is further proposed for improving model robustness against variations. The proposed method can effectively identify unseen fault modes with dynamic vision data. Experiments on two rotating machine monitoring datasets are carried out for validation, and the results suggest the proposed method is promising for generalized contactless machinery fault diagnosis.
Keywords: condition monitoring; domain generalization; event-based camera; fault diagnosis; machine vision
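20. Pruning Techniques for Vision Transformer Models
Authors: 查秉坤, 李朋阳, 陈小柏 《软件》 2024, No. 3, pp. 83-86, 97 (5 pages)
This paper studies pruning for Vision Transformer (ViT) models, exploring the pruning of the QKV (Query, Key, Value) weights in the multi-head self-attention mechanism and of the fully connected (FC) layer weights. Three pruning schemes are proposed for ViT: pruning only QKV, pruning only FC, and pruning QKV and FC together, to examine how different pruning strategies affect ViT accuracy and the parameter compression ratio. The work provides a reference for compressing and optimizing deep learning models and offers guidance for model slimming and performance optimization in practice.
Keywords: Vision Transformer model; pruning; accuracy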
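Note: the three schemes in the abstract above (QKV only, FC only, both) can be pictured with a toy example. The listing does not state the pruning criterion, so the sketch below assumes simple magnitude pruning; the parameter names (`attn.qkv`, `mlp.fc`), the scheme labels, and the helper functions are all hypothetical.

```python
def magnitude_prune(weights, sparsity):
    """Zero the smallest-magnitude entries of a 2-D weight list (assumed criterion)."""
    flat = sorted(abs(w) for row in weights for w in row)
    k = int(len(flat) * sparsity)
    if k == 0:
        return [row[:] for row in weights]
    threshold = flat[k - 1]
    # Ties at the threshold are pruned too, so slightly more than k entries may be zeroed.
    return [[0.0 if abs(w) <= threshold else w for w in row] for row in weights]


# Which parameter groups each scheme touches (labels are illustrative only).
SCHEMES = {"qkv_only": ("qkv",), "fc_only": ("fc",), "qkv_and_fc": ("qkv", "fc")}


def prune_model(params, scheme, sparsity):
    """Apply magnitude pruning to the groups selected by the scheme; copy the rest unchanged."""
    targets = SCHEMES[scheme]
    pruned = {}
    for name, weights in params.items():
        if name.split(".")[-1] in targets:
            pruned[name] = magnitude_prune(weights, sparsity)
        else:
            pruned[name] = [row[:] for row in weights]
    return pruned


params = {
    "attn.qkv": [[0.1, -0.9], [0.5, -0.2]],
    "mlp.fc": [[0.3, 0.05], [-0.7, 0.4]],
}
for scheme in SCHEMES:
    result = prune_model(params, scheme, sparsity=0.5)
    zeros = sum(w == 0.0 for m in result.values() for row in m for w in row)
    print(scheme, "zeroed weights:", zeros)
```

Comparing the zero counts (and, in a real study, the accuracy after each scheme) is the kind of trade-off between compression ratio and accuracy that the abstract describes.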