1,882 articles found
FPGA and computer-vision-based atom tracking technology for scanning probe microscopy
1
Authors: 俞风度, 刘利, 王肃珂, 张新彪, 雷乐, 黄远志, 马瑞松, 郇庆 《Chinese Physics B》 SCIE EI CAS CSCD 2024, No. 5, pp. 76-85 (10 pages)
Atom tracking technology enhanced with innovative algorithms has been implemented in this study, utilizing a comprehensive suite of controllers and software independently developed domestically. Leveraging an on-board field-programmable gate array (FPGA) with a core frequency of 100 MHz, our system facilitates reading and writing operations across 16 channels, performing discrete incremental proportional-integral-derivative (PID) calculations within 3.4 microseconds. Building upon this foundation, gradient and extremum algorithms are further integrated, incorporating circular and spiral scanning modes with a horizontal movement accuracy of 0.38 pm. This integration enhances the real-time performance and significantly increases the accuracy of atom tracking. Atom tracking achieves an equivalent precision of at least 142 pm on a highly oriented pyrolytic graphite (HOPG) surface under room temperature atmospheric conditions. Through applying computer vision and image processing algorithms, atom tracking can be used when scanning a large area. The techniques primarily consist of two algorithms: the region of interest (ROI)-based feature matching algorithm, which achieves 97.92% accuracy, and the feature description-based matching algorithm, with an impressive 99.99% accuracy. Both implementation approaches have been tested for scanner drift measurements, and these technologies are scalable and applicable in various domains of scanning probe microscopy with broad application prospects in the field of nanoengineering.
Keywords: atom tracking; FPGA; computer vision; drift measurement
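The discrete incremental PID form named in the abstract above can be sketched as follows. This is a generic textbook formulation in floating-point Python, not the authors' FPGA implementation; the class name `IncrementalPID`, the gains, and the sample values are illustrative assumptions.

```python
class IncrementalPID:
    """Incremental (velocity-form) discrete PID controller.

    Each step computes only the change in output:
        du[k] = Kp*(e[k] - e[k-1]) + Ki*e[k] + Kd*(e[k] - 2*e[k-1] + e[k-2])
    and accumulates it into the output u[k] = u[k-1] + du[k].
    """

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.e1 = 0.0  # previous error e[k-1]
        self.e2 = 0.0  # error before that, e[k-2]
        self.u = 0.0   # accumulated controller output

    def step(self, setpoint, measurement):
        e = setpoint - measurement
        du = (self.kp * (e - self.e1)
              + self.ki * e
              + self.kd * (e - 2.0 * self.e1 + self.e2))
        # Shift the error history for the next call.
        self.e2, self.e1 = self.e1, e
        self.u += du
        return self.u


if __name__ == "__main__":
    pid = IncrementalPID(kp=1.0, ki=0.5, kd=0.25)  # hypothetical gains
    for _ in range(3):
        print(pid.step(setpoint=1.0, measurement=0.0))
```

Because only the last two errors and the running output are stored, this form needs no explicit integral accumulator, which is one reason it is commonly chosen for fixed-latency hardware loops.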
Exploring Deep Learning Methods for Computer Vision Applications across Multiple Sectors: Challenges and Future Trends
2
Authors: Narayanan Ganesh, Rajendran Shankar, Miroslav Mahdal, Janakiraman SenthilMurugan, Jasgurpreet Singh Chohan, Kanak Kalita 《Computer Modeling in Engineering & Sciences》 SCIE EI 2024, No. 4, pp. 103-141 (39 pages)
Computer vision (CV) was developed for computers and other systems to act or make recommendations based on visual inputs, such as digital photos, movies, and other media. Deep learning (DL) methods are more successful than other traditional machine learning (ML) methods in CV. DL techniques can produce state-of-the-art results for difficult CV problems like picture categorization, object detection, and face recognition. In this review, a structured discussion on the history, methods, and applications of DL methods to CV problems is presented. The sector-wise presentation of applications in this paper may be particularly useful for researchers in niche fields who have limited or introductory knowledge of DL methods and CV. This review will provide readers with context and examples of how these techniques can be applied to specific areas. A curated list of popular datasets and a brief description of them are also included for the benefit of readers.
Keywords: neural network; machine vision; classification; object detection; deep learning
Early Detection of Colletotrichum Kahawae Disease in Coffee Cherry Based on Computer Vision Techniques
3
Authors: Raveena Selvanarayanan, Surendran Rajendran, Youseef Alotaibi 《Computer Modeling in Engineering & Sciences》 SCIE EI 2024, No. 4, pp. 759-782 (24 pages)
Colletotrichum kahawae (Coffee Berry Disease, CBD) spreads through spores that can be carried by wind, rain, and insects, affecting coffee plantations, and causes 80% yield losses and poor-quality coffee beans. The deadly disease is hard to control because wind, rain, and insects carry spores. Colombian researchers utilized a deep learning system to identify CBD in coffee cherries at three growth stages and classify photographs of infected and uninfected cherries with 93% accuracy using a random forest method. If the dataset is too small and noisy, the algorithm may not learn data patterns and generate accurate predictions. To overcome the existing challenge, early detection of Colletotrichum kahawae disease in coffee cherries requires automated processes, prompt recognition, and accurate classifications. The proposed methodology selects CBD image datasets through four different stages for training and testing. XGBoost is used to train a model on datasets of coffee berries, with each image labeled as healthy or diseased. Once the model is trained, the SHAP algorithm is used to figure out which features were essential for making predictions with the proposed model. Some of these characteristics were the cherry's colour, whether it had spots or other damage, and how big the lesions were. Virtual inception is important for classification, to virtualize how the colour of the berry is correlated with the presence of disease. To evaluate the model's performance and mitigate overfitting, a 10-fold cross-validation approach is employed. This involves partitioning the dataset into ten subsets, training the model on each subset, and evaluating its performance. In comparison to other contemporary methodologies, the model put forth achieved an accuracy of 98.56%.
Keywords: computer vision; coffee berry disease; Colletotrichum kahawae; XGBoost; Shapley additive explanations
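The 10-fold cross-validation mentioned in the abstract above is usually understood as follows: the data are split into ten folds, and in each round the model is trained on nine folds and evaluated on the held-out one. The sketch below only generates the index splits using the standard library; the function name `k_fold_indices` is a hypothetical helper, and the paper's own splitting code is not shown in the abstract.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) for k contiguous folds.

    Fold sizes differ by at most one sample when n_samples is not
    divisible by k. Shuffle the data beforehand if order matters.
    """
    base, extra = divmod(n_samples, k)
    start = 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        test = list(range(start, start + size))
        # Everything outside the held-out fold is used for training.
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


if __name__ == "__main__":
    for fold, (train, test) in enumerate(k_fold_indices(25, k=10)):
        print(fold, len(train), test)
```

Averaging the per-fold score over the ten rounds gives the cross-validated estimate; every sample is used for testing exactly once.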
A Novel 6G Scalable Blockchain Clustering-Based Computer Vision Character Detection for Mobile Images
4
Authors: Yuejie Li, Shijun Li 《Computers, Materials & Continua》 SCIE EI 2024, No. 3, pp. 3041-3070 (30 pages)
6G is envisioned as the next generation of wireless communication technology, promising unprecedented data speeds, ultra-low latency, and ubiquitous connectivity. In tandem with these advancements, blockchain technology is leveraged to enhance computer vision applications' security, trustworthiness, and transparency. With the widespread use of mobile devices equipped with cameras, the ability to capture and recognize Chinese characters in natural scenes has become increasingly important. Blockchain can facilitate privacy-preserving mechanisms in applications where privacy is paramount, such as facial recognition or personal healthcare monitoring. Users can control their visual data and grant or revoke access as needed. Recognizing Chinese characters from images can provide convenience in various aspects of people's lives. However, traditional Chinese character text recognition methods often lack sufficient accuracy, leading to recognition failures or incorrect character identification. In contrast, computer vision technologies have significantly improved image recognition accuracy. This paper proposes a secure end-to-end recognition system (SE2ERS) for Chinese characters in natural scenes based on convolutional neural networks (CNN) using 6G technology. The proposed SE2ERS model uses the Weighted Hyperbolic Curve Cryptograph (WHCC) for secure data transmission in the 6G network with the blockchain model. Within the computer vision system, data transmission with a 6G gradient directional histogram (GDH) is employed for character estimation. With the deployment of WHCC and GDH in the constructed SE2ERS model, secure communication is achieved for data transmission over the 6G network. The performance of the proposed SE2ERS is compared against traditional Chinese text recognition methods and the data transmission environment with 6G communication. Experimental results demonstrate that SE2ERS achieves an average recognition accuracy of 88% for simple Chinese characters, compared to 81.2% with traditional methods. For complex Chinese characters, the average recognition accuracy improves to 84.4% with our system, compared to 72.8% with traditional methods. Additionally, deploying the WHCC model improves data security, with a data encryption rate complexity approximately 12% higher than that of the traditional techniques.
Keywords: 6G technology; blockchain; end-to-end recognition; Chinese characters; natural scene; computer vision algorithms; convolutional neural network
A Systematic Review of Computer Vision Techniques for Quality Control in End-of-Line Visual Inspection of Antenna Parts
5
Authors: Zia Ullah, Lin Qi, E.J. Solteiro Pires, Arsénio Reis, Ricardo Rodrigues Nunes 《Computers, Materials & Continua》 SCIE EI 2024, No. 8, pp. 2387-2421 (35 pages)
The rapid evolution of wireless communication technologies has underscored the critical role of antennas in ensuring seamless connectivity. Antenna defects, ranging from manufacturing imperfections to environmental wear, pose significant challenges to the reliability and performance of communication systems. This review paper navigates the landscape of antenna defect detection, emphasizing the need for a nuanced understanding of various defect types and the associated challenges in visual detection. This review paper serves as a valuable resource for researchers, engineers, and practitioners engaged in the design and maintenance of communication systems. The insights presented here pave the way for enhanced reliability in antenna systems through targeted defect detection measures. In this study, a comprehensive literature analysis on computer vision algorithms that are employed in end-of-line visual inspection of antenna parts is presented. The review follows the PRISMA principles, and its goals are to provide a summary of recent research, identify relevant computer vision techniques, and evaluate how effective these techniques are in discovering defects during inspections. It contains articles from scholarly journals as well as papers presented at conferences up until June 2023. This research utilized relevant search phrases, and papers were chosen based on whether or not they met certain inclusion and exclusion criteria. In this study, several different computer vision approaches, such as feature extraction and defect classification, are broken down and analyzed. Additionally, their applicability and performance are discussed. The review highlights the significance of utilizing a wide variety of datasets and measurement criteria. The findings of this study add to the existing body of knowledge and point researchers in the direction of promising new areas of investigation, such as real-time inspection systems and multispectral imaging. This review, on the whole, offers a complete study of computer vision approaches for quality control in antenna parts. It does so by providing helpful insights and drawing attention to areas that require additional exploration.
Keywords: computer vision; end-of-line visual inspection of antenna parts; machine learning algorithms; image processing techniques; deep learning models
Clinical Application of Preliminary Breast Cancer Screening for Dense Breasts Using Real-Time AI-Powered Ultrasound with Deep-Learning Computer Vision
6
Authors: Zhenzhong Zhou, Xueqin Xie, Zongjin Yang, Zhongxiong Feng, Xiaoling Zheng, Qian Huang 《Journal of Clinical and Nursing Research》 2024, No. 6, pp. 36-47 (12 pages)
Objective: We propose a solution that is backed by cloud computing, combines a series of computer vision AI neural networks, is capable of detecting, highlighting, and locating breast lesions from a live ultrasound video feed, provides BI-RADS categorizations, and has reliable sensitivity and specificity. Multiple deep-learning models were trained on more than 300,000 breast ultrasound images to achieve object detection and regions of interest classification. The main objective of this study was to determine whether the performance of our AI-powered solution was comparable to that of ultrasound radiologists. Methods: The noninferiority evaluation was conducted by comparing the examination results of the same screening women between our AI-powered solution and ultrasound radiologists with over 10 years of experience. The study lasted for one and a half years and was carried out in the Duanzhou District Women and Children's Hospital, Zhaoqing, China. 1,133 females between 20 and 70 years old were selected through convenience sampling. Results: The accuracy, sensitivity, specificity, positive predictive value, and negative predictive value were 93.03%, 94.90%, 90.71%, 92.68%, and 93.48%, respectively. The area under the curve (AUC) for all positives was 0.91569 and the AUC for all negatives was 0.90461. The comparison indicated that the overall performance of the AI system was comparable to that of ultrasound radiologists. Conclusion: This innovative AI-powered ultrasound solution is cost-effective and user-friendly, and could be applied to massive breast cancer screening.
Keywords: breast cancer screening; ultrasound; lesion detection; BI-RADS; deep learning; computer vision; cloud computing
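The five screening metrics reported in the abstract above all derive from the four cells of a binary confusion matrix. The sketch below shows the standard definitions; the function name `screening_metrics` and the counts in the usage example are hypothetical and are not the study's data, which the abstract does not list.

```python
def screening_metrics(tp, fp, tn, fn):
    """Standard binary screening metrics from confusion-matrix counts.

    tp/fn: diseased cases flagged / missed by the test
    tn/fp: healthy cases cleared / wrongly flagged by the test
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }


if __name__ == "__main__":
    # Hypothetical counts chosen only to illustrate the formulas.
    for name, value in screening_metrics(tp=90, fp=20, tn=80, fn=10).items():
        print(f"{name}: {value:.4f}")
```

Sensitivity and specificity depend only on the test, while PPV and NPV also depend on how common the condition is in the screened population.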
Computer Vision-Based Human Body Posture Correction System
7
Authors: Yangsen QIU, Yukun WANG, Yuchen WU, Xinyi QIANG, Yunzuo ZHANG 《Mechanical Engineering Science》 2024, No. 1, pp. 1-7 (7 pages)
With the development of technology and the progress of life, more and more people, regardless of entertainment, learning, or work, cannot do without computer desks and cannot put down their mobile phones. Due to prolonged sitting and often neglecting the importance of posture, incorrect posture can often lead to health problems such as hunchback, lumbar muscle strain, and shoulder and neck pain over time. To address this issue, we designed a computer vision-based human body posture detection system. The system utilizes YOLOv8 technology to accurately locate key points of the human body skeleton, and then analyzes the coordinate positions and depth information of these key points to establish a criterion for distinguishing different postures. With the assistance of an SVM classifier, the system achieves an average recognition rate of 95%. Finally, we successfully deployed the posture detection system on Raspberry Pi hardware and conducted extensive testing. The test results demonstrate that the system can effectively detect various postures and provide real-time reminders to users to correct poor posture, demonstrating good practicality and stability.
Keywords: computer vision; human posture; deep learning; image processing
Computer vision technology in log volume inspection (cited by: 3)
8
Authors: 汪亚明, 黄文清, 赵匀 《Journal of Forestry Research》 SCIE CAS CSCD 2002, No. 1, pp. 67-70, 84 (4 pages)
Log volume inspection is very important in forestry research and paper making engineering. This paper proposed a novel approach based on computer vision technology to cope with log volume inspection. The needed hardware system was analyzed and the details of the inspection algorithms were given. A fuzzy entropy-based image enhancement algorithm was presented for enhancing the image of the cross-section of a log. In many practical applications the cross-section is often partially invisible, and this is the major obstacle for correct inspection. To solve this problem, a robust Hausdorff distance method was proposed to recover the whole cross-section. Experiment results showed that this method was efficient.
Keywords: log volume; automatic inspection; computer vision; fuzzy entropy; Hausdorff distance
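For orientation, the classic Hausdorff distance between two point sets can be sketched as below. Note the assumption: the abstract above refers to a robust variant whose details are not given here, so this shows only the plain definition (the largest nearest-neighbour distance in either direction). The function names and sample points are illustrative.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from a point in `a` to its nearest point in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


if __name__ == "__main__":
    # Hypothetical 2-D contour samples.
    contour = [(0.0, 0.0), (1.0, 0.0)]
    template = [(0.0, 0.0), (4.0, 0.0)]
    print(hausdorff(contour, template))
```

Because the plain definition takes a maximum, a single outlier point dominates the result; robust variants typically replace that maximum with a rank or average, which is what makes them usable on partially occluded shapes.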
Review on the proceeding of automatic seedlings classification by computer vision (cited by: 1)
9
Authors: 杨延竹, 赵学增, 王伟杰, 吴羡 《Journal of Forestry Research》 SCIE CAS CSCD 2002, No. 3, pp. 245-249, 252 (5 pages)
The classification of seedlings is important to ensure the viability of seedlings after transplantation and is acknowledged as a key factor in forestation and environmental improvement. Based on numerous papers on automatic seedling classification (ASC), the seedling grading theory, traditional grading methods, and the background and progress of ASC techniques are described. The automation of the measurement of seedling morphological characteristics by photoelectric meters and computer vision is studied, and the automatic methods of the current grading systems are described in turn. Further research directions for ASC by computer vision are proposed.
Keywords: seedlings classification; automation; morphological characteristic; computer vision
Research on Iris-Face Multi-Feature Fusion Recognition Based on Vision Transformer
10
Authors: 马滔, 陈睿, 张博 《中国新技术新产品》 2024, No. 18, pp. 8-10 (3 pages)
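To improve the accuracy and robustness of biometric recognition systems, this paper studies a computer vision-based iris-face multi-feature fusion recognition method. The iris region in facial images is extracted and preprocessed; contrast enhancement and normalization are applied, which strengthens the consistency of feature extraction and improves image quality. To obtain rich deep features, a Vision Transformer model is used to extract features from the preprocessed iris and face images. A multi-head attention mechanism fuses the multimodal feature information of the iris and the face, and a fully connected layer then performs classification and recognition. Experimental results show that the method delivers excellent recognition performance, with a significant improvement in recognition accuracy.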
Keywords: computer vision; Vision Transformer; multi-feature fusion; iris recognition; face recognition
A VIDEO SPECTRUM SPLITTING ENCODING SCHEME BASED ON HUMAN VISION AND ITS COMPUTER SIMULATION
11
Authors: 赵宇, 李华, 俞斯乐, 滕建辅 《Transactions of Tianjin University》 EI CAS 1995, No. 1, pp. 79+76-79 (5 pages)
In this paper, a 3-D video encoding scheme suitable for digital TV/HDTV (high definition television) is studied through computer simulation. The encoding scheme is designed to provide a good match to human vision. Basically, this involves transmission of low frequency luminance information at full frame rate for good motion rendition and transmission of high frequency luminance signal at reduced frame rate for good detail in static images.
Keywords: 3-D video encoding; discrete wavelet transform; human vision; computer simulation
Application of Computer Vision Technique to Maize Variety Identification
12
Authors: 孙钟雷, 李宇, 何伟 《Agricultural Science & Technology》 CAS 2013, No. 5, pp. 783-786, 796 (5 pages)
Variety identification is important for maize breeding, processing and trade. The computer vision technique has been widely applied to maize variety identification. In this paper, the computer vision technique is summarized from the following technical aspects: image acquisition, image processing, characteristic parameter extraction, pattern recognition and programming software. In addition, the existing problems in applying this technique to maize variety identification are analyzed and its development tendency is forecast.
Keywords: maize variety identification; computer vision; image processing; feature extraction; pattern recognition
Wheat Disease Image Recognition Algorithm Based on Vision Transformer
13
Authors: 白玉鹏, 冯毅琨, 李国厚, 赵明富, 周浩宇, 侯志松 《中国农机化学报》 PKU Core 2024, No. 2, pp. 267-274 (8 pages)
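Wheat powdery mildew, Fusarium head blight (scab), and rust are the three major diseases that harm wheat yield. To improve the recognition accuracy of wheat disease images, a wheat disease image recognition algorithm based on Vision Transformer is constructed. First, wheat disease images covering the three diseases (powdery mildew, Fusarium head blight, and rust) are collected by field photography, the original images are preprocessed, and a wheat disease image recognition dataset is built. Then, the recognition algorithm is constructed on an improved Vision Transformer, and the effects of different transfer learning strategies and of data augmentation on recognition performance are analyzed. The experiments show that full-parameter transfer learning and data augmentation markedly improve the convergence speed and recognition accuracy of the Vision Transformer model. Finally, under the same time conditions, the Vision Transformer, AlexNet, and VGG16 algorithms are compared on the same dataset. The results show that the Vision Transformer model achieves an average recognition accuracy of 96.81% on images of the three wheat diseases, which is 6.68% and 4.94% higher than the AlexNet and VGG16 models, respectively.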
Keywords: wheat disease; Vision Transformer; transfer learning; image recognition; data augmentation
Collaborative positioning for swarms: A brief survey of vision, LiDAR and wireless sensors based methods (cited by: 1)
14
Authors: Zeyu Li, Changhui Jiang, Xiaobo Gu, Ying Xu, Feng Zhou, Jianhui Cui 《Defence Technology(防务技术)》 SCIE EI CAS CSCD 2024, No. 3, pp. 475-493 (19 pages)
As positioning sensors, edge computation power, and communication technologies continue to develop, a moving agent can now sense its surroundings and communicate with other agents. By receiving spatial information from both its environment and other agents, an agent can use various methods and sensor types to localize itself. With its high flexibility and robustness, collaborative positioning has become a widely used method in both military and civilian applications. This paper introduces the fundamental concepts and applications of collaborative positioning, and reviews recent progress in the field based on cameras, LiDAR (Light Detection and Ranging), wireless sensors, and their integration. The paper compares the current methods with respect to their sensor type, summarizes their main paradigms, and analyzes their evaluation experiments. Finally, the paper discusses the main challenges and open issues that require further research.
Keywords: collaborative positioning; vision; LiDAR; wireless sensors; sensor fusion
Application of Computer Vision Technology in Agriculture (cited by: 6)
15
Authors: 黄喜梅, 毕建杰, 张楠, 丁筱玲, 李飞, 侯发东 《Agricultural Science & Technology》 CAS 2017, No. 11, pp. 2158-2162 (5 pages)
With the development of image processing technology and computers, computer vision technology has been widely used in agricultural production and has made many important achievements. This paper reviews its research progress in diagnosis of agricultural products, water diagnosis, weed identification, product quality testing and grading, agricultural picking and sorting, and other aspects, and finally puts forward its existing problems and prospects for the future.
Keywords: image processing; computer vision technology; agriculture production; prospect
Frequency and associated factors of accommodation and non-strabismic binocular vision dysfunction among medical university students (cited by: 1)
16
Authors: Jie Cai, Wen-Wen Fan, Yun-Hui Zhong, Cai-Lan Wen, Xiao-Dan Wei, Wan-Chen Wei, Wan-Yan Xiang, Jin-Mao Chen 《International Journal of Ophthalmology(English edition)》 SCIE CAS 2024, No. 2, pp. 374-379 (6 pages)
AIM: To investigate the frequency and associated factors of accommodation and non-strabismic binocular vision dysfunction among medical university students. METHODS: A total of 158 student volunteers underwent routine vision examination in the optometry clinic of Guangxi Medical University. Their data were used to identify the different types of accommodation and non-strabismic binocular vision dysfunction and to determine their frequency. Correlation analysis and logistic regression were used to examine the factors associated with these abnormalities. RESULTS: The results showed that 36.71% of the subjects had accommodation and non-strabismic binocular vision issues, with 8.86% being attributed to accommodation dysfunction and 27.85% to binocular abnormalities. Convergence insufficiency (CI) was the most common abnormality, accounting for 13.29%. Those with these abnormalities experienced higher levels of eyestrain (χ² = 69.518, P < 0.001). Linear correlations were observed between the difference of binocular spherical equivalent (SE) and the index of horizontal esotropia at a distance (r = 0.231, P = 0.004) and the asthenopia survey scale (ASS) score (r = 0.346, P < 0.001). Furthermore, the right eye's SE was inversely correlated with the convergence of positive and negative fusion images at close range (r = -0.321, P < 0.001), the convergence of negative fusion images at close range (r = -0.294, P < 0.001), the vergence facility (VF; r = -0.234, P = 0.003), and the set of negative fusion images at far range (r = -0.237, P = 0.003). Logistic regression analysis indicated that gender, age, and the difference in right and binocular SE did not influence the emergence of these abnormalities. CONCLUSION: Binocular vision abnormalities are more prevalent than accommodation dysfunction, with CI being the most frequent type. Greater binocular refractive disparity leads to more severe eyestrain symptoms.
Keywords: optometry clinic; non-strabismic binocular vision dysfunction; college students; convergence insufficiency
Trouser Silhouette Recognition and Classification Based on Vision Transformer and Transfer Learning
17
Authors: 应欣, 张宁, 申思 《丝绸》 CAS CSCD PKU Core 2024, No. 11, pp. 77-83 (7 pages)
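To address inaccurate classification in trouser silhouette recognition and classification models, this paper uses a Vision Transformer model with a self-attention mechanism to classify trouser silhouette images. To counter interference from irrelevant information such as the image background, a self-attention mechanism is added to strengthen useful feature channels. To prevent overfitting caused by the small trouser sample dataset, transfer learning is used to train and validate on four trouser silhouettes: wide-leg pants, flared pants, skinny pants, and harem pants. The improved Vision Transformer model is compared experimentally with traditional CNN models to verify its effectiveness. The results show that the Vision Transformer model reaches a classification accuracy of 97.72% on the four trouser silhouettes, an improvement over both the ResNet-50 and MobileNetV2 models. The approach can provide strong support for image classification and recognition of garment silhouettes and has high practical value in the apparel field.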
Keywords: trouser silhouette; self-attention mechanism; Vision Transformer; transfer learning; image classification; silhouette recognition
A Survey of the Development of Vision Transformer in Fine-Grained Image Classification
18
Authors: 孙露露, 刘建平, 王健, 邢嘉璐, 张越, 王晨阳 《计算机工程与应用》 CSCD PKU Core 2024, No. 10, pp. 30-46 (17 pages)
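Fine-grained image classification (FGIC) has long been an important problem in computer vision. Compared with traditional image classification tasks, the challenge of FGIC is that objects in different classes are extremely similar, which further increases the difficulty of the task. With the development of deep learning, the Vision Transformer (ViT) model has attracted great interest in the vision field and has been introduced into FGIC tasks. This survey introduces the challenges faced by FGIC and analyzes the ViT model and its characteristics. It comprehensively reviews ViT-based FGIC algorithms, organized mainly by model structure into four aspects: feature extraction, feature relationship construction, feature attention, and feature enhancement. Each algorithm is summarized and its advantages and disadvantages are analyzed. The performance of different ViT models is compared on the same public datasets to verify their effectiveness on FGIC tasks. Finally, the shortcomings of current research are pointed out and future research directions are proposed to further explore the potential of ViT in FGIC.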
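Keywords: fine-grained image classification; Vision Transformer; feature extraction; feature relationship construction; feature attention; feature enhancement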
Advances in Computer Vision-Based Civil Infrastructure Inspection and Monitoring (cited by: 67)
19
Authors: Billie F. Spencer Jr., Vedhus Hoskere, Yasutaka Narazaki 《Engineering》 SCIE EI 2019, No. 2, pp. 199-222 (24 pages)
Computer vision techniques, in conjunction with acquisition through remote cameras and unmanned aerial vehicles (UAVs), offer promising non-contact solutions to civil infrastructure condition assessment. The ultimate goal of such a system is to automatically and robustly convert the image or video data into actionable information. This paper provides an overview of recent advances in computer vision techniques as they apply to the problem of civil infrastructure condition assessment. In particular, relevant research in the fields of computer vision, machine learning, and structural engineering is presented. The work reviewed is classified into two types: inspection applications and monitoring applications. The inspection applications reviewed include identifying context such as structural components, characterizing local and global visible damage, and detecting changes from a reference image. The monitoring applications discussed include static measurement of strain and displacement, as well as dynamic measurement of displacement for modal analysis. Subsequently, some of the key challenges that persist toward the goal of automated vision-based civil infrastructure and monitoring are presented. The paper concludes with ongoing work aimed at addressing some of these stated challenges.
Keywords: structural inspection and monitoring; artificial intelligence; computer vision; machine learning; optical flow
Cry Recognition in the Home Domain Based on Vision Transformer and Transfer Learning
20
Authors: 王汝旭, 王荣燕, 曾科, 杨传德, 刘超 《智能计算机与应用》 2024, No. 6, pp. 119-126 (8 pages)
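To address the low accuracy of traditional machine learning algorithms such as SVM, and the poor generalization across different infants of current CNN-based cry recognition in the home domain, an infant cry audio classification algorithm based on Vision Transformer and transfer learning is proposed. First, to expand the dataset, data preprocessing techniques including Mel spectrogram conversion and data augmentation are applied, which in turn improves model robustness. Then, transfer learning is performed on a fine-tuned Vision Transformer model; during training, the LookAhead optimizer is used to continually adjust model parameters to avoid overfitting, and the experiment ultimately achieves automatic classification of infant cry audio. The results show that the model has higher precision and faster convergence than other deep learning models and can effectively learn more discriminative features of infant cries. It can play an important role in areas such as neonatal monitoring, hearing screening, and anomaly detection.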
Keywords: Vision Transformer model; infant cry; transfer learning; Mel spectrogram; LookAhead
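As background for the Mel spectrogram conversion mentioned in the abstract above, the sketch below shows the widely used HTK-style Hz-to-mel mapping and how band edges are spaced evenly on the mel scale. It is an assumption that the paper uses this particular formula; the function names, the 0 to 8000 Hz range, and the 40-band count are illustrative only.

```python
import math


def hz_to_mel(f_hz):
    """HTK-style mel scale: m = 2595 * log10(1 + f / 700)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_band_edges(f_min, f_max, n_bands):
    """Return n_bands + 2 frequencies (Hz) spaced evenly on the mel scale.

    Consecutive triples of these edges define the triangular filters
    of a mel filter bank.
    """
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_bands + 1)
    return [mel_to_hz(lo + i * step) for i in range(n_bands + 2)]


if __name__ == "__main__":
    edges = mel_band_edges(0.0, 8000.0, n_bands=40)  # hypothetical settings
    print([round(f, 1) for f in edges[:5]])
```

The resulting edges are dense at low frequencies and sparse at high ones, so a spectrogram pooled through these bands becomes a compact 2-D image that an image model such as a Vision Transformer can consume.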