Journal Articles
310 articles found
More Than Lightening: A Self-Supervised Low-Light Image Enhancement Method Capable for Multiple Degradations
1
Authors: Han Xu, Jiayi Ma, Yixuan Yuan, Hao Zhang, Xin Tian, Xiaojie Guo. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2024, Issue 3, pp. 622-637.
Low-light images suffer from low quality due to poor lighting conditions, noise pollution, and improper camera settings. To enhance low-light images, most existing methods rely on normal-light images for guidance, but collecting suitable normal-light images is difficult. In contrast, a self-supervised method breaks free from the reliance on normal-light data, offering more convenience and better generalization. Existing self-supervised methods primarily focus on illumination adjustment and design pixel-based adjustment methods, leaving remnants of other degradations, uneven brightness, and artifacts. In response, this paper proposes a self-supervised enhancement method, termed SLIE. It can handle multiple degradations, including illumination attenuation, noise pollution, and color shift, all in a self-supervised manner. Illumination attenuation is estimated based on physical principles and local neighborhood information. The removal and correction of noise and color shift are realized solely with noisy images and images with color shifts. The comprehensive, fully self-supervised approach achieves better adaptability and generalization: it is applicable to various low-light conditions and can reproduce the original colors of scenes in natural light. Extensive experiments on four public datasets demonstrate the superiority of SLIE over thirteen state-of-the-art methods. Our code is available at https://github.com/hanna-xu/SLIE.
Keywords: color correction, low-light image enhancement, self-supervised learning
Road Traffic Monitoring from Aerial Images Using Template Matching and Invariant Features (Cited: 1)
2
Authors: Asifa Mehmood Qureshi, Naif Al Mudawi, Mohammed Alonazi, Samia Allaoua Chelloug, Jeongmin Park. Computers, Materials & Continua (SCIE, EI), 2024, Issue 3, pp. 3683-3701.
Road traffic monitoring is an imperative topic widely discussed among researchers. Systems used to monitor traffic frequently rely on cameras mounted on bridges or roadsides. However, aerial images provide the flexibility to use mobile platforms to detect the location and motion of vehicles over a larger area. To this end, different models have shown the ability to recognize and track vehicles, but these methods are not mature enough to produce accurate results in complex road scenes. Therefore, this paper presents an algorithm that combines state-of-the-art techniques for identifying and tracking vehicles in conjunction with image bursts. The extracted frames were converted to grayscale, followed by the application of a georeferencing algorithm to embed coordinate information into the images. A masking technique eliminated irrelevant data and reduced the computational cost of the overall monitoring system. Next, Sobel edge detection combined with Canny edge detection and the Hough line transform was applied for noise reduction. After preprocessing, a blob detection algorithm detected the vehicles, with vehicles of varying sizes handled by a dynamic thresholding scheme. Detection was performed on the first image of every burst. To track vehicles, a model of each vehicle was matched against the succeeding images using a template matching algorithm. To further improve tracking accuracy by incorporating motion information, Scale Invariant Feature Transform (SIFT) features were used to select the best among multiple candidate matches. An accuracy of 87% for detection and 80% for tracking was achieved on the A1 Motorway Netherlands dataset. For the Vehicle Aerial Imaging from Drone (VAID) dataset, an accuracy of 86% for detection and 78% for tracking was achieved.
Keywords: Unmanned Aerial Vehicles (UAV), aerial images dataset, object detection, object tracking, data elimination, template matching, blob detection, SIFT, VAID
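The burst-tracking step above matches each vehicle template against succeeding frames. Purely as an illustration (not the paper's implementation), template matching by normalized cross-correlation over a grayscale image can be sketched in pure Python:

```python
def ncc(patch, template):
    # Normalized cross-correlation between two equal-size 2-D patches.
    n = len(template) * len(template[0])
    mp = sum(map(sum, patch)) / n
    mt = sum(map(sum, template)) / n
    num = den_p = den_t = 0.0
    for r in range(len(template)):
        for c in range(len(template[0])):
            dp, dt = patch[r][c] - mp, template[r][c] - mt
            num += dp * dt
            den_p += dp * dp
            den_t += dt * dt
    d = (den_p * den_t) ** 0.5
    return num / d if d else 0.0

def match_template(image, template):
    # Slide the template over the image; return the top-left of the best match.
    th, tw = len(template), len(template[0])
    best, best_pos = -2.0, (0, 0)
    for y in range(len(image) - th + 1):
        for x in range(len(image[0]) - tw + 1):
            patch = [row[x:x + tw] for row in image[y:y + th]]
            s = ncc(patch, template)
            if s > best:
                best, best_pos = s, (y, x)
    return best_pos
```

In practice a library routine (e.g. OpenCV's `matchTemplate`) would replace this exhaustive loop; the sketch only shows the scoring idea.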
COVID-19 Classification from X-Ray Images: An Approach to Implement Federated Learning on Decentralized Dataset (Cited: 1)
3
Authors: Ali Akbar Siddique, S. M. Umar Talha, M. Aamir, Abeer D. Algarni, Naglaa F. Soliman, Walid El-Shafai. Computers, Materials & Continua (SCIE, EI), 2023, Issue 5, pp. 3883-3901.
The COVID-19 pandemic has devastated our daily lives, leaving horrific repercussions in its aftermath. Due to its rapid spread, it was quite difficult for medical personnel to diagnose it at such scale. Patients who test positive for COVID-19 are diagnosed via a nasal PCR test, but polymerase chain reaction (PCR) findings take a few hours to a few days. The PCR test is expensive, although the government may bear the expense in certain places, and subsets of the population resist invasive testing like swabs. Therefore, chest X-rays or Computed Tomography (CT) scans are preferred in most cases; more importantly, they are non-invasive, inexpensive, and provide a faster response time. Recent advances in Artificial Intelligence (AI), in combination with state-of-the-art methods, have allowed for the diagnosis of COVID-19 from chest X-rays. This article proposes a method for classifying COVID-19 as positive or negative on a decentralized dataset based on a federated learning scheme. To build a progressive global COVID-19 classification model, two edge devices train the model on their respective localized datasets, using a 3-layered custom Convolutional Neural Network (CNN) model that can be deployed from the server. The two edge devices then communicate their learned parameters and weights to the server, which aggregates them and updates the global model. The proposed model is trained using an image dataset available on Kaggle: of the more than 13,000 X-ray images in the collection, 9,000 Normal and COVID-19-positive images are used. Each edge node possesses a different number of images; edge node 1 has 3,200 images, while edge node 2 has 5,800. There is no association between the datasets of the various nodes in the network, so each node has access to a separate image collection with no correlation to the others. The diagnosis of COVID-19 has become considerably more efficient with the proposed algorithm and dataset, and the findings obtained are quite encouraging.
Keywords: artificial intelligence, deep learning, federated learning, COVID-19, decentralized image dataset
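The server-side step described here, two edge nodes sending learned weights to be aggregated into a global model, is commonly realized as federated averaging. A minimal sketch, assuming flat weight vectors and using the abstract's node sizes for weighting (the paper's exact aggregation rule is not specified):

```python
def fed_avg(node_weights, node_sizes):
    """Aggregate per-node model weights, weighted by local dataset size.

    node_weights: one flat weight vector per edge node.
    node_sizes: local image counts, e.g. [3200, 5800] as in the abstract.
    """
    total = sum(node_sizes)
    dim = len(node_weights[0])
    global_w = [0.0] * dim
    for w, n in zip(node_weights, node_sizes):
        for i in range(dim):
            global_w[i] += w[i] * n / total
    return global_w
```

Each round, the server would redistribute `global_w` to the nodes for the next local training pass.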
RF-Net: Unsupervised Low-Light Image Enhancement Based on Retinex and Exposure Fusion
4
Authors: Tian Ma, Chenhui Fu, Jiayi Yang, Jiehui Zhang, Chuyang Shang. Computers, Materials & Continua (SCIE, EI), 2023, Issue 10, pp. 1103-1122.
Low-light image enhancement methods have limitations in addressing issues such as color distortion, lack of vibrancy, and uneven light distribution, and they often require paired training data. To address these issues, we propose a two-stage unsupervised low-light image enhancement algorithm called the Retinex and Exposure Fusion Network (RFNet), which can overcome the over-enhancement of high dynamic ranges and under-enhancement of low dynamic ranges seen in existing enhancement algorithms. By training with unpaired low-light and regular-light images, the algorithm better manages the challenges brought about by complex real-world environments. In the first stage, we design a multi-scale feature extraction module based on Retinex theory, capable of extracting details and structural information at different scales to generate high-quality illumination and reflection images. In the second stage, an exposure image generator is designed from the camera response function to acquire exposure images containing more dark features, and the generated images are fused with the original inputs to complete the low-light image enhancement. Experiments show the effectiveness and rationality of each module designed in this paper. The method reconstructs the details of contrast and color distribution, outperforms current state-of-the-art methods in both qualitative and quantitative metrics, and shows excellent performance in the real world.
Keywords: low-light image enhancement, multiscale feature extraction module, exposure generator, exposure fusion
Dataset of Large Gathering Images for Person Identification and Tracking
5
Authors: Adnan Nadeem, Amir Mehmood, Kashif Rizwan, Muhammad Ashraf, Nauman Qadeer, Ali Alzahrani, Qammer H. Abbasi, Fazal Noor, Majed Alhaisoni, Nadeem Mahmood. Computers, Materials & Continua (SCIE, EI), 2023, Issue 3, pp. 6065-6080.
This paper presents a large-gathering dataset of images extracted from publicly filmed videos by 24 cameras installed on the premises of Masjid Al-Nabvi, Madinah, Saudi Arabia. The dataset consists of raw and processed images reflecting a highly challenging and unconstrained environment. The methodology for building the dataset consists of four core phases: acquisition of videos, extraction of frames, localization of face regions, and cropping and resizing of detected face regions. The raw images consist of a total of 4,613 frames obtained from video sequences. The processed images consist of the face regions of 250 persons extracted from the raw images to ensure the authenticity of the presented data; the dataset further contains 8 images for each of the 250 subjects, for a total of 2,000 images. It portrays a highly unconstrained and challenging environment with human faces of varying sizes and pixel quality (resolution). Since the face regions in the video sequences are severely degraded by various unavoidable factors, the dataset can serve as a benchmark for testing and evaluating face detection and recognition algorithms. We have also gathered and displayed records of the presence of subjects appearing in the frames in a temporal context, which can serve as a temporal benchmark for tracking, finding persons, activity monitoring, and crowd counting in large-crowd scenarios.
Keywords: large crowd gatherings, dataset of large crowd images, highly uncontrolled environment, tracking missing persons, face recognition, activity monitoring
VLCA: vision-language aligning model with cross-modal attention for bilingual remote sensing image captioning (Cited: 1)
6
Authors: WEI Tingting, YUAN Weilin, LUO Junren, ZHANG Wanpeng, LU Lina. Journal of Systems Engineering and Electronics (SCIE, EI, CSCD), 2023, Issue 1, pp. 9-18.
In the field of satellite imagery, remote sensing image captioning (RSIC) is a hot topic facing the challenges of overfitting and of aligning image and text. To address these issues, this paper proposes a vision-language aligning paradigm for RSIC that jointly represents vision and language. First, a new RSIC dataset, DIOR-Captions, is built by augmenting the object DetectIon in Optical Remote sensing images (DIOR) dataset with manually annotated Chinese and English contents. Second, a Vision-Language aligning model with Cross-modal Attention (VLCA) is presented to generate accurate and abundant bilingual descriptions for remote sensing images. Third, a cross-modal learning network is introduced to address the problem of visual-lingual alignment. Notably, VLCA is also applied to end-to-end Chinese caption generation using a Chinese pre-trained language model. Experiments with various baselines validate VLCA on the proposed dataset. The results demonstrate that the proposed algorithm produces more descriptive and informative captions than existing algorithms.
Keywords: remote sensing image captioning (RSIC), vision-language representation, remote sensing image caption dataset, attention mechanism
Data Augmentation Using Contour Image for Convolutional Neural Network
7
Authors: Seung-Yeon Hwang, Jeong-Joon Kim. Computers, Materials & Continua (SCIE, EI), 2023, Issue 6, pp. 4669-4680.
With the development of artificial-intelligence technologies such as deep learning, various organizations, including governments, are making efforts to generate and manage big data for use in artificial intelligence. However, big data is difficult to acquire due to various social problems and restrictions such as personal information leakage, and many fields lack the training data necessary to apply deep learning technology. Therefore, this study proposes the mixed contour data augmentation technique, a data augmentation technique using contour images, to solve the problem caused by a lack of data. ResNet, a well-known convolutional neural network (CNN) architecture, and CIFAR-10, a benchmark dataset, are used for experimental evaluation of the proposed method. To show that high performance can be achieved even with a small training dataset, the training-set ratio was set to 70%, 50%, and 30% for comparative analysis. Applying the mixed contour data augmentation technique yielded a classification accuracy improvement of up to 4.64% and high accuracy even with a small dataset. The results on benchmark datasets suggest that the technique can be applied in various fields.
Keywords: data augmentation, image classification, deep learning, convolutional neural network, mixed contour image, benchmark dataset
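A mixed contour sample can be illustrated by blending an image with an edge-magnitude "contour image". This sketch uses a Sobel operator and a simple linear blend; the paper's exact contour extraction and mixing rule may differ:

```python
def contour_image(img):
    # Sobel gradient magnitude as a crude contour image (interior pixels only).
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y-1][x+1] + 2*img[y][x+1] + img[y+1][x+1]
                  - img[y-1][x-1] - 2*img[y][x-1] - img[y+1][x-1])
            gy = (img[y+1][x-1] + 2*img[y+1][x] + img[y+1][x+1]
                  - img[y-1][x-1] - 2*img[y-1][x] - img[y-1][x+1])
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out

def mix_contour(img, alpha=0.5):
    # Blend the original image with its contour image to form a new sample.
    c = contour_image(img)
    return [[alpha * img[y][x] + (1 - alpha) * c[y][x]
             for x in range(len(img[0]))] for y in range(len(img))]
```

The blended sample keeps the original appearance while emphasizing object outlines, which is the intuition behind augmenting with contour information.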
Deep Learning-Based Digital Image Forgery Detection Using Transfer Learning
8
Authors: Emad Ul Haq Qazi, Tanveer Zia, Muhammad Imran, Muhammad Hamza Faheem. Intelligent Automation & Soft Computing, 2023, Issue 12, pp. 225-240.
Deep learning is considered one of the most efficient and reliable methods for verifying the legitimacy of a digital image. In the current cyber world, where deepfakes have shaken the global community, confirming the legitimacy of a digital image is of great importance. With the advancements made in deep learning techniques, we can now efficiently train and develop state-of-the-art digital image forensic models. The method most widely used by researchers is the convolutional neural network (CNN) for verification of image authenticity, but it consumes a considerable amount of resources and requires a large dataset for training. Therefore, this study proposes a transfer-learning-based deep learning technique for image forgery detection. The proposed methodology consists of three modules: a preprocessing module, a convolutional module, and a classification module. By utilizing pre-trained weights, the training time is drastically reduced. Performance is evaluated on the benchmark BOW and BOSSBase datasets, detecting five forensic types: JPEG compression, contrast enhancement (CE), median filtering (MF), additive Gaussian noise, and resampling. Across various experiments and case scenarios, the proposed technique achieved an accuracy of 99.92%, showing the superiority of the proposed system.
Keywords: image forgery, transfer learning, deep learning, BOW dataset, BOSSBase dataset
A Method of Generating Semi-Experimental Biomedical Datasets
9
Authors: Jing Wang, Naike Du, Zi He, Xiuzhu Ye. Journal of Beijing Institute of Technology (EI, CAS), 2024, Issue 3, pp. 219-226.
This paper proposes a method for generating semi-experimental biomedical datasets based on full-wave simulation software. System noise such as antenna port coupling is fully considered in the proposed datasets, making them more realistic than synthetic datasets. Datasets containing different shapes are constructed based on the relative permittivities of human tissues. A back-propagation scheme is then used to obtain rough reconstructions, which are fed into a U-net convolutional neural network (CNN) to recover high-resolution images. Numerical results show that a network trained on datasets generated by the proposed method obtains satisfying reconstruction results and is promising for real-time biomedical imaging.
Keywords: electromagnetic imaging, dataset, biomedical imaging
Implementation of Image Data Publishing Based on Mosaic Dataset and Image Service (Cited: 1)
10
Author: Chen Rui. Ningxia Engineering Technology (CAS), 2018, Issue 4, pp. 314-316.
Building on massive remote sensing image holdings, this work uses ArcGIS mosaic datasets and raster map services to manage large-scale image data quickly and efficiently and to publish raster map services for massive imagery, effectively improving image utilization and providing a reference for departments that manage massive remote sensing image data.
Keywords: mosaic dataset, raster map service, massive remote sensing imagery, ArcGIS, image management
Image Augmentation-Based Food Recognition with Convolutional Neural Networks (Cited: 6)
11
Authors: Lili Pan, Jiaohua Qin, Hao Chen, Xuyu Xiang, Cong Li, Ran Chen. Computers, Materials & Continua (SCIE, EI), 2019, Issue 4, pp. 297-313.
Image retrieval for food ingredients is important but tremendously tiring, uninteresting, and expensive work. Computer vision systems have made extraordinary advances in image retrieval with CNNs, but applying convolutional neural networks directly is not feasible for small food datasets. In this study, a novel image retrieval approach is presented for small and medium-scale food datasets, which both augments images using image transformation techniques to enlarge the datasets and improves the average accuracy of food recognition with state-of-the-art deep learning technologies. First, typical image transformation techniques are used to augment the food images. Then transfer learning based on deep learning is applied to extract image features. Finally, a food recognition algorithm is applied to the extracted deep-feature vectors. The presented image-retrieval architecture is analyzed on a small-scale food dataset composed of forty-one categories of food ingredients with one hundred pictures per category. Extensive experimental results demonstrate the advantages of the image-augmentation architecture for small and medium datasets using deep learning. The approach combines image augmentation, ResNet feature vectors, and SMO classification, and comprehensive experiments show its superiority for food detection on small and medium-scale datasets.
Keywords: image augmentation, small-scale dataset, deep feature, deep learning, convolutional neural network
Deep Neural Network with Strip Pooling for Image Classification of Yarn-Dyed Plaid Fabrics (Cited: 1)
12
Authors: Xiaoting Zhang, Weidong Gao, Ruru Pan. Computer Modeling in Engineering & Sciences (SCIE, EI), 2022, Issue 3, pp. 1533-1546.
Historically, yarn-dyed plaid fabrics (YDPFs) have enjoyed enduring popularity with many rich plaid patterns, but production data are still classified and searched only by production parameters. That process does not satisfy the visual needs of sample-order production, fabric design, and stock management. This study produced an image dataset for YDPFs, collected from 10,661 fabric samples, which the authors believe will have significant utility in further YDPF research. Convolutional neural networks such as VGG, ResNet, and DenseNet, with different hyperparameter groups, seemed the most promising tools for the study. This paper reports the authors' exhaustive evaluation of the YDPF dataset. With an overall accuracy of 88.78%, CNNs proved effective in YDPF image classification, even given the low accuracy on Windowpane fabrics, which are often mistaken for the Prince of Wales pattern. Classification of traditional patterns is further improved by utilizing the strip pooling model to extract local detail features in the horizontal and vertical directions. The strip pooling model characterizes the horizontal and vertical crisscross patterns of YDPFs with considerable success: the proposed method using the strip pooling model (SPM) improves classification performance on the YDPF dataset by 2.64% for ResNet18, 3.66% for VGG16, and 3.54% for DenseNet121. The results reveal that the SPM significantly improves YDPF classification accuracy and reduces the error rate on Windowpane patterns as well.
Keywords: yarn-dyed plaid fabric, image classification, image dataset, deep neural network, strip pooling model
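Strip pooling, as used above, summarizes a feature map along horizontal and vertical strips. A minimal sketch of the pooling step itself (the full SPM also includes 1-D convolutions and feature fusion, omitted here):

```python
def strip_pool(fmap):
    """Strip pooling over a 2-D feature map.

    Returns (row_strip, col_strip): an H-length vector averaging each row
    (horizontal strips) and a W-length vector averaging each column
    (vertical strips), which capture crisscross plaid structure.
    """
    h, w = len(fmap), len(fmap[0])
    row_strip = [sum(row) / w for row in fmap]
    col_strip = [sum(fmap[y][x] for y in range(h)) / h for x in range(w)]
    return row_strip, col_strip
```

In the full model these strip descriptors would be expanded back to H×W and fused with the input feature map.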
Robust and High Accuracy Algorithm for Detection of Pupil Images (Cited: 1)
13
Authors: Waleed El Nahal, Hatim G. Zaini, Raghad H. Zaini, Sherif S. M. Ghoneim, Ashraf Mohamed Ali Hassan. Computers, Materials & Continua (SCIE, EI), 2022, Issue 10, pp. 33-50.
Recently, many researchers have tried to develop a robust, fast, and accurate algorithm for eye tracking and pupil-position detection in applications such as head-mounted eye tracking, gaze-based human-computer interaction, medical applications (such as for deaf and diabetic patients), and attention analysis. Many real-world conditions challenge the eye's appearance, such as illumination, reflections, and occlusions, as do individual differences in eye physiology and other sources of noise such as contact lenses or make-up. The present work introduces a pupil detection algorithm that is more robust and accurate than previous attempts for real-time analytics applications. The proposed circular Hough transform with morphing Canny edge detection for Pupillometry (CHMCEP) algorithm can handle even blurred or noisy images: filtering methods in the preprocessing stage remove blur and noise, and a second filtering step before the circular Hough transform center fitting ensures better accuracy. The performance of the proposed CHMCEP algorithm was tested against recent pupil detection methods. Simulations show that CHMCEP achieved detection rates of 87.11%, 78.54%, 58%, and 78% on the Świrski, ExCuSe, ElSe, and Labeled Pupils in the Wild (LPW) datasets, respectively. These results show that the proposed approach outperforms other pupil detection methods by a large margin, providing exact and robust pupil positions on challenging ordinary eye images.
Keywords: pupil detection, eye tracking, pupil edge, morphing techniques, eye images dataset
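The center-fitting stage of a circular Hough transform can be illustrated with a simplified accumulator that votes for circle centers of a known radius from detected edge points. The fixed radius and all names below are simplifying assumptions for illustration, not the CHMCEP algorithm itself:

```python
import math

def hough_circle_center(edge_points, radius, shape, steps=72):
    # Each edge point votes along a circle of the given radius around itself;
    # the accumulator cell with the most votes is taken as the pupil center.
    h, w = shape
    acc = {}
    for (y, x) in edge_points:
        for k in range(steps):
            t = 2.0 * math.pi * k / steps
            cy = round(y - radius * math.sin(t))
            cx = round(x - radius * math.cos(t))
            if 0 <= cy < h and 0 <= cx < w:
                acc[(cy, cx)] = acc.get((cy, cx), 0) + 1
    return max(acc, key=acc.get)
```

A full implementation would also search over radii and use sub-pixel refinement, as library routines such as OpenCV's `HoughCircles` do.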
A Deep Learning Hierarchical Ensemble for Remote Sensing Image Classification
14
Authors: Seung-Yeon Hwang, Jeong-Joon Kim. Computers, Materials & Continua (SCIE, EI), 2022, Issue 8, pp. 2649-2663.
Artificial intelligence, which has emerged with the rapid development of information technology, is drawing attention as a tool for solving various problems demanded by society and industry. In particular, convolutional neural networks (CNNs), a type of deep learning technology, are highlighted in computer vision fields such as image classification, recognition, and object tracking. Training CNN models requires a large amount of data, and a lack of data can lead to performance degradation due to overfitting. As CNN architecture development and optimization studies have become active, ensemble techniques have emerged that perform image classification by combining features extracted from multiple CNN models. In this study, data augmentation and contour image extraction were performed to overcome the data shortage problem. In addition, we propose a hierarchical ensemble technique that achieves high image classification accuracy even when trained on a small amount of data. First, we trained the UC-Merced land use dataset, together with the contour image of each image, on pretrained VGGNet, GoogLeNet, ResNet, DenseNet, and EfficientNet. We then applied the hierarchical ensemble technique across the possible combinations of the trained models. Experiments with training-set proportions of 30%, 50%, and 70% showed a performance improvement of up to 4.68% over the average accuracy of the individual models.
Keywords: image classification, deep learning, CNNs, hierarchical ensemble, UC-Merced land use dataset, contour image
Wheat Ear Counting in UAV Wheat Images Based on FE-P2Pnet
15
Authors: Bao Wenxia, Su Biaobiao, Hu Gensheng, Huang Chengpei, Liang Dong. Transactions of the Chinese Society for Agricultural Machinery (EI, CAS, CSCD, PKU Core), 2024, Issue 4, pp. 155-164, 289.
To address the complex backgrounds, dense wheat, small ear targets, and varying ear sizes in UAV images, an automatic wheat-ear counting method based on FE-P2Pnet (Feature enhance-point to point) is proposed. Brightness and contrast enhancement are applied to the UAV images to enlarge the difference between ear targets and the background and to reduce the influence of leaves, stalks, and other complex background factors. P2Pnet, a point-annotation-based network, is introduced as the baseline to handle densely packed ears. To address the limited feature information of small ear targets, a Triplet module is added to P2Pnet's VGG16 backbone so that information from the C (channel), H (height), and W (width) dimensions interacts, allowing the backbone to extract more target-related features. For varying ear sizes, FEM (Feature enhancement module) and SE (Squeeze excitation) modules are added to the FPN (Feature pyramid networks) so that it can better process feature information and fuse multi-scale information. For better target classification, the cross-entropy loss is replaced with Focal Loss, which weights the features of background and targets differently and further highlights the target features. Experimental results show that on the UAV wheat image dataset constructed in this paper (Wheat-ZWF), the mean absolute error (MAE), mean squared error (MSE), and average accuracy (ACC) of ear counting reach 3.77, 5.13, and 90.87%, respectively, outperforming other target-counting regression methods such as MCNN (Multi-column convolutional neural network), CSRnet (Congested scene recognition network), and WHCNETs (Wheat head counting networks). Compared with the baseline P2Pnet, MAE and MSE decrease by 23.2% and 16.6%, respectively, and ACC improves by 2.67 percentage points. To further verify the effectiveness of the algorithm, experiments on four other wheat varieties (AK1009, AK1401, AK1706, and YKM222) give an average MAE of 5.10, an MSE of 6.17, and an ACC of 89.69%, indicating that the proposed model generalizes well.
Keywords: wheat ear counting, UAV image, FE-P2Pnet, FEM, Wheat-ZWF dataset
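The Focal Loss mentioned above down-weights easy examples so that the dense background does not swamp the small ear targets. A scalar binary form, using common default alpha/gamma values rather than the paper's settings:

```python
import math

def focal_loss(p, target, alpha=0.25, gamma=2.0):
    """Binary focal loss for one predicted probability p and target in {0, 1}.

    The (1 - pt)^gamma factor shrinks the loss of well-classified examples,
    so hard, rare targets dominate the gradient. alpha=0.25 and gamma=2.0
    are common defaults, not necessarily the paper's choices.
    """
    pt = p if target == 1 else 1.0 - p
    a = alpha if target == 1 else 1.0 - alpha
    return -a * (1.0 - pt) ** gamma * math.log(max(pt, 1e-12))
```

With gamma = 0 and alpha = 0.5 this reduces (up to a constant) to the cross-entropy loss it replaces.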
Food Image Recognition with an Improved Attention Model
16
Authors: Jiang Feng, Zhou Lili. Computer Engineering and Applications (CSCD, PKU Core), 2024, Issue 12, pp. 153-159.
With the growing demand for healthy diets, various dietary-assessment applications have emerged, and food image recognition is attracting increasing attention. Food image recognition is a fine-grained image recognition problem and is harder than other image recognition tasks. Mainstream food image datasets such as ISIA Food-500, ETH Food-101, and Vireo Food-172 contain relatively few images, which makes it difficult to train recognition systems well and further increases the difficulty. This paper proposes an attention-based recognition method that introduces local attention on top of self-attention to describe fine-grained image features and improve recognition accuracy. A self-supervised pre-training algorithm is also proposed to alleviate the shortage of food training images. Experimental results show that the method reaches 65.58% Top-1 and 90.03% Top-5 accuracy on the ISIA Food-500 dataset, outperforming existing algorithms.
Keywords: food image, fine-grained image recognition, local attention, self-supervised pre-training, ISIA Food-500 dataset
Image Classification Combining Hourglass Attention with a Progressive Hybrid Transformer
17
Authors: Peng Yanfei, Cui Yun, Chen Kun, Li Yongxin. Chinese Journal of Liquid Crystals and Displays (CAS, CSCD, PKU Core), 2024, Issue 9, pp. 1223-1232.
Transformers are widely used for image classification, but on small-dataset classification tasks the limited data and large parameter counts lead to low accuracy and slow convergence. This paper proposes a progressive hybrid Transformer that integrates hourglass attention. First, down-up sampling hourglass self-attention models global feature relations, with upsampling compensating for the information lost by downsampling; a learnable temperature parameter and a negative diagonal mask sharpen the attention score distribution and avoid the over-smoothing caused by deep stacks of layers. Second, a progressive downsampling module produces fine-grained multi-scale feature maps and effectively captures low-dimensional feature information. Finally, a hybrid architecture applies the designed hourglass attention at the top stages, replaces attention modules with pooling layers at the bottom stages, and introduces layer normalization with depthwise convolution to increase network locality. On the T-ImageNet, CIFAR10, CIFAR100, and SVHN datasets, the method reaches a classification accuracy of 97.42% with 3.41G FLOPs and 25M parameters. Compared with baseline algorithms, classification accuracy improves markedly while computation and parameter counts drop, improving Transformer performance on small datasets.
Keywords: small-dataset image classification, Transformer, hourglass attention, multi-scale features, hybrid architecture
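The "temperature plus negative diagonal mask" idea for sharpening attention scores can be illustrated row by row. This is a schematic with a fixed temperature and a hard diagonal mask, not the paper's exact learnable formulation:

```python
import math

def sharpened_attention(scores, temperature=0.5, mask_diag=True):
    # Row-wise softmax over attention scores. A temperature below 1 sharpens
    # the distribution; masking the diagonal with -inf keeps each token from
    # attending mostly to itself (schematic of the negative diagonal mask).
    out = []
    for i, row in enumerate(scores):
        logits = [s / temperature for s in row]
        if mask_diag:
            logits[i] = float("-inf")
        m = max(logits)
        exps = [math.exp(l - m) for l in logits]
        z = sum(exps)
        out.append([e / z for e in exps])
    return out
```

In the paper the temperature is a learnable parameter; here it is fixed only to keep the sketch self-contained.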
Tampering Localization in Recaptured Document Images Based on Text-Edge Distortion Features
18
Authors: Chen Changsheng, Chen Ziwei, Li Xijin. China Sciencepaper (CAS), 2024, Issue 2, pp. 160-168, 199.
To localize tampering in recaptured document images, a method based on text-edge distortion features is proposed. Text distortion features are constructed from three aspects: the text-edge distribution, the edge gradient, and the difference in edge gradient between the text under test and reference text; a deep-neural-network-based classifier is then trained to make the decision. To evaluate detection performance, a dataset containing 120 genuine images and 1,200 recaptured-and-tampered document images was built. Experimental results show that in the cross-database scenario the proposed method achieves a word-level area under the ROC curve (AUC) of 0.84 and an equal error rate (EER) of 0.23. Compared with Forensic Similarity (128×128) and DenseFCN, the proposed features combined with LightDenseNet improve word-level AUC by 0.06 and 0.17, respectively, under the cross-database protocol on the recaptured-tampered document dataset.
Keywords: document image, recapture attack, tampering localization, text-edge distortion, recaptured-tampered document database
Ultra-Short-Term Photovoltaic Power Forecasting Based on Dataset Distillation (Cited: 1)
19
Authors: Zheng Ke, Wang Lijie, Hao Ying, Wang Bo. Proceedings of the CSEE (EI, CSCD, PKU Core), 2024, Issue 13, pp. 5196-5207, I0015.
Clouds are the main driver of variation in direct solar radiation; because different cloud types transmit light differently, the solar radiation reaching a photovoltaic (PV) plant fluctuates accordingly. To handle the large power fluctuations under different cloud types without maintaining many separate prediction models, an ultra-short-term PV power forecasting model based on satellite cloud images and dataset distillation is proposed. First, future cloud images over the target plant are predicted from historical images using the Farneback optical flow method. Then a sample library for each cloud type is built from labeled satellite cloud-classification data, and a dataset distillation algorithm is trained on the library to obtain cloud-type discrimination images; matching the predicted cloud images against these discrimination images yields cloud-type aggregate matching features. Finally, these features, cloud-amount features, and numerical weather prediction data feed a long short-term memory (LSTM) network for ultra-short-term PV power forecasting. Validation on data from a PV plant shows that the proposed model accurately characterizes cloud features and effectively improves PV power forecasting accuracy.
Keywords: dataset distillation, satellite cloud image, cloud classification, optical flow method, ultra-short-term photovoltaic power forecasting
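The motion-extrapolation idea behind the cloud-image prediction step (the paper uses the Farneback optical flow method) can be illustrated with a much simpler global shift estimator plus advection. Everything below is a toy stand-in, not Farneback flow:

```python
def estimate_shift(prev, curr, max_d=2):
    # Exhaustive search for the integer (dy, dx) that best maps prev onto curr,
    # by mean squared error over the overlapping region.
    h, w = len(prev), len(prev[0])
    best, best_err = (0, 0), float("inf")
    for dy in range(-max_d, max_d + 1):
        for dx in range(-max_d, max_d + 1):
            err = n = 0
            for y in range(h):
                for x in range(w):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        err += (prev[y][x] - curr[yy][xx]) ** 2
                        n += 1
            if err / n < best_err:
                best_err, best = err / n, (dy, dx)
    return best

def predict_next(curr, shift):
    # Advect the current frame by the estimated motion (zero-fill at borders),
    # mimicking "move the clouds one more step along their flow".
    dy, dx = shift
    h, w = len(curr), len(curr[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            yy, xx = y + dy, x + dx
            if 0 <= yy < h and 0 <= xx < w:
                out[yy][xx] = curr[y][x]
    return out
```

Farneback flow instead fits local polynomial expansions to obtain a dense per-pixel motion field; the global-shift version above only conveys the estimate-then-advect structure.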
De-DDPM: A Controllable and Transferable Defect Image Generation Method
20
Authors: Yue Zhongmu, Zhang Zhe, Lyu Wu, Zhao Ruixiang, Ma Jie. Acta Automatica Sinica (EI, CAS, CSCD, PKU Core), 2024, Issue 8, pp. 1539-1549.
Deep-learning-based surface defect detection is an important industrial application, and the quality of the defect image dataset strongly affects detection performance. To address the high cost and scarcity of defect samples in real industrial production, a defect image generation method based on the denoising diffusion probabilistic model (DDPM) is proposed. During training, the model strengthens differentiated learning of defect regions versus defect-free backgrounds. During generation, a defect control module precisely controls the category, shape, and saliency of the generated defects, and a background fusion module transfers defects onto different defect-free backgrounds, greatly reducing the difficulty of obtaining defect samples on new backgrounds. Experiments verify the model's defect control and defect transfer capabilities; its outputs effectively enlarge training datasets and improve the accuracy of downstream defect detection tasks.
Keywords: data augmentation, dataset expansion, defect image generation, deep learning
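The DDPM forward process underlying De-DDPM gradually adds Gaussian noise to a clean image. A scalar sketch of sampling x_t from q(x_t | x_0) in closed form, with a common linear beta schedule (not necessarily De-DDPM's):

```python
import math, random

# Linear beta schedule over 1000 steps (a common default, assumed here).
BETAS = [1e-4 + (0.02 - 1e-4) * i / 999 for i in range(1000)]

def ddpm_forward(x0, t, betas, rng):
    # Closed-form sample of x_t ~ q(x_t | x_0) for a scalar pixel value x0:
    #   x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps
    # where alpha_bar_t is the running product of (1 - beta).
    alpha_bar = 1.0
    for b in betas[:t]:
        alpha_bar *= 1.0 - b
    eps = rng.gauss(0.0, 1.0)
    x_t = math.sqrt(alpha_bar) * x0 + math.sqrt(1.0 - alpha_bar) * eps
    return x_t, alpha_bar
```

By the final step alpha_bar is nearly zero, so x_t is almost pure noise; the reverse (denoising) network learned by a DDPM inverts this process, which is where De-DDPM adds its defect-control and background-fusion conditioning.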