Journal Articles
19 articles found
1. Human-pose estimation based on weak supervision
Authors: Xiaoyan HU, Xizhao BAO, Guoli WEI, Zhaoyu LI. Virtual Reality & Intelligent Hardware (EI), 2023, Issue 4, pp. 366-377.
Background: In computer vision, simultaneously estimating human pose, shape, and clothing is a practical issue in real life, but remains a challenging task owing to the variety of clothing, the complexity of deformation, the shortage of large-scale datasets, and the difficulty of estimating clothing style. Methods: We propose a multistage weakly supervised method that makes full use of data with less labeled information to learn to estimate human body shape, pose, and clothing deformation. In the first stage, the SMPL human-body model parameters are regressed using multi-view 2D key points of the human body. Using multi-view information as weak supervision avoids the depth-ambiguity problem of a single view, yields a more accurate human posture, and makes the supervisory information easy to obtain. In the second stage, clothing is represented by a PCA-based model that uses two-dimensional key points of clothing as supervision to regress the parameters. In the third stage, we predefine an embedding graph for each type of clothing to describe its deformation; the mask information of the clothing is then used to further adjust the deformation. To facilitate training, we constructed a multi-view synthetic dataset that includes BCNet and SURREAL. Results: Experiments show that the accuracy of our method reaches the same level as that of SOTA methods using strong supervision, while using only weakly supervised information. Because this study uses only weakly supervised information, which is much easier to obtain, it has the advantage of utilizing existing data as training data. Experiments on the DeepFashion2 dataset show that our method can make full use of existing weak supervision for fine-tuning on a dataset with little supervision information, whereas strongly supervised methods cannot be trained or adjusted owing to the lack of exact annotations. Conclusions: Our weakly supervised method can accurately estimate human body size, pose, and several common types of clothing, and overcomes the current shortage of clothing data.
Keywords: Human pose estimation; Clothing estimation; weak supervision
2. Local saliency consistency-based label inference for weakly supervised salient object detection using scribble annotations
Authors: Shuo Zhao, Peng Cui, Jing Shen, Haibo Liu. CAAI Transactions on Intelligence Technology (SCIE, EI), 2024, Issue 1, pp. 239-249.
Recently, weak supervision has received growing attention in the field of salient object detection due to the convenience of labelling. However, there is a large performance gap between weakly supervised and fully supervised salient object detectors because scribble annotations can only provide very limited foreground/background information. Therefore, an intuitive idea is to infer annotations that cover more complete object and background regions for training. To this end, a label inference strategy is proposed based on the assumption that pixels with similar colours and close positions should have consistent labels. Specifically, the k-means clustering algorithm is first performed on both the colours and the coordinates of the original annotations; the same labels are then assigned to points whose colours are similar to a colour cluster centre and whose positions are near a coordinate cluster centre. Finally, the same annotations are further assigned to pixels with similar colours within each kernel neighbourhood. Extensive experiments on six benchmarks demonstrate that our method can significantly improve performance and achieve state-of-the-art results.
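A minimal sketch of the colour/coordinate label-inference idea described above, assuming an HxW scribble map with 0 = unlabeled, 1 = foreground, 2 = background; the function name, feature scaling, and distance thresholds are illustrative choices, not the authors' exact settings.

```python
import numpy as np
from sklearn.cluster import KMeans

def infer_labels(image, scribble, n_clusters=8, colour_thr=0.1, pos_thr=0.1):
    """image: HxWx3 floats in [0, 1]; scribble: HxW ints in {0: unknown, 1: fg, 2: bg}."""
    h, w, _ = image.shape
    ys, xs = np.nonzero(scribble)                               # annotated pixels
    feats = np.concatenate([image[ys, xs],                      # colour features
                            np.stack([ys / h, xs / w], axis=1)], axis=1)
    km = KMeans(n_clusters=n_clusters, n_init=10).fit(feats)
    # Majority foreground/background label of annotated pixels in each cluster.
    cluster_label = np.array([
        np.bincount(scribble[ys, xs][km.labels_ == c], minlength=3)[1:].argmax() + 1
        for c in range(n_clusters)])
    # Propagate: unknown pixels close to a cluster centre in both colour and
    # position inherit that cluster's majority label.
    inferred = scribble.copy()
    yy, xx = np.mgrid[0:h, 0:w]
    all_feats = np.concatenate([image.reshape(-1, 3),
                                np.stack([yy.ravel() / h, xx.ravel() / w], axis=1)], axis=1)
    centres = km.cluster_centers_
    for c in range(n_clusters):
        d_col = np.linalg.norm(all_feats[:, :3] - centres[c, :3], axis=1)
        d_pos = np.linalg.norm(all_feats[:, 3:] - centres[c, 3:], axis=1)
        mask = (d_col < colour_thr) & (d_pos < pos_thr) & (inferred.ravel() == 0)
        inferred.ravel()[mask] = cluster_label[c]
    return inferred
```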
Keywords: label inference; salient object detection; weak supervision
3. Weakly Supervised Network with Scribble-Supervised and Edge-Mask for Road Extraction from High-Resolution Remote Sensing Images
Authors: Supeng Yu, Fen Huang, Chengcheng Fan. Computers, Materials & Continua (SCIE, EI), 2024, Issue 4, pp. 549-562.
Significant advancements have been achieved in road surface extraction based on high-resolution remote sensing image processing. Most current methods rely on fully supervised learning, which necessitates enormous human effort to label the images. Within this field, other research endeavors utilize weakly supervised methods, which aim to reduce annotation expense by leveraging sparsely annotated data such as scribbles. This paper presents a novel technique, a weakly supervised network with scribble supervision and edge masks (WSSE-net). The network is a three-branch architecture, whereby each branch is equipped with a distinct decoder module dedicated to road extraction. One branch is dedicated to generating edge masks using edge detection algorithms and optimizing road edge details. The other two branches supervise the model's training by employing scribble labels and spreading scribble information throughout the image. To address the historical flaw that pseudo-labels are not updated as the network trains, we use mixup to blend prediction results dynamically and continually update new pseudo-labels to steer network training. Our solution operates efficiently by simultaneously considering both edge-mask aid and dynamic pseudo-label support. The studies are conducted on three separate road datasets, which consist primarily of high-resolution remote-sensing satellite photos and drone images. The experimental findings suggest that our methodology performs better than advanced scribble-supervised approaches and certain traditional fully supervised methods.
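The dynamic pseudo-label update can be illustrated with a short mixup-style blend of the current prediction into the running pseudo-label; the Beta-distribution mixing and the parameter value below are assumptions for illustration, not the paper's exact scheme.

```python
import torch

@torch.no_grad()
def update_pseudo_labels(pseudo, prediction, alpha=0.2):
    """pseudo, prediction: (B, 1, H, W) road probabilities in [0, 1].
    A Beta-sampled weight mixes the new prediction into the running pseudo-label."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    return lam * prediction + (1.0 - lam) * pseudo
```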
Keywords: Semantic segmentation; road extraction; weakly supervised learning; scribble supervision; remote sensing image
4. Deep Learning Models Based on Weakly Supervised Learning and Clustering Visualization for Disease Diagnosis
Authors: Jingyao Liu, Qinghe Feng, Jiashi Zhao, Yu Miao, Wei He, Weili Shi, Zhengang Jiang. Computers, Materials & Continua (SCIE, EI), 2023, Issue 9, pp. 2649-2665.
The coronavirus disease 2019 (COVID-19) has severely disrupted both human life and the health care system. Timely diagnosis and treatment have become increasingly important; however, the distribution and size of lesions vary widely among individuals, making it challenging to accurately diagnose the disease. This study proposed a deep-learning disease diagnosis model based on weakly supervised learning and clustering visualization (W_CVNet) that fuses classification with segmentation. First, the data were preprocessed: an optimizable weakly supervised segmentation preprocessing method (O-WSSPM) was used to remove redundant data and solve the category-imbalance problem. Second, a deep-learning fusion method was used for feature extraction and classification recognition. A dual asymmetric complementary bilinear feature extraction method (D-CBM) was used to fully extract complementary features, which solved the problem of insufficient feature extraction by a single deep learning network. Third, an unsupervised learning method based on Fuzzy C-Means (FCM) clustering was used to segment and visualize COVID-19 lesions, enabling physicians to accurately assess lesion distribution and disease severity. In this study, 5-fold cross-validation was used, and the results showed that the network had an average classification accuracy of 85.8%, outperforming six recent advanced classification models. W_CVNet can effectively provide physicians with automated diagnostic aid to determine whether the disease is present and, for COVID-19 patients, to further predict the area of the lesion.
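The clustering-visualization stage relies on Fuzzy C-Means; a compact numpy-only sketch of standard FCM is given below. The feature choice and parameter values are illustrative and not tied to W_CVNet's implementation.

```python
import numpy as np

def fuzzy_c_means(X, n_clusters=2, m=2.0, n_iter=100, eps=1e-8, seed=0):
    """X: (N, D) features. Returns (memberships U of shape (N, C), centroids (C, D))."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)                      # each row sums to 1
    for _ in range(n_iter):
        Um = U ** m
        centroids = (Um.T @ X) / (Um.sum(axis=0)[:, None] + eps)
        dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2) + eps
        U = 1.0 / (dist ** (2.0 / (m - 1.0)))              # standard FCM membership update
        U /= U.sum(axis=1, keepdims=True)
    return U, centroids
```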
Keywords: CLASSIFICATION; COVID-19; deep learning; SEGMENTATION; unsupervised learning; weakly supervised
5. Weakly Supervised Abstractive Summarization with Enhancing Factual Consistency for Chinese Complaint Reports
Authors: Ren Tao, Chen Shuang. Computers, Materials & Continua (SCIE, EI), 2023, Issue 6, pp. 6201-6217.
A large variety of complaint reports reflect subjective information expressed by citizens. A key challenge of text summarization for complaint reports is to ensure the factual consistency of the generated summary. Therefore, in this paper, a simple and weakly supervised framework considering factual consistency is proposed to generate summaries of city-based complaint reports without pre-labeled sentences/words. Furthermore, it considers the importance of entities in complaint reports to ensure the factual consistency of the summary. Experimental results on customer review datasets (Yelp and Amazon) and a complaint report dataset (complaint reports of Shenyang, China) show that the proposed framework outperforms state-of-the-art approaches in ROUGE scores and human evaluation, demonstrating the effectiveness of our approach in helping to deal with complaint reports.
Keywords: Automatic summarization; abstractive summarization; weakly supervised training; entity recognition
6. Meibomian glands segmentation in infrared images with limited annotation
Authors: Jia-Wen Lin, Ling-Jie Lin, Feng Lu, Tai-Chen Lai, Jing Zou, Lin-Ling Guo, Zhi-Ming Lin, Li Li. International Journal of Ophthalmology (English edition) (SCIE, CAS), 2024, Issue 3, pp. 401-407.
●AIM: To investigate a pioneering framework for the segmentation of meibomian glands (MGs), using limited annotations to reduce the workload on ophthalmologists and enhance the efficiency of clinical diagnosis. ●METHODS: Totally 203 infrared meibomian images from 138 patients with dry eye disease, accompanied by corresponding annotations, were gathered for the study. A rectified scribble-supervised gland segmentation (RSSGS) model, incorporating temporal ensemble prediction, uncertainty estimation, and a transformation equivariance constraint, was introduced to address constraints imposed by the limited supervision information inherent in scribble annotations. The viability and efficacy of the proposed model were assessed based on accuracy, intersection over union (IoU), and Dice coefficient. ●RESULTS: Using manual labels as the gold standard, RSSGS demonstrated outcomes with an accuracy of 93.54%, a Dice coefficient of 78.02%, and an IoU of 64.18%. Notably, these performance metrics exceed the current weakly supervised state-of-the-art methods by 0.76%, 2.06%, and 2.69%, respectively. Furthermore, despite achieving a substantial 80% reduction in annotation costs, it only lags behind fully annotated methods by 0.72%, 1.51%, and 2.04%. ●CONCLUSION: An innovative automatic segmentation model is developed for MGs in infrared eyelid images, using scribble annotations for training. This model maintains an exceptionally high level of segmentation accuracy while substantially reducing training costs. It holds substantial utility for calculating clinical parameters, thereby greatly enhancing the diagnostic efficiency of ophthalmologists in evaluating meibomian gland dysfunction.
Keywords: infrared meibomian glands images; meibomian gland dysfunction; meibomian glands segmentation; weak supervision; scribbled annotation
7. Scribble-Supervised Video Object Segmentation (cited: 3)
Authors: Peiliang Huang, Junwei Han, Nian Liu, Jun Ren, Dingwen Zhang. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2022, Issue 2, pp. 339-353.
Recently, video object segmentation has received great attention in the computer vision community. Most of the existing methods heavily rely on pixel-wise human annotations, which are expensive and time-consuming to obtain. To tackle this problem, we make an early attempt to achieve video object segmentation with scribble-level supervision, which can alleviate large amounts of human labor for collecting manual annotations. However, using conventional network architectures and learning objective functions under this scenario does not work well, as the supervision information is highly sparse and incomplete. To address this issue, this paper introduces two novel elements to learn the video object segmentation model. The first one is the scribble attention module, which captures more accurate context information and learns an effective attention map to enhance the contrast between foreground and background. The other one is the scribble-supervised loss, which can optimize the unlabeled pixels and dynamically correct inaccurately segmented areas during the training stage. To evaluate the proposed method, we implement experiments on two video object segmentation benchmark datasets, YouTube-video object segmentation (VOS) and densely annotated video segmentation (DAVIS)-2017. We first generate the scribble annotations from the original per-pixel annotations. Then, we train our model and compare its test performance with the baseline models and other existing works. Extensive experiments demonstrate that the proposed method works effectively and approaches the methods requiring dense per-pixel annotations.
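The labeled-pixel part of scribble supervision is commonly realized as a partial cross-entropy that simply ignores unlabeled pixels; the sketch below shows that baseline term only, not the paper's full scribble-supervised loss or its dynamic correction of unlabeled pixels.

```python
import torch
import torch.nn.functional as F

def partial_cross_entropy(logits, scribble, ignore_index=255):
    """logits: (B, C, H, W); scribble: (B, H, W) class ids, 255 marks unlabeled pixels."""
    return F.cross_entropy(logits, scribble, ignore_index=ignore_index)

# Tiny usage example on random tensors (2 classes: background / foreground).
logits = torch.randn(2, 2, 64, 64)
scribble = torch.full((2, 64, 64), 255, dtype=torch.long)   # everything unlabeled ...
scribble[:, 30:34, 10:50] = 1                               # ... except one foreground scribble
loss = partial_cross_entropy(logits, scribble)
```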
Keywords: Convolutional neural networks (CNNs); SCRIBBLE; self-attention; video object segmentation; weakly supervised
8. Enhancing action discrimination via category-specific frame clustering for weakly-supervised temporal action localization
Authors: Huifen XIA, Yongzhao ZHAN, Honglin LIU, Xiaopeng REN. Frontiers of Information Technology & Electronic Engineering (SCIE, EI, CSCD), 2024, Issue 6, pp. 809-823.
Temporal action localization (TAL) is the task of detecting the start and end timestamps of action instances and classifying them in an untrimmed video. As the number of action categories per video increases, existing weakly-supervised TAL (W-TAL) methods with only video-level labels cannot provide sufficient supervision, so single-frame supervision has attracted the interest of researchers. Existing paradigms model single-frame annotations from the perspective of video snippet sequences, neglect the action discrimination of annotated frames, and do not pay sufficient attention to their correlations within the same category. For a given category, the annotated frames exhibit distinctive appearance characteristics or clear action patterns. Thus, a novel method to enhance action discrimination via category-specific frame clustering for W-TAL is proposed. Specifically, the K-means clustering algorithm is employed to aggregate the annotated discriminative frames of the same category, which are regarded as exemplars that exhibit the characteristics of the action category. Then, class activation scores are obtained by calculating the similarities between a frame and the exemplars of the various categories. Category-specific representation modeling provides complementary guidance to snippet-sequence modeling in the mainline. As a result, a convex combination fusion mechanism is presented for annotated frames and snippet sequences to enhance the consistency of action discrimination, which generates a robust class activation sequence for precise action classification and localization. Owing to the supplementary guidance of action-discrimination enhancement for video snippet sequences, our method outperforms existing single-frame-annotation-based methods. Experiments conducted on three datasets (THUMOS14, GTEA, and BEOID) show that our method achieves high localization performance compared with state-of-the-art methods.
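A sketch of the exemplar-similarity step: annotated frame features of each category are clustered with K-means, and every frame is then scored by its cosine similarity to the best-matching exemplar of each category. The shapes, the number of clusters, and the cosine metric are assumptions for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

def build_exemplars(features_by_class, k=4):
    """features_by_class: {class_id: (N_c, D) annotated-frame features}."""
    return {c: KMeans(n_clusters=min(k, len(f)), n_init=10).fit(f).cluster_centers_
            for c, f in features_by_class.items()}

def class_activation_scores(frames, exemplars):
    """frames: (T, D) snippet/frame features. Returns (T, num_classes) similarity scores."""
    frames = frames / (np.linalg.norm(frames, axis=1, keepdims=True) + 1e-8)
    scores = []
    for c in sorted(exemplars):
        ex = exemplars[c]
        ex = ex / (np.linalg.norm(ex, axis=1, keepdims=True) + 1e-8)
        scores.append((frames @ ex.T).max(axis=1))   # best-matching exemplar per frame
    return np.stack(scores, axis=1)
```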
Keywords: weakly supervised; Temporal action localization; Single-frame annotation; Category-specific; Action discrimination
9. Active self-training for weakly supervised 3D scene semantic segmentation
Authors: Gengxin Liu, Oliver van Kaick, Hui Huang, Ruizhen Hu. Computational Visual Media (SCIE, EI, CSCD), 2024, Issue 3, pp. 425-438.
Since the preparation of labeled datafor training semantic segmentation networks of pointclouds is a time-consuming process, weakly supervisedapproaches have been introduced to learn fromonly a small fraction of data.... Since the preparation of labeled datafor training semantic segmentation networks of pointclouds is a time-consuming process, weakly supervisedapproaches have been introduced to learn fromonly a small fraction of data. These methods aretypically based on learning with contrastive losses whileautomatically deriving per-point pseudo-labels from asparse set of user-annotated labels. In this paper, ourkey observation is that the selection of which samplesto annotate is as important as how these samplesare used for training. Thus, we introduce a methodfor weakly supervised segmentation of 3D scenes thatcombines self-training with active learning. Activelearning selects points for annotation that are likelyto result in improvements to the trained model, whileself-training makes efficient use of the user-providedlabels for learning the model. We demonstrate thatour approach leads to an effective method that providesimprovements in scene segmentation over previouswork and baselines, while requiring only a few userannotations. 展开更多
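A generic uncertainty-driven sketch of the point-selection step: query the unlabeled points whose predictions have the highest entropy. The paper's actual acquisition criterion may differ; this only illustrates how active selection plugs into the self-training loop.

```python
import numpy as np

def select_points_to_annotate(probs, budget, already_labeled):
    """probs: (N, C) per-point class probabilities; returns indices of points to query."""
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)
    entropy[already_labeled] = -np.inf                 # never re-query labeled points
    return np.argsort(-entropy)[:budget]
```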
Keywords: semantic segmentation; weakly supervised; SELF-TRAINING; active learning
10. Weakly Supervised Object Localization with Background Suppression Erasing for Art Authentication and Copyright Protection
Authors: Chaojie Wu, Mingyang Li, Ying Gao, Xinyan Xie, Wing W. Y. Ng, Ahmad Musyafa. Machine Intelligence Research (EI, CSCD), 2024, Issue 1, pp. 89-103.
The problem of art forgery and infringement is becoming increasingly prominent, since diverse self-media contents with all kinds of art pieces are released on the Internet every day. For art paintings, object detection and localization provide an efficient and effective means of art authentication and copyright protection. However, the acquisition of a precise detector requires large amounts of expensive pixel-level annotations. To alleviate this, we propose a novel weakly supervised object localization (WSOL) method with background suppression erasing (BSE), which recognizes objects with inexpensive image-level labels. First, integrated adversarial erasing (IAE) for a vanilla convolutional neural network (CNN) drops out the most discriminative region by leveraging high-level semantic information. Second, a background suppression module (BSM) limits the activation area of the IAE to the object region through a self-guidance mechanism. Finally, in the inference phase, we utilize the refined importance map (RIM) of middle features to obtain class-agnostic localization results. Extensive experiments are conducted on paintings, CUB-200-2011, and ILSVRC to validate the effectiveness of our BSE.
Keywords: weakly supervised object localization; erasing method; deep learning; computer vision; art authentication and copyright protection
11. TwinNet: Twin Structured Knowledge Transfer Network for Weakly Supervised Action Localization (cited: 1)
Authors: Xiao-Yu Zhang, Hai-Chao Shi, Chang-Sheng Li, Li-Xin Duan. Machine Intelligence Research (EI, CSCD), 2022, Issue 3, pp. 227-246.
Action recognition and localization in untrimmed videos are important for many applications and have attracted a lot of attention. Since full supervision with frame-level annotation places an overwhelming burden on manual labeling effort, learning with weak video-level supervision becomes a potential solution. In this paper, we propose a novel weakly supervised framework to recognize actions and locate the corresponding frames in untrimmed videos simultaneously. Considering that there are abundant trimmed videos publicly available and well-segmented with semantic descriptions, the instructive knowledge learned on trimmed videos can be fully leveraged to analyze untrimmed videos. We present an effective knowledge transfer strategy based on inter-class semantic relevance. We also take advantage of the self-attention mechanism to obtain a compact video representation, such that the influence of background frames can be effectively eliminated. A learning architecture is designed with twin networks for trimmed and untrimmed videos, to facilitate transferable self-attentive representation learning. Extensive experiments are conducted on three untrimmed benchmark datasets (i.e., THUMOS14, ActivityNet1.3, and MEXaction2), and the experimental results clearly corroborate the efficacy of our method. It is especially encouraging to see that the proposed weakly supervised method even achieves comparable results to some fully supervised methods.
Keywords: Knowledge transfer; weakly supervised learning; self-attention mechanism; representation learning; action localization
12. Weakly supervised action anticipation without object annotations
Authors: Yi ZHONG, Jia-Hui PAN, Haoxin LI, Wei-Shi ZHENG. Frontiers of Computer Science (SCIE, EI, CSCD), 2023, Issue 2, pp. 101-110.
Anticipating future actions without observing any partial videos of the future actions plays an important role in action prediction and is also a challenging task. To obtain abundant information for action anticipation, some methods integrate multimodal contexts, including scene object labels. However, extensively labelling each frame in video datasets requires considerable effort. In this paper, we develop a weakly supervised method that integrates global motion and local fine-grained features from current action videos to predict the next action label without the need for specific scene context labels. Specifically, we extract diverse types of local features with weakly supervised learning, including object appearance and human pose representations, without ground truth. Moreover, we construct a graph convolutional network for exploiting the inherent relationships of humans and objects under present incidents. We evaluate the proposed model on two datasets, the MPII-Cooking dataset and the EPIC-Kitchens dataset, and demonstrate the generalizability and effectiveness of our approach for action anticipation.
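A minimal graph-convolution layer of the kind that could model the human-object relations mentioned above (mean-aggregation style); the graph construction and layer design here are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class SimpleGCNLayer(nn.Module):
    """One mean-aggregation graph convolution: transform the averaged neighbor features."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):
        # x: (N, in_dim) node features; adj: (N, N) adjacency including self-loops.
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1.0)
        return torch.relu(self.linear(adj @ x / deg))

# Example: one human node plus four object nodes on a fully connected graph.
nodes = torch.randn(5, 16)
adj = torch.ones(5, 5)
out = SimpleGCNLayer(16, 32)(nodes, adj)   # (5, 32) relation-aware node features
```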
Keywords: action anticipation; weakly supervised learning; relation modelling; graph convolutional network
13. Lesion region segmentation via weakly supervised learning
Authors: Ran Yi, Rui Zeng, Yang Weng, Minjing Yu, Yu-Kun Lai, Yong-Jin Liu. Quantitative Biology (CSCD), 2022, Issue 3, pp. 239-252.
Background: Image-based automatic diagnosis of field diseases can help increase crop yields and is of great importance. However, crop lesion regions tend to be scattered and of varying sizes; this, along with substantial intra-class variation and small inter-class variation, makes segmentation difficult. Methods: We propose a novel end-to-end system that requires only weak supervision in the form of image-level labels for lesion region segmentation. First, a two-branch network is designed for joint disease classification and seed region generation. The generated seed regions are then used as input to the next segmentation stage, where we use an encoder-decoder network. Different from previous works that use only an encoder in the segmentation network, the encoder-decoder network is critical for our system to successfully segment images with small and scattered regions, which is the major challenge in image-based diagnosis of field diseases. We further propose a novel weakly supervised training strategy for the encoder-decoder semantic segmentation network, making use of the extracted seed regions. Results: Experimental results show that our system achieves better lesion region segmentation results than the state of the art. In addition to crop images, our method is also applicable to general scattered object segmentation. We demonstrate this by extending our framework to work on the PASCAL VOC dataset, where it achieves performance comparable to the state-of-the-art DSRG (deep seeded region growing) method. Conclusion: Our method not only outperforms state-of-the-art semantic segmentation methods by a large margin for the lesion segmentation task, but also shows its capability to perform well on more general tasks.
Keywords: weakly supervised learning; lesion segmentation; disease detection; semantic segmentation; AGRICULTURE
14. Shallow Feature-driven Dual-edges Localization Network for Weakly Supervised Localization
Authors: Wenjun Hui, Guanghua Gu, Bo Wang. Machine Intelligence Research (EI, CSCD), 2023, Issue 6, pp. 923-936.
Weakly supervised object localization mines pixel-level location information based on image-level annotations. Traditional weakly supervised object localization approaches exploit the last convolutional feature map to locate discriminative regions with abundant semantics. Although this shows the localization ability of the classification network, the process lacks the use of shallow edge and texture features, and therefore cannot meet the requirement of object integrity in the localization task. Thus, we propose a novel shallow feature-driven dual-edges localization (DEL) network, in which two kinds of shallow edges are utilized to mine entire target object regions. Specifically, we design an edge feature mining (EFM) module to extract shallow edge details through a similarity measurement between the original class activation map and shallow features. We exploit the EFM module to extract two kinds of edges, namely the edge of the shallow feature map and the edge of shallow gradients, to enhance the edge details of the target object in the last convolutional feature map. The entire process is applied during the inference stage and therefore brings no extra training cost. Extensive experiments on both the ILSVRC and CUB-200-2011 datasets show that the DEL method obtains consistent and substantial performance improvements compared with existing methods.
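A heavily simplified sketch of measuring similarity between the class activation map and shallow features: build a CAM-weighted foreground prototype from the shallow features and score every location by cosine similarity to it. The real EFM module and its two edge branches are more involved; all names below are assumptions.

```python
import torch
import torch.nn.functional as F

def shallow_similarity_map(cam, shallow):
    """cam: (1, 1, h, w) class activation map in [0, 1]; shallow: (1, C, H, W) early features."""
    cam = F.interpolate(cam, size=shallow.shape[-2:], mode="bilinear", align_corners=False)
    proto = (shallow * cam).sum(dim=(2, 3)) / (cam.sum(dim=(2, 3)) + 1e-6)   # (1, C) prototype
    sim = F.cosine_similarity(shallow, proto[:, :, None, None].expand_as(shallow), dim=1)
    return sim.clamp(min=0)                                                  # (1, H, W) map
```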
Keywords: weakly supervised object localization; edge feature mining; edge of shallow feature map; edge of shallow gradients; similarity measurement
15. NLWSNet: a weakly supervised network for visual sentiment analysis in mislabeled web images
Authors: Luo-yang XUE, Qi-rong MAO, Xiao-hua HUANG, Jie CHEN. Frontiers of Information Technology & Electronic Engineering (SCIE, EI, CSCD), 2020, Issue 9, pp. 1321-1333.
Large-scale datasets are driving the rapid development of deep convolutional neural networks for visual sentiment analysis. However, the annotation of large-scale datasets is expensive and time consuming. Instead, it is easy to obtain weakly labeled web images from the Internet. However, noisy labels still lead to seriously degraded performance when we use images directly from the web for training networks. To address this drawback, we propose an end-to-end weakly supervised learning network that is robust to mislabeled web images. Specifically, the proposed attention module automatically eliminates the distraction of samples with incorrect labels by reducing their attention scores in the training process. On the other hand, the special-class activation map module is designed to stimulate the network by focusing on the significant regions of the samples with correct labels in a weakly supervised learning approach. Besides the process of feature learning, regularization is applied to the classifier to minimize the distance between samples within the same class and maximize the distance between different class centroids. Quantitative and qualitative evaluations on well-labeled and mislabeled web image datasets demonstrate that the proposed algorithm outperforms related methods.
Keywords: Visual sentiment analysis; weakly supervised learning; Mislabeled samples; Significant sentiment regions
16. Weakly supervised temporal action localization with proxy metric modeling
Authors: Hongsheng XU, Zihan CHEN, Yu ZHANG, Xin GENG, Siya MI, Zhihong YANG. Frontiers of Computer Science (SCIE, EI, CSCD), 2023, Issue 2, pp. 63-72.
Temporal localization is crucial for action video recognition. Since manual annotations are expensive and time-consuming to obtain for videos, temporal localization with weak video-level labels is challenging but indispensable. In this paper, we propose a weakly-supervised temporal action localization approach for untrimmed videos. To settle this issue, we train the model based on proxies of each action class. The proxies are used to measure the distances between action segments and the different original action features. We use a proxy-based metric to cluster the same actions together and separate actions from backgrounds. Compared with state-of-the-art methods, our method achieves competitive results on the THUMOS14 and ActivityNet1.2 datasets.
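A generic proxy-metric head in the spirit described above: one learnable proxy per class, with segments scored by their (negated) distance to each proxy. This is a standard proxy-based formulation, not the paper's exact model or loss.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ProxyMetricHead(nn.Module):
    def __init__(self, feat_dim, num_classes):
        super().__init__()
        self.proxies = nn.Parameter(torch.randn(num_classes, feat_dim))

    def forward(self, segments):
        """segments: (T, feat_dim) features. Returns (T, num_classes) logits where a
        smaller distance to a class proxy yields a larger logit for that class."""
        seg = F.normalize(segments, dim=1)
        prox = F.normalize(self.proxies, dim=1)
        dist = torch.cdist(seg, prox)           # (T, num_classes) Euclidean distances
        return -dist                            # negate so the nearest proxy wins softmax

# Example: score 100 segment features against 20 action classes plus background.
head = ProxyMetricHead(feat_dim=2048, num_classes=21)
logits = head(torch.randn(100, 2048))
```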
Keywords: temporal action localization; weakly supervised videos; proxy metric
17. Continuous gradient fusion class activation mapping: segmentation of laser-induced damage on large-aperture optics in dark-field images (cited: 1)
Authors: Yueyue Han, Yingyan Huang, Hangcheng Dong, Fengdong Chen, Fa Zeng, Zhitao Peng, Qihua Zhu, Guodong Liu. High Power Laser Science and Engineering (SCIE, CAS, CSCD), 2024, Issue 1, pp. 30-41.
Segmenting dark-field images of laser-induced damage on large-aperture optics in high-power laser facilities is challenged by complicated damage morphology, uneven illumination, and stray-light interference. Fully supervised semantic segmentation algorithms have achieved state-of-the-art performance but rely on a large number of pixel-level labels, which are time-consuming and labor-intensive to produce. LayerCAM, an advanced weakly supervised semantic segmentation algorithm, can generate pixel-accurate results using only image-level labels, but its scattered and partially under-activated class activation regions degrade segmentation performance. In this paper, we propose a weakly supervised semantic segmentation method, continuous gradient class activation mapping (CAM) and its nonlinear multiscale fusion (continuous gradient fusion CAM). The method redesigns the backpropagated gradients and nonlinearly activates multiscale fused heatmaps to generate more fine-grained class activation maps with an appropriate activation degree for different damage site sizes. Experiments on our dataset show that the proposed method can achieve segmentation performance comparable to that of fully supervised algorithms.
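The method refines LayerCAM-style maps; for reference, a compact LayerCAM baseline (elementwise ReLU(gradient) times activation, summed over channels) is sketched below. The continuous-gradient redesign and nonlinear multiscale fusion of the paper are not reproduced here.

```python
import torch

def layercam(activations, gradients):
    """activations, gradients: (B, C, H, W) captured from one conv layer for the
    target class score. Returns a (B, H, W) class activation map normalized to [0, 1]."""
    cam = (torch.relu(gradients) * activations).sum(dim=1)
    cam = torch.relu(cam)
    cam_min = cam.amin(dim=(1, 2), keepdim=True)
    cam_max = cam.amax(dim=(1, 2), keepdim=True)
    return (cam - cam_min) / (cam_max - cam_min + 1e-8)
```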
Keywords: class activation maps; laser-induced damage; semantic segmentation; weakly supervised learning
18. A Novel Divide and Conquer Solution for Long-term Video Salient Object Detection
Authors: Yun-Xiao Li, Cheng-Li-Zhao Chen, Shuai Li, Ai-Min Hao, Hong Qin. Machine Intelligence Research (EI, CSCD), 2024, Issue 4, pp. 684-703.
Recently, a new research trend in the video salient object detection (VSOD) community has focused on enhancing detection results via model self-fine-tuning using sparsely mined high-quality keyframes from the given sequence. Although such a learning scheme is generally effective, it has a critical limitation: a model learned on sparse frames only possesses weak generalization ability. This situation can become worse on "long" videos, since they tend to have intensive scene variations. Moreover, in such videos, keyframe information from a longer time span is less relevant to the preceding frames, which can also cause learning conflicts and deteriorate model performance. Thus, the learning scheme is usually incapable of handling complex pattern modeling. To solve this problem, we propose a divide-and-conquer framework, which can convert a complex problem domain into multiple simple ones. First, we devise a novel background consistency analysis (BCA) which effectively divides the mined frames into disjoint groups. Then, for each group, we assign an individual deep model to capture its key attribute during the fine-tuning phase. During the testing phase, we design a model-matching strategy which dynamically selects the best-matched fine-tuned model to handle the given testing frame. Comprehensive experiments show that our method can adapt to severe background appearance variation coupled with object movement and obtain robust saliency detection compared with the previous scheme and state-of-the-art methods.
Keywords: Video salient object detection; background consistency analysis; weakly supervised learning; long-term information; background shift
19. Partial Label Learning via Conditional-Label-Aware Disambiguation
Authors: Peng Ni, Su-Yun Zhao, Zhi-Gang Dai, Hong Chen, Cui-Ping Li. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2021, Issue 3, pp. 590-605.
Partial label learning is a weakly supervised learning framework in which each instance is associated with multiple candidate labels, among which only one is the ground-truth label. This paper proposes a unified formulation that employs proper label constraints for training models while simultaneously performing pseudo-labeling. Unlike existing partial label learning approaches that only leverage similarities in the feature space without utilizing label constraints, our pseudo-labeling process leverages similarities and differences in the feature space under the same candidate-label constraints and then disambiguates the noisy labels. Extensive experiments on artificial and real-world partial label datasets show that our approach significantly outperforms state-of-the-art counterparts in classification prediction.
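The candidate-label constraint at the heart of the pseudo-labeling step can be sketched as masking scores outside each instance's candidate set before picking a pseudo-label; the paper's disambiguation additionally exploits feature-space similarity and dissimilarity, which this sketch omits.

```python
import torch

def candidate_pseudo_labels(logits, candidate_mask):
    """logits: (B, C); candidate_mask: (B, C) booleans, True for candidate labels."""
    masked = logits.masked_fill(~candidate_mask, float("-inf"))
    return masked.argmax(dim=1)    # pseudo-label is the highest-scoring candidate

# Example: 3 instances, 5 classes, each with a different candidate set.
logits = torch.randn(3, 5)
candidates = torch.tensor([[1, 1, 0, 0, 0],
                           [0, 1, 1, 1, 0],
                           [1, 0, 0, 0, 1]], dtype=torch.bool)
pseudo = candidate_pseudo_labels(logits, candidates)
```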
Keywords: DISAMBIGUATION; partial label learning; similarity and dissimilarity; weak supervision