64 articles found
Research on classification method of high myopic maculopathy based on retinal fundus images and optimized ALFA-Mix active learning algorithm (Cited by: 1)
1
Authors: Shao-Jun Zhu, Hao-Dong Zhan, Mao-Nian Wu, Bo Zheng, Bang-Quan Liu, Shao-Chong Zhang, Wei-Hua Yang. International Journal of Ophthalmology (English edition), SCIE, CAS, 2023, Issue 7, pp. 995-1004 (10 pages)
AIM: To conduct a classification study of high myopic maculopathy (HMM) using limited datasets, including tessellated fundus, diffuse chorioretinal atrophy, patchy chorioretinal atrophy, and macular atrophy, and minimize annotation costs, and to optimize the ALFA-Mix active learning algorithm and apply it to HMM classification. METHODS: The optimized ALFA-Mix algorithm (ALFA-Mix+) was compared with five algorithms, including ALFA-Mix. Four models, including ResNet18, were established. Each algorithm was combined with the four models for experiments on the HMM dataset. Each experiment consisted of 20 active learning rounds, with 100 images selected per round. The algorithm was evaluated by comparing the number of rounds in which ALFA-Mix+ outperformed other algorithms. Finally, this study employed six models, including EfficientFormer, to classify HMM. The best-performing model among these was selected as the baseline model and combined with the ALFA-Mix+ algorithm to achieve satisfactory classification results with a small dataset. RESULTS: ALFA-Mix+ outperformed other algorithms with an average superiority of 16.6, 14.75, 16.8, and 16.7 rounds in terms of accuracy, sensitivity, specificity, and Kappa value, respectively. This study conducted experiments on classifying HMM using several advanced deep learning models with a complete training set of 4252 images. EfficientFormer achieved the best results, with an accuracy, sensitivity, specificity, and Kappa value of 0.8821, 0.8334, 0.9693, and 0.8339, respectively. By combining ALFA-Mix+ with EfficientFormer, this study achieved an accuracy, sensitivity, specificity, and Kappa value of 0.8964, 0.8643, 0.9721, and 0.8537, respectively. CONCLUSION: The ALFA-Mix+ algorithm reduces the required samples without compromising accuracy. Compared to other algorithms, ALFA-Mix+ outperforms in more rounds of experiments and selects valuable samples more effectively. In HMM classification, combining ALFA-Mix+ with EfficientFormer enhances model performance, further demonstrating the effectiveness of ALFA-Mix+.
Keywords: high myopic maculopathy; deep learning; active learning; image classification; ALFA-Mix algorithm
Active learning accelerated Monte-Carlo simulation based on the modified K-nearest neighbors algorithm and its application to reliability estimations
2
Authors: Zhifeng Xu, Jiyin Cao, Gang Zhang, Xuyong Chen, Yushun Wu. Defence Technology (防务技术), SCIE, EI, CAS, CSCD, 2023, Issue 10, pp. 306-313 (8 pages)
This paper proposes an active learning accelerated Monte-Carlo simulation method based on the modified K-nearest neighbors algorithm. The core idea of the proposed method is to judge whether or not the output of a random input point can be postulated through a classifier implemented through the modified K-nearest neighbors algorithm. Compared to other active learning methods resorting to experimental designs, the proposed method is characterized by employing Monte-Carlo simulation for sampling inputs and saving a large portion of the actual evaluations of outputs through an accurate classification, which is applicable to most structural reliability estimation problems. Moreover, the validity, efficiency, and accuracy of the proposed method are demonstrated numerically. In addition, the optimal value of K that maximizes the computational efficiency is studied. Finally, the proposed method is applied to the reliability estimation of carbon fiber reinforced silicon carbide composite specimens subjected to random displacements, which further validates its practicability.
Keywords: active learning; Monte-Carlo simulation; K-nearest neighbors; reliability estimation; classification
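The core idea of the entry above (postulating an output from a nearest-neighbour classifier instead of evaluating it) can be illustrated with a minimal sketch. This is not the authors' modified K-nearest neighbors algorithm: the unanimity rule, the `max_distance` guard, and all function and parameter names are assumptions made for illustration only.

```python
import math
import random


def knn_accelerated_failure_probability(limit_state, sample_input, n_samples,
                                        k=3, max_distance=0.5, seed=0):
    """Estimate P(limit_state(x) < 0) by Monte-Carlo sampling.

    Assumed rule (not taken from the paper): a sample's outcome is postulated
    only when its k nearest already-evaluated neighbours all lie within
    max_distance and all share the same outcome. Otherwise the true limit
    state function is evaluated.
    """
    rng = random.Random(seed)
    evaluated = []   # (point, failed) pairs that received a true evaluation
    failures = 0
    true_calls = 0
    for _ in range(n_samples):
        x = sample_input(rng)
        label = None
        neighbours = sorted(evaluated, key=lambda pf: math.dist(pf[0], x))[:k]
        if len(neighbours) == k:
            close = math.dist(neighbours[-1][0], x) <= max_distance
            outcomes = {failed for _, failed in neighbours}
            if close and len(outcomes) == 1:
                label = outcomes.pop()      # postulated, no evaluation needed
        if label is None:
            label = limit_state(x) < 0      # actual (expensive) evaluation
            evaluated.append((x, label))
            true_calls += 1
        failures += label
    return failures / n_samples, true_calls
```

A hypothetical usage: with `limit_state = lambda x: 3.0 - x[0] - x[1]` and two independent standard normal inputs, the function returns the estimated failure probability together with the number of true evaluations, which is the quantity such a method tries to keep small.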
Active Learning Strategies for Textual Dataset-Automatic Labelling
3
Authors: Sher Muhammad Daudpota, Saif Hassan, Yazeed Alkhurayyif, Abdullah Saleh Alqahtani, Muhammad Haris Aziz. Computers, Materials & Continua, SCIE, EI, 2023, Issue 8, pp. 1409-1422 (14 pages)
The Internet revolution has resulted in abundant data from various sources, including social media, traditional media, and so on. Although the availability of data is no longer an issue, labelling data for use in supervised machine learning is still an expensive process that involves tedious human effort. The overall purpose of this study is to propose a strategy to automatically label unlabeled textual data with the support of active learning in combination with deep learning. More specifically, this study assesses the performance of different active learning strategies in automatic labelling of textual datasets at sentence and document levels. To achieve this objective, different experiments have been performed on a publicly available dataset. In the first set of experiments, we randomly choose a subset of instances from the training dataset and train a deep neural network to assess performance on the test set. In the second set of experiments, we replace the random selection with different active learning strategies to choose a subset of the training dataset, train the same model, and reassess its performance on the test set. The experimental results suggest that different active learning strategies yield a performance improvement of 7% on document-level datasets and 3% on sentence-level datasets for automatic labelling.
Keywords: active learning; automatic labelling; textual datasets
Enhancing Semantic Segmentation through Reinforced Active Learning: Combating Dataset Imbalances and Bolstering Annotation Efficiency
4
Authors: Dong Han, Huong Pham, Samuel Cheng. Journal of Electronic & Information Systems, 2023, Issue 2, pp. 45-60 (16 pages)
This research addresses the challenges of training large semantic segmentation models for image analysis, focusing on expediting the annotation process and mitigating imbalanced datasets. In the context of imbalanced datasets, biases related to age and gender in clinical contexts and skewed representation in natural images can affect model performance. Strategies to mitigate these biases are explored to enhance efficiency and accuracy in semantic segmentation analysis. An in-depth exploration of various reinforced active learning methodologies for image segmentation is conducted, optimizing precision and efficiency across diverse domains. The proposed framework integrates Dueling Deep Q-Networks (DQN), Prioritized Experience Replay, Noisy Networks, and Emphasizing Recent Experience. Extensive experimentation and evaluation on diverse datasets reveal both improvements and limitations associated with various approaches in terms of overall accuracy and efficiency. This research contributes to the expansion of reinforced active learning methodologies for image segmentation, paving the way for more sophisticated and precise segmentation algorithms across diverse domains. The findings emphasize the need for a careful balance between exploration and exploitation strategies in reinforcement learning for effective image segmentation.
Keywords: semantic segmentation; active learning; reinforcement learning
MII: A Novel Text Classification Model Combining Deep Active Learning with BERT (Cited by: 6)
5
Authors: Anman Zhang, Bohan Li, Wenhuan Wang, Shuo Wan, Weitong Chen. Computers, Materials & Continua, SCIE, EI, 2020, Issue 6, pp. 1499-1514 (16 pages)
Active learning has been widely utilized to reduce the labeling cost of supervised learning. By selecting specific instances to train the model, the performance of the model is improved within a limited number of steps. However, few studies have examined how effective active learning is in this setting. In this paper, we propose a deep active learning model with bidirectional encoder representations from transformers (BERT) for text classification. BERT takes advantage of the self-attention mechanism to integrate contextual information, which helps accelerate the convergence of training. For the active learning process, we design an instance selection strategy based on posterior probabilities: Margin, Intra-correlation and Inter-correlation (MII). Selected instances are characterized by small margin, low intra-cohesion and high inter-cohesion. We conduct extensive experiments and analyses with our methods. The effect of the learner is compared, while the effects of the sampling strategy and text classification are assessed on three real datasets. The results show that our method outperforms the baselines in terms of accuracy.
Keywords: active learning; instance selection; deep neural network; text classification
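The "small margin" criterion named in the entry above can be sketched as follows. This covers only the margin term, using the common definition (difference between the two largest posterior probabilities); the intra- and inter-correlation terms of MII are not reproduced, and the function name is hypothetical.

```python
def select_smallest_margin(probabilities, batch_size):
    """Return indices of the batch_size samples with the smallest margin.

    probabilities: one list of class posteriors per unlabeled sample
    (at least two classes each). The margin is the gap between the two
    largest posteriors; a small gap means the model is undecided.
    """
    def margin(p):
        top, second = sorted(p, reverse=True)[:2]
        return top - second

    ranked = sorted(range(len(probabilities)),
                    key=lambda i: margin(probabilities[i]))
    return ranked[:batch_size]


# Hypothetical posteriors for four unlabeled samples over three classes.
posteriors = [
    [0.90, 0.05, 0.05],
    [0.40, 0.35, 0.25],
    [0.50, 0.30, 0.20],
    [0.34, 0.33, 0.33],
]
print(select_smallest_margin(posteriors, 2))  # [3, 1]
```

Samples 3 and 1 are picked because their top two classes are nearly tied, so a label for them is expected to be more informative than a label for sample 0.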
Analyzing Cross-domain Transportation Big Data of New York City with Semi-supervised and Active Learning (Cited by: 4)
6
Authors: Huiyu Sun, Suzanne McIntosh. Computers, Materials & Continua, SCIE, EI, 2018, Issue 10, pp. 1-9 (9 pages)
The majority of big data analytics applied to transportation datasets suffer from being too domain-specific; that is, they draw conclusions for a dataset based on analytics on the same dataset. As a result, models trained on one domain (e.g., taxi data) apply poorly to a different domain (e.g., Uber data). To achieve accurate analyses on a new domain, substantial amounts of data must be available, which limits practical applications. To remedy this, we propose to use semi-supervised and active learning of big data to accomplish the domain adaptation task: selectively choosing a small number of datapoints from a new domain while achieving performance comparable to using all the datapoints. We choose the New York City (NYC) transportation data of taxi and Uber as our dataset, simulating different domains with 90% as the source data domain for training and the remaining 10% as the target data domain for evaluation. We propose semi-supervised and active learning strategies and apply them to the source domain for selecting datapoints. Experimental results show that our adaptation achieves performance comparable to using all datapoints while using only a fraction of them, substantially reducing the amount of data required. Our approach has two major advantages: it can make accurate analytics and predictions when big datasets are not available, and even if big datasets are available, it chooses the most informative datapoints out of the dataset, making the process much more efficient without having to process huge amounts of data.
Keywords: big data; taxi and Uber; domain adaptation; active learning; semi-supervised learning
Adversarial Active Learning for Named Entity Recognition in Cybersecurity (Cited by: 4)
7
Authors: Tao Li, Yongjin Hu, Ankang Ju, Zhuoran Hu. Computers, Materials & Continua, SCIE, EI, 2021, Issue 1, pp. 407-420 (14 pages)
Owing to the continuous barrage of cyber threats, there is a massive amount of cyber threat intelligence. However, a great deal of cyber threat intelligence comes from textual sources. For analysis of cyber threat intelligence, many security analysts rely on cumbersome and time-consuming manual efforts. The cybersecurity knowledge graph plays a significant role in automatic analysis of cyber threat intelligence. As the foundation for constructing a cybersecurity knowledge graph, named entity recognition (NER) is required for identifying critical threat-related elements from textual cyber threat intelligence. Recently, deep neural network-based models have attained very good results in NER. However, the performance of these models relies heavily on the amount of labeled data. Since labeled data in cybersecurity is scarce, in this paper we propose an adversarial active learning framework to effectively select informative samples for further annotation. In addition, leveraging the long short-term memory (LSTM) network and the bidirectional LSTM (BiLSTM) network, we propose a novel NER model by introducing a dynamic attention mechanism into the BiLSTM-LSTM encoder-decoder. With the selected informative samples annotated, the proposed NER model is retrained. As a result, the performance of the NER model is incrementally enhanced with low labeling cost. Experimental results show the effectiveness of the proposed method.
Keywords: adversarial learning; active learning; named entity recognition; dynamic attention mechanism
Active Learning Improves Nursing Student Clinical Performance in an Academic Institution in Macao (Cited by: 1)
8
Authors: Cindy Sin U Leong, Lynn B. Clutter. Chinese Nursing Research, CAS, 2015, Issue 3, pp. 108-115 (8 pages)
Objective: To assess the outcome of the application of active learning during practicum among nursing students using clinical assessment and evaluation scores as a measurement. Methods: Nursing students were instructed on the basics of active learning prior to the initiation of their clinical experience. The participants were divided into 5 groups of nursing students (n = 56) across three levels (years 2-4) in a bachelor degree program at a public academic institute in Macao. Final clinical evaluation scores were averaged and compared between groups with and without the intervention. Results: These nursing students were given higher appraisals in verbal and written comments than previous students without the intervention. The groups with the intervention achieved higher clinical assessment and evaluation scores on average than comparable groups without the active learning intervention. One group of sophomore nursing students (year 2) did not receive evaluations as high as the other groups, receiving an average score of above 80. Conclusions: Nursing students must engage in active learning to demonstrate that they are willing to gain knowledge of theory, nursing skills and communication skills during the clinical practicum.
Keywords: active learning; clinical competence; nursing students
Implementing physically active learning: Future directions for research, policy, and practice
9
Authors: Andy Daly-Smith, Thomas Quarmby, Victoria S.J. Archbold, Ash C. Routen, Jade L. Morris, Catherine Gammon, John B. Bartholomew, Geir Kare Resaland, Bryn Llewellyn, Richard Allman, Henry Dorling. Journal of Sport and Health Science, SCIE, 2020, Issue 1, pp. 41-49, F0003 (10 pages)
Purpose: To identify co-produced multi-stakeholder perspectives important for successful widespread physically active learning (PAL) adoption and implementation. Methods: A total of 35 stakeholders (policymakers, n = 9; commercial education sector, n = 8; teachers, n = 3; researchers, n = 15) attended a design thinking PAL workshop. Participants formed 5 multi-disciplinary groups with at least 1 representative from each stakeholder group. Each group, facilitated by a researcher, undertook 2 tasks: (1) using Post-it Notes, the following question was answered: within the school day, what are the opportunities for learning combined with movement? and (2) structured as a washing-line task, the following question was answered: how can we establish PAL as the norm? All discussions were audio-recorded and transcribed. Inductive analyses were conducted by 4 authors. After the analyses were complete, the main themes and subthemes were assigned to 4 predetermined categories: (1) PAL design and implementation, (2) priorities for practice, (3) priorities for policy, and (4) priorities for research. Results: The following were the main themes for PAL implementation: opportunities for PAL within the school day, delivery environments, learning approaches, and the intensity of PAL. The main themes for the priorities for practice included teacher confidence and competence, resources to support delivery, and community of practice. The main themes for the priorities for policy included self-governance, the Office for Standards in Education, Children's Services and Skills, policy investment in initial teacher training, and curriculum reform. The main themes for the research priorities included establishing a strong evidence base, school-based PAL implementation, and a whole-systems approach. Conclusion: The present study is the first to identify PAL implementation factors using a combined multi-stakeholder perspective. To achieve wider PAL adoption and implementation, future interventions should be evidence based and address implementation factors at the classroom level (e.g., approaches and delivery environments), school level (e.g., communities of practice), and policy level (e.g., initial teacher training).
Keywords: physical activity; physically active learning; policy; school
Mining potential social relationship with active learning in LBSN
10
Authors: 王海平, Zhang Hong, Wang Yong, Bing Jia. High Technology Letters, EI, CAS, 2017, Issue 2, pp. 198-202 (5 pages)
The rapid development of location-based social networks (LBSN) makes it more convenient for researchers to carry out studies related to social networks. Among these, mining potential social relationships in LBSN is the most important. Traditionally, researchers use the topological relations of a social network or telecommunication network to mine potential social relationships, but the results are unsatisfactory because the network cannot provide complete information about topological relations. In this work, a new model called PSRMAL is proposed for mining potential social relationships with LBSN. With the model, better performance is obtained and guaranteed, and experiments verify its effectiveness.
Keywords: data preprocessing; feature fusion; active learning
A Simple yet Effective Framework for Active Learning to Rank
11
Authors: Qingzhong Wang, Haifang Li, Haoyi Xiong, Wen Wang, Jiang Bian, Yu Lu, Shuaiqiang Wang, Zhicong Cheng, Dejing Dou, Dawei Yin. Machine Intelligence Research, EI, CSCD, 2024, Issue 1, pp. 169-183 (15 pages)
While China has become the largest online market in the world with approximately 1 billion internet users, Baidu runs the world's largest Chinese search engine, serving hundreds of millions of daily active users and responding to billions of queries per day. To handle the diverse query requests from users at web scale, Baidu has made tremendous efforts in understanding users' queries, retrieving relevant content from a pool of trillions of webpages, and ranking the most relevant webpages at the top of the results. Among the components used in Baidu search, learning to rank (LTR) plays a critical role, and an extremely large number of queries together with relevant webpages must be labelled in a timely manner to train and update the online LTR models. To reduce the cost and time consumption of query/webpage labelling, we study the problem of active learning to rank (active LTR), which selects unlabeled queries for annotation and training. Specifically, we first investigate the criterion ranking entropy (RE), which characterizes the entropy of relevant webpages under a query produced by a sequence of online LTR models updated by different checkpoints, using a query-by-committee (QBC) method. Then, we explore a new criterion, namely prediction variances (PV), which measures the variance of prediction results for all relevant webpages under a query. Our empirical studies find that RE may favor low-frequency queries from the pool for labelling while PV prioritizes high-frequency queries more. Finally, we combine these two complementary criteria as the sample selection strategies for active learning. Extensive experiments with comparisons to baseline algorithms show that the proposed approach could train LTR models to achieve higher discounted cumulative gain (i.e., the relative improvement DCG4 = 1.38%) with the same budgeted labelling efforts.
Keywords: search; information retrieval; learning to rank; active learning; query by committee
Batch Active Learning for Multispectral and Hyperspectral Image Segmentation Using Similarity Graphs
12
Authors: Bohan Chen, Kevin Miller, Andrea L. Bertozzi, Jon Schwenk. Communications on Applied Mathematics and Computation, EI, 2024, Issue 2, pp. 1013-1033 (21 pages)
Graph learning, when used as a semi-supervised learning (SSL) method, performs well for classification tasks with a low label rate. We provide a graph-based batch active learning pipeline for pixel/patch neighborhood multi- or hyperspectral image segmentation. Our batch active learning approach selects a collection of unlabeled pixels that satisfy a graph local maximum constraint for the active learning acquisition function that determines the relative importance of each pixel to the classification. This work builds on recent advances in the design of novel active learning acquisition functions (e.g., the Model Change approach in arXiv:2110.07739) while adding important further developments, including patch-neighborhood image analysis and batch active learning methods, to further increase the accuracy and greatly increase the computational efficiency of these methods. In addition to improving accuracy, our approach can greatly reduce the number of labeled pixels needed to achieve the same level of accuracy as randomly selected labeled pixels.
Keywords: image segmentation; graph learning; batch active learning; hyperspectral image
Model Change Active Learning in Graph-Based Semi-supervised Learning
13
Authors: Kevin S. Miller, Andrea L. Bertozzi. Communications on Applied Mathematics and Computation, EI, 2024, Issue 2, pp. 1270-1298 (29 pages)
Active learning in semi-supervised classification involves introducing additional labels for unlabelled data to improve the accuracy of the underlying classifier. A challenge is to identify which points to label to best improve performance while limiting the number of new labels. "Model Change" active learning quantifies the resulting change incurred in the classifier by introducing the additional label(s). We pair this idea with graph-based semi-supervised learning (SSL) methods that use the spectrum of the graph Laplacian matrix, which can be truncated to avoid prohibitively large computational and storage costs. We consider a family of convex loss functions for which the acquisition function can be efficiently approximated using the Laplace approximation of the posterior distribution. We show a variety of multiclass examples that illustrate improved performance over the prior state of the art.
Keywords: active learning; graph-based methods; semi-supervised learning (SSL); graph Laplacian
Active Machine Learning for Chemical Engineers: A Bright Future Lies Ahead! (Cited by: 1)
14
Authors: Yannick Ureel, Maarten R. Dobbelaere, Yi Ouyang, Kevin De Ras, Maarten K. Sabbe, Guy B. Marin, Kevin M. Van Geem. Engineering, SCIE, EI, CAS, CSCD, 2023, Issue 8, pp. 23-30 (8 pages)
By combining machine learning with the design of experiments, thereby achieving so-called active machine learning, more efficient and cheaper research can be conducted. Machine learning algorithms are more flexible and are better than traditional design of experiment algorithms at investigating processes spanning all length scales of chemical engineering. While active machine learning algorithms are maturing, their applications are falling behind. In this article, three types of challenges presented by active machine learning (namely, convincing the experimental researcher, the flexibility of data creation, and the robustness of active machine learning algorithms) are identified, and ways to overcome them are discussed. A bright future lies ahead for active machine learning in chemical engineering, thanks to increasing automation and more efficient algorithms that can drive novel discoveries.
Keywords: active machine learning; active learning; Bayesian optimization; chemical engineering; design of experiments
Phase prediction for high-entropy alloys using generative adversarial network and active learning based on small datasets (Cited by: 1)
15
Authors: CHEN Cun, ZHOU HengRu, LONG WeiMin, WANG Gang, REN JingLi. Science China (Technological Sciences), SCIE, EI, CAS, CSCD, 2023, Issue 12, pp. 3615-3627 (13 pages)
In this paper, a new machine learning (ML) model combining conditional generative adversarial networks (CGANs) and active learning (AL) is proposed to predict the body-centered cubic (BCC) phase, face-centered cubic (FCC) phase, and BCC+FCC phase of high-entropy alloys (HEAs). Considering the lack of data, CGANs are introduced for data augmentation, and AL can achieve high prediction accuracy under a small sample size owing to its special sample selection strategy. Therefore, we propose an ML framework combining CGAN and AL to predict the phase of HEAs. The arithmetic optimization algorithm (AOA) is introduced to improve the artificial neural network (ANN). AOA can overcome the problem of the ANN falling into locally optimal solutions and reduce the number of training iterations. The AOA-optimized ANN model trained with the AL sample selection strategy achieved high prediction accuracy on the test set. To improve the performance and interpretability of the model, domain knowledge is incorporated into the feature selection. Additionally, because the proposed method can alleviate the problem caused by the shortage of experimental data, it can be applied to predictions based on small datasets in other fields.
Keywords: high-entropy alloys; phase prediction; machine learning; conditional generative adversarial networks; active learning
Distributed Active Partial Label Learning
16
Authors: Zhen Xu, Weibin Chen. Intelligent Automation & Soft Computing, SCIE, 2023, Issue 9, pp. 2627-2650 (24 pages)
Active learning (AL) trains a high-precision predictor model from small numbers of labeled data by iteratively annotating the most valuable data sample from an unlabeled data pool with a class label throughout the learning process. However, most current AL methods start with the premise that the labels queried at AL rounds must be free of ambiguity, which may be unrealistic in some real-world applications where only a set of candidate labels can be obtained for selected data. Besides, most existing AL algorithms only consider the case of centralized processing, which necessitates gathering all the unlabeled data in one fusion center for selection. Considering that data are collected and stored at different nodes over a network in many real-world scenarios, distributed processing is chosen here. This paper focuses on the distributed classification of partially labeled (PL) data obtained by a fully decentralized AL method and proposes a distributed active partial label learning (dAPLL) algorithm. The proposed algorithm is composed of a fully decentralized sample selection strategy and a distributed partial label learning (PLL) algorithm. During the sample selection process, both the uncertainty and representativeness of the data are measured based on the global cluster centers obtained by a distributed clustering method, and the valuable samples are chosen in turn. Meanwhile, using the disambiguation-free strategy, a series of binary classification problems can be constructed, and the corresponding cost-sensitive classifiers can be cooperatively trained in a distributed manner. Experimental results on several datasets demonstrate that the performance of the dAPLL algorithm is comparable to that of the corresponding centralized method and is superior to the existing active PLL (APLL) method in different parameter configurations. Besides, the proposed algorithm outperforms several current PLL methods using the random selection strategy, especially when only small amounts of data are selected to be assigned candidate labels.
Keywords: active learning; partial label learning; distributed processing; disambiguation-free strategy
A Novel Active Learning Method Using SVM for Text Classification 被引量:21
17
作者 Mohamed Goudjil Mouloud Koudil +1 位作者 Mouldi Bedda Noureddine Ghoggali 《International Journal of Automation and computing》 EI CSCD 2018年第3期290-298,共9页
Support vector machines(SVMs) are a popular class of supervised learning algorithms, and are particularly applicable to large and high-dimensional classification problems. Like most machine learning methods for data... Support vector machines(SVMs) are a popular class of supervised learning algorithms, and are particularly applicable to large and high-dimensional classification problems. Like most machine learning methods for data classification and information retrieval, they require manually labeled data samples in the training stage. However, manual labeling is a time consuming and errorprone task. One possible solution to this issue is to exploit the large number of unlabeled samples that are easily accessible via the internet. This paper presents a novel active learning method for text categorization. The main objective of active learning is to reduce the labeling effort, without compromising the accuracy of classification, by intelligently selecting which samples should be labeled.The proposed method selects a batch of informative samples using the posterior probabilities provided by a set of multi-class SVM classifiers, and these samples are then manually labeled by an expert. Experimental results indicate that the proposed active learning method significantly reduces the labeling effort, while simultaneously enhancing the classification accuracy. 展开更多
Keywords: text categorization; active learning; support vector machine (SVM); pool-based active learning; pairwise coupling
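The batch-selection step in the abstract above can be illustrated with a minimal sketch. The abstract does not say how the posterior probabilities are turned into a ranking, so the margin criterion used here (gap between the two most probable classes) is an assumption, and `margin`, `select_batch`, and the `pool` values are hypothetical names and data chosen for illustration only.

```python
from typing import List, Sequence


def margin(probs: Sequence[float]) -> float:
    """Gap between the two largest class probabilities (small gap = uncertain)."""
    top_two = sorted(probs, reverse=True)[:2]
    return top_two[0] - top_two[1]


def select_batch(posteriors: List[Sequence[float]], batch_size: int) -> List[int]:
    """Return indices of the `batch_size` pool samples with the smallest margin."""
    ranked = sorted(range(len(posteriors)), key=lambda i: margin(posteriors[i]))
    return ranked[:batch_size]


# Hypothetical posteriors for four unlabeled documents over three classes;
# in practice these would come from the trained classifiers.
pool = [
    [0.90, 0.05, 0.05],
    [0.40, 0.35, 0.25],
    [0.50, 0.30, 0.20],
    [0.34, 0.33, 0.33],
]
queries = select_batch(pool, batch_size=2)  # indices to send to the expert
```

The selected indices would then be labeled by the expert, added to the training set, and the classifiers retrained before the next round.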
Combining Committee-Based Semi-Supervised Learning and Active Learning (cited by: 6)
18
Authors: Mohamed Farouk Abdel Hady, Friedhelm Schwenker. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2010, Issue 4, pp. 681-698 (18 pages)
Many data mining applications have a large amount of data, but labeling data is usually difficult, expensive, or time-consuming, as it requires human experts for annotation. Semi-supervised learning addresses this problem by using unlabeled data together with labeled data in the training process. Co-Training is a popular semi-supervised learning algorithm that assumes that each example is represented by multiple sets of features (views) and that these views are sufficient for learning and independent given the class. However, these assumptions are strong and are not satisfied in many real-world domains. In this paper, a single-view variant of Co-Training, called Co-Training by Committee (CoBC), is proposed, in which an ensemble of diverse classifiers is used instead of redundant and independent views. We introduce a new labeling confidence measure for unlabeled examples based on estimating the local accuracy of the committee members on its neighborhood. Then we introduce two new learning algorithms, QBC-then-CoBC and QBC-with-CoBC, which combine the merits of committee-based semi-supervised learning and active learning. The random subspace method is applied on both C4.5 decision trees and 1-nearest neighbor classifiers to construct the diverse ensembles used for semi-supervised learning and active learning. Experiments show that these two combinations can outperform other non-committee-based ones.
Keywords: data mining; classification; active learning; co-training; semi-supervised learning; ensemble learning; random subspace method; decision tree; nearest neighbor classifier
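The committee-based query step named in the abstract above (QBC, i.e. query by committee) can be sketched as follows. The abstract does not state which disagreement measure the authors use, so vote entropy, a common choice, is assumed here; `vote_entropy`, `most_disputed`, and the `votes` data are hypothetical and only illustrate the idea of querying the example the committee disagrees on most.

```python
import math
from collections import Counter
from typing import Hashable, List, Sequence


def vote_entropy(votes: Sequence[Hashable]) -> float:
    """Entropy of the committee's label votes for one example (0 = full agreement)."""
    counts = Counter(votes)
    total = len(votes)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def most_disputed(committee_votes: List[Sequence[Hashable]]) -> int:
    """Index of the unlabeled example with the highest vote entropy."""
    return max(range(len(committee_votes)),
               key=lambda i: vote_entropy(committee_votes[i]))


# Hypothetical votes from a four-member committee on three unlabeled examples.
votes = [
    ["a", "a", "a", "a"],  # unanimous
    ["a", "b", "a", "b"],  # evenly split
    ["a", "a", "a", "b"],  # one dissenter
]
query_index = most_disputed(votes)
```

Examples on which the committee agrees are the natural candidates for the semi-supervised side (confident pseudo-labels), while high-entropy examples are sent to the human annotator.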
Multi-label active learning by model guided distribution matching (cited by: 4)
19
Authors: Nengneng GAO, Sheng-Jun HUANG, Songcan CHEN. Frontiers of Computer Science (SCIE, EI, CSCD), 2016, Issue 5, pp. 845-855 (11 pages)
Multi-label learning is an effective framework for learning with objects that have multiple semantic labels, and has been successfully applied to many real-world tasks. In contrast with traditional single-label learning, the cost of labeling a multi-label example is rather high, so it becomes an important task to train an effective multi-label learning model with as few labeled examples as possible. Active learning, which actively selects the most valuable data to query their labels, is the most important approach to reduce labeling cost. In this paper, we propose a novel approach, MADM, for batch mode multi-label active learning. On one hand, MADM exploits representativeness and diversity in both the feature and label space by matching the distribution between labeled and unlabeled data. On the other hand, it tends to query predicted positive instances, which are expected to be more informative than negative ones. Experiments on benchmark datasets demonstrate that the proposed approach can reduce the labeling cost significantly.
Keywords: multi-label learning; batch mode active learning; distribution matching
Automatic traceability link recovery via active learning (cited by: 3)
20
Authors: Tian-bao DU, Guo-hua SHEN, (+2 more authors), Zhi-qiu HUANG, Yao-shen YU, De-xiang WU. Frontiers of Information Technology & Electronic Engineering (SCIE, EI, CSCD), 2020, Issue 8, pp. 1217-1225 (9 pages)
Traceability link recovery (TLR) is an important and costly software task that requires humans to establish relationships between source and target artifact sets within the same project. Previous research has proposed establishing traceability links by machine learning approaches. However, current machine learning approaches cannot be applied well to projects without traceability information (links), because training an effective predictive model requires humans to label too many traceability links. To save manpower, we propose a new TLR approach based on active learning (AL), which is called the AL-based approach. We evaluate the AL-based approach on seven commonly used traceability datasets and compare it with an information-retrieval-based approach and a state-of-the-art machine learning approach. The results indicate that the AL-based approach outperforms the other two approaches in terms of F-score.
Keywords: automatic; traceability link recovery; manpower; active learning