Funding: Supported by the National Natural Science Foundation of China (No.61906066); the Zhejiang Provincial Philosophy and Social Science Planning Project (No.21NDJC021Z); Shenzhen Fund for Guangdong Provincial High-level Clinical Key Specialties (No.SZGSP014); Sanming Project of Medicine in Shenzhen (No.SZSM202011015); Shenzhen Science and Technology Planning Project (No.KCXFZ20211020163813019); the Natural Science Foundation of Ningbo City (No.202003N4072); the Postgraduate Research and Innovation Project of Huzhou University (No.2023KYCX52).
Abstract: AIM: To classify high myopic maculopathy (HMM), including tessellated fundus, diffuse chorioretinal atrophy, patchy chorioretinal atrophy, and macular atrophy, using limited datasets while minimizing annotation costs, and to optimize the ALFA-Mix active learning algorithm and apply it to HMM classification. METHODS: The optimized ALFA-Mix algorithm (ALFA-Mix+) was compared with five algorithms, including ALFA-Mix. Four models, including ResNet18, were established. Each algorithm was combined with each of the four models for experiments on the HMM dataset. Each experiment consisted of 20 active learning rounds, with 100 images selected per round. The algorithm was evaluated by counting the number of rounds in which ALFA-Mix+ outperformed the other algorithms. Finally, this study employed six models, including EfficientFormer, to classify HMM. The best-performing model was selected as the baseline and combined with ALFA-Mix+ to achieve satisfactory classification results with a small dataset. RESULTS: ALFA-Mix+ outperformed the other algorithms in an average of 16.6, 14.75, 16.8, and 16.7 rounds in terms of accuracy, sensitivity, specificity, and Kappa value, respectively. Experiments on classifying HMM with several advanced deep learning models used a complete training set of 4252 images. EfficientFormer achieved the best results, with an accuracy, sensitivity, specificity, and Kappa value of 0.8821, 0.8334, 0.9693, and 0.8339, respectively. Combining ALFA-Mix+ with EfficientFormer yielded an accuracy, sensitivity, specificity, and Kappa value of 0.8964, 0.8643, 0.9721, and 0.8537, respectively. CONCLUSION: ALFA-Mix+ reduces the number of samples required without compromising accuracy. It outperforms the other algorithms in more rounds of experiments and selects valuable samples more effectively. In HMM classification, combining ALFA-Mix+ with EfficientFormer enhances model performance, further demonstrating the effectiveness of ALFA-Mix+.
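The four reported metrics can all be derived from a multi-class confusion matrix. The sketch below is illustrative only: the function name `classification_metrics` is hypothetical, and macro-averaging of sensitivity and specificity over classes is an assumption, since the abstract does not state how the per-class values were aggregated.

```python
def classification_metrics(cm):
    """Return (accuracy, macro sensitivity, macro specificity, Cohen's kappa).

    cm[i][j] counts samples whose true class is i and predicted class is j.
    Assumes every class has at least one true sample and one true negative.
    """
    k = len(cm)
    total = sum(sum(row) for row in cm)
    row_sums = [sum(row) for row in cm]                             # true counts per class
    col_sums = [sum(cm[i][j] for i in range(k)) for j in range(k)]  # predicted counts per class
    accuracy = sum(cm[i][i] for i in range(k)) / total

    sens, spec = [], []
    for c in range(k):
        tp = cm[c][c]
        fn = row_sums[c] - tp
        fp = col_sums[c] - tp
        tn = total - tp - fn - fp
        sens.append(tp / (tp + fn))
        spec.append(tn / (tn + fp))

    # Agreement expected by chance, from the marginal distributions.
    expected = sum(row_sums[c] * col_sums[c] for c in range(k)) / total ** 2
    kappa = (accuracy - expected) / (1 - expected)
    return accuracy, sum(sens) / k, sum(spec) / k, kappa
```

For a four-class HMM problem, `cm` would be a 4x4 matrix with one row per maculopathy category.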
Funding: Supported by the National High Technology Research and Development Programme (No.2007AA12Z227) and the National Natural Science Foundation of China (No.40701146).
Abstract: Nonlinearity, fuzziness, and scarce labeled data were rarely considered in traditional remote sensing image classification. This paper proposes a semi-supervised kernel fuzzy C-means (SSKFCM) algorithm to overcome these shortcomings. SSKFCM is obtained by introducing a kernel method and a semi-supervised learning technique into the standard fuzzy C-means (FCM) algorithm. A set of multispectral images from the Beijing-1 micro-satellite is classified with several algorithms: FCM, kernel FCM (KFCM), semi-supervised FCM (SSFCM), and SSKFCM. The classification results are evaluated with corresponding indexes. The results indicate that SSKFCM significantly improves the classification accuracy of remote sensing images compared with the other algorithms.
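To make the idea of combining a kernel with semi-supervision concrete, here is a minimal sketch of one membership update in a Gaussian-kernel FCM. It is not the authors' formulation: the function names, the Gaussian kernel, and the hard clamping of labeled points to their known cluster are all assumptions chosen for illustration.

```python
import math

def gaussian_kernel(x, v, sigma=1.0):
    """Gaussian (RBF) kernel between a point x and a cluster center v."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, v))
    return math.exp(-d2 / (2 * sigma ** 2))

def kfcm_memberships(points, centers, labels=None, m=2.0, sigma=1.0):
    """One fuzzy membership update.

    labels[k] is a cluster index for a labeled point and None for an
    unlabeled one (hypothetical convention). Labeled points are clamped.
    """
    if labels is None:
        labels = [None] * len(points)
    memberships = []
    for x, lab in zip(points, labels):
        if lab is not None:
            # Semi-supervised step (assumed): keep the known label fixed.
            memberships.append([1.0 if i == lab else 0.0 for i in range(len(centers))])
            continue
        # For a Gaussian kernel the feature-space squared distance is
        # proportional to 1 - K(x, v); clamp to avoid division by zero.
        dists = [max(1.0 - gaussian_kernel(x, v, sigma), 1e-12) for v in centers]
        weights = [d ** (-1.0 / (m - 1.0)) for d in dists]
        s = sum(weights)
        memberships.append([w / s for w in weights])
    return memberships
```

A full algorithm would alternate this step with a center update until the memberships stop changing.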
Funding: Project supported by the National Natural Science Foundation of China (No.61903352); the Natural Science Foundation of Zhejiang Province, China (No.LQ19F030007); the Project of Department of Education of Zhejiang Province, China (No.Y202044960); the China Postdoctoral Science Foundation (No.2020M671721); the Fundamental Research Funds for the Provincial Universities of Zhejiang, China (Nos.2021YW18, 2021YW80, and 2022YW96); the Innovative Team Project of Fujian Institute of Metrology, China.
Abstract: Fault classification is an indispensable part of process monitoring, and its performance relies heavily on the sufficiency of process knowledge. However, data labels are often difficult to acquire because of limited sampling conditions or expensive laboratory analysis, which may degrade classification performance. To handle this dilemma, a new semi-supervised fault classification strategy is proposed in which enhanced active learning evaluates the value of each unlabeled sample with respect to a specific labeled dataset. Unlabeled samples with large values serve as supplementary information for the training dataset. In addition, we introduce several reasonable indexes and criteria, so that human labeling interference is greatly reduced. Finally, the fault classification effectiveness of the proposed method is evaluated using a numerical example and the Tennessee Eastman process.
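The abstract does not define how the "value" of an unlabeled sample is computed. As a stand-in, the sketch below ranks samples by predictive entropy, a common uncertainty measure in active learning; the function names and the choice of entropy are assumptions, not the paper's criterion.

```python
import math

def entropy(probs):
    """Shannon entropy of a class-probability vector (natural log)."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_most_informative(unlabeled_probs, budget):
    """Return indices of the `budget` samples with the highest entropy.

    unlabeled_probs[i] is the current classifier's predicted class
    distribution for unlabeled sample i.
    """
    ranked = sorted(range(len(unlabeled_probs)),
                    key=lambda i: entropy(unlabeled_probs[i]),
                    reverse=True)
    return ranked[:budget]
```

The selected indices would then be labeled (or pseudo-labeled) and appended to the training set before the classifier is retrained.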
Funding: Partially supported by the Transregional Collaborative Research Centre SFB/TRR 62 "Companion-Technology for Cognitive Technical Systems" funded by the German Research Foundation (DFG); supported by a scholarship of the German Academic Exchange Service (DAAD).
Abstract: Many data mining applications have a large amount of data, but labeling it is usually difficult, expensive, or time-consuming because it requires human experts for annotation. Semi-supervised learning addresses this problem by using unlabeled data together with labeled data in the training process. Co-Training is a popular semi-supervised learning algorithm that assumes each example is represented by multiple sets of features (views) and that these views are sufficient for learning and independent given the class. However, these assumptions are strong and are not satisfied in many real-world domains. In this paper, a single-view variant of Co-Training, called Co-Training by Committee (CoBC), is proposed, in which an ensemble of diverse classifiers is used instead of redundant and independent views. We introduce a new labeling confidence measure for unlabeled examples based on estimating the local accuracy of the committee members in the neighborhood of each example. We then introduce two new learning algorithms, QBC-then-CoBC and QBC-with-CoBC, which combine the merits of committee-based semi-supervised learning and active learning. The random subspace method is applied to both C4.5 decision trees and 1-nearest neighbor classifiers to construct the diverse ensembles used for semi-supervised learning and active learning. Experiments show that these two combinations can outperform other non-committee-based ones.
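Query by Committee (QBC) selects the examples on which committee members disagree most. A minimal sketch using vote entropy as the disagreement measure follows; vote entropy is a standard QBC criterion, but whether the paper uses it is not stated, and the function names are hypothetical.

```python
import math
from collections import Counter

def vote_entropy(votes):
    """Entropy of the label votes cast by the committee for one example."""
    n = len(votes)
    return -sum((c / n) * math.log(c / n) for c in Counter(votes).values())

def qbc_query(committee_votes, budget):
    """Return indices of the `budget` examples with the most disagreement.

    committee_votes[i] lists the label predicted by each committee member
    for unlabeled example i.
    """
    ranked = sorted(range(len(committee_votes)),
                    key=lambda i: vote_entropy(committee_votes[i]),
                    reverse=True)
    return ranked[:budget]
```

Examples with unanimous votes score zero and are better candidates for confident self-labeling than for expert queries.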
Funding: Supported in part by the National Natural Science Foundation of China (NSFC) (Nos.62033010, 62102134); in part by the Leading talents of science and technology in the Central Plain of China (No.224200510004); in part by the Key R&D projects in Henan Province, China (No.231111222600); in part by the Aeronautical Science Foundation of China (No.2019460T5001); in part by the Scientific and Technological Innovation Talents of Colleges and Universities in Henan Province, China (No.22HASTIT014).
Abstract: Recently, the Cooperative Training Algorithm (CTA), a well-known Semi-Supervised Learning (SSL) technique, has garnered significant attention in the field of image classification. However, traditional CTA approaches face challenges such as high computational complexity and low classification accuracy. To overcome these limitations, we present a novel approach called Weighted fusion based Cooperative Training Algorithm (W-CTA), which leverages the cooperative training technique and unlabeled data to enhance classification performance. Moreover, we introduce the K-means Cooperative Training Algorithm (km-CTA) to prevent the occurrence of local optima during the training phase. Finally, we conduct various experiments to verify the performance of the proposed methods. Experimental results show that W-CTA and km-CTA are effective and efficient on the CIFAR-10 dataset.
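The abstract does not specify how W-CTA weights its classifiers. The sketch below shows one plausible form of weighted fusion, in which two classifiers' class probabilities are combined with weights proportional to their validation accuracy; the weighting rule and all names are assumptions for illustration.

```python
def weighted_fusion(prob_a, prob_b, acc_a, acc_b):
    """Fuse two class-probability vectors and return (fused, predicted label).

    acc_a and acc_b are validation accuracies used as fusion weights
    (an assumed weighting scheme, not taken from the paper).
    """
    w_a = acc_a / (acc_a + acc_b)
    w_b = 1.0 - w_a
    fused = [w_a * pa + w_b * pb for pa, pb in zip(prob_a, prob_b)]
    label = max(range(len(fused)), key=fused.__getitem__)
    return fused, label
```

In a cooperative training loop, the fused label of a confidently predicted unlabeled image would be added to the training set of both classifiers.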
Funding: Supported by the Mechanism Socialist Method and Higher Intelligence Theory of the National Natural Science Fund Projects (60873001).
Abstract: This paper proposes a novel graph-based transductive learning algorithm based on manifold regularization. First, manifold regularization is introduced into a probabilistic discriminant model for the semi-supervised classification task. A variant of the expectation maximization (EM) algorithm is then derived to solve the optimization problem, which leads to an iterative algorithm. Although the method is developed in a probabilistic framework, it requires no assumption about the specific form of the data distribution. In addition, the crucial updating formula has a closed form. The method was evaluated for text categorization on two standard datasets, 20 Newsgroups and Reuters-21578. Experiments show that the approach outperforms state-of-the-art graph-based transductive learning methods.
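Manifold regularization typically penalizes predictions that differ across strongly connected nodes of a similarity graph. The sketch below computes that standard smoothness term in two equivalent ways; it illustrates the regularizer only, not the paper's probabilistic model or EM updates, and the function names are hypothetical.

```python
def laplacian_penalty(W, f):
    """Smoothness penalty 0.5 * sum_ij W[i][j] * (f[i] - f[j])**2.

    W is a symmetric edge-weight matrix and f holds one score per node.
    """
    n = len(f)
    return 0.5 * sum(W[i][j] * (f[i] - f[j]) ** 2
                     for i in range(n) for j in range(n))

def laplacian_quadratic_form(W, f):
    """Same quantity written as f^T L f with L = D - W (D = degree matrix)."""
    n = len(f)
    degree = [sum(W[i]) for i in range(n)]
    return (sum(degree[i] * f[i] ** 2 for i in range(n))
            - sum(W[i][j] * f[i] * f[j] for i in range(n) for j in range(n)))
```

A score vector that is constant on each connected cluster yields a small penalty, which is what drives labels to spread along the graph.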
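Abstract: An active learning algorithm for imbalanced data, Balance adjustment Active Learning (Ba-AL), is proposed. At the end of each iteration, the balance of the training set is checked; an imbalanced training set is clustered and redundant samples are removed, which keeps the training set balanced and thereby improves classification performance. Simulation results on UCI datasets and a real remote sensing image dataset show that the method achieves good classification performance, needs fewer training samples to reach the target accuracy, runs more efficiently, and scores better on the data utilization index.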
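The per-iteration balance check can be sketched as follows. The imbalance ratio, the threshold `max_ratio`, and the function name are assumptions; the abstract does not give the balance measure, and the clustering step that decides which samples are redundant is omitted here.

```python
from collections import Counter

def balance_report(labels, max_ratio=1.5):
    """Return (imbalance ratio, {class: number of samples to remove}).

    A class is over-represented when it holds more than `max_ratio` times
    the size of the smallest class (an assumed criterion).
    """
    counts = Counter(labels)
    smallest = min(counts.values())
    ratio = max(counts.values()) / smallest
    cap = int(smallest * max_ratio)
    excess = {c: n - cap for c, n in counts.items() if n > cap}
    return ratio, excess
```

In a Ba-AL-style loop, a non-empty `excess` would trigger clustering of the over-represented class so that only redundant samples are dropped.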