Abstract
This paper proposes classifying imbalanced data by combining single-side sampling Bagging (SSBagging) with the basic idea of LPU (learning from positive and unlabeled examples). The main steps are as follows: all unlabeled instances are first labeled negative and used, together with the positive examples, to train an SSBagging learner; that learner then classifies the unlabeled instances to obtain reliable negatives (RN); finally, the positive examples and RN are used to train an SSBagging learner. Classification with the Rocchio method and the EM algorithm, proposed by Liu et al., is a representative LPU technique. The paper compares this LPU technique with the proposed method and finds that the proposed method performs better when the data are strongly imbalanced.
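The three steps in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not specify the base learner, the sampling scheme, or the rule for selecting RN, so the nearest-centroid base learner, the undersampling of the negative side without replacement, the `rn_threshold` parameter, and all function names below are assumptions made for the example.

```python
import random

def centroid(rows):
    """Component-wise mean of a non-empty list of numeric vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train_ss_bagging(pos, neg, n_estimators=15, seed=0):
    """Assumed reading of single-side sampling Bagging: each base learner
    keeps all positives and draws a random subsample (without replacement)
    of the negative side, at most as large as the positive set."""
    rng = random.Random(seed)
    k = min(len(pos), len(neg))
    pos_centroid = centroid(pos)
    models = []
    for _ in range(n_estimators):
        neg_sample = rng.sample(neg, k)
        # Hypothetical base learner: a pair of class centroids.
        models.append((pos_centroid, centroid(neg_sample)))
    return models

def positive_votes(models, x):
    """Fraction of base learners that place x closer to the positive centroid."""
    votes = sum(sq_dist(x, cp) < sq_dist(x, cn) for cp, cn in models)
    return votes / len(models)

def predict(models, x):
    """Majority vote of the ensemble; True means positive."""
    return positive_votes(models, x) >= 0.5

def two_step_pu(pos, unlabeled, rn_threshold=0.0, seed=0):
    """Two-step procedure described in the abstract.
    rn_threshold is an assumed selection rule, not taken from the paper."""
    # Step 1: label every unlabeled instance negative and train on pos + unlabeled.
    step1 = train_ss_bagging(pos, unlabeled, seed=seed)
    # Step 2: unlabeled instances that (almost) no learner calls positive become RN.
    rn = [u for u in unlabeled if positive_votes(step1, u) <= rn_threshold]
    if not rn:
        raise ValueError("no reliable negatives found; raise rn_threshold")
    # Step 3: retrain on the positives and the reliable negatives only.
    final = train_ss_bagging(pos, rn, seed=seed + 1)
    return final, rn
```

Calling `two_step_pu(pos, unlabeled)` with lists of equal-length numeric tuples returns the final ensemble and the selected RN; `predict(final, x)` then labels a new instance. Subsampling only the negative side keeps each base learner's training set balanced, which is one plausible way to handle the imbalance the abstract targets.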
Source
《计算机工程》
EI
CAS
CSCD
PKU Core Journal (北大核心)
2006, No. 23, pp. 216-217, 223 (3 pages in total)
Computer Engineering
Funding
Supported by the National Pre-research Foundation of China (No. 514950307)