
Weighted online sequential extreme learning machine based on imbalanced sample reconstruction
Abstract: Existing learning algorithms tend to produce biased classifiers on imbalanced online sequential data, which leads to low classification accuracy for the minority class. To address this, a weighted online sequential extreme learning machine based on imbalanced sample reconstruction is proposed. The algorithm starts from the distribution characteristics of the online sequential data and consists of an offline stage and an online stage. In the offline stage, a principal curve is used to construct a confidence region for the minority-class samples; samples within this region are over-sampled to build a balanced sample set that follows the distribution trend of the samples, and the initial model is then established. In the online stage, each sequentially arriving sample is assigned a weight according to its training error, and the network weights are updated dynamically. The method was evaluated on UCI benchmark datasets and on measured meteorological data from Macao. Compared with the Online Sequential Extreme Learning Machine (OS-ELM), the Extreme Learning Machine (ELM) and the Meta-Cognitive Online Sequential Extreme Learning Machine (MCOS-ELM), the proposed algorithm identifies minority-class samples more accurately, while its training time differs little from that of the other three algorithms. These results indicate that the proposed algorithm effectively improves minority-class classification accuracy without affecting the complexity of the algorithm.
Source: Journal of Computer Applications (计算机应用), CSCD, Peking University Core Journal, 2015, No. 6, pp. 1605-1610 (6 pages)
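Funding: National Natural Science Foundation of China (U1204609); China Postdoctoral Science Foundation (2014M550508); Basic and Frontier Technology Research Program of Henan Province (132300410430)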
Keywords: sample reconstruction; Extreme Learning Machine (ELM); principal curve; over-sampling; imbalanced data
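
The online stage described in the abstract (weight each arriving sample by its training error, then update the network weights) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not give the weighting formula, so `error_weight` below is a hypothetical rule, and the initial model is a plain ridge-regularized start rather than the principal-curve over-sampling of the offline stage. The update itself is the standard weighted recursive least-squares step, applied one sample at a time; all function names and parameters are illustrative.

```python
import math
import random


def hidden_output(x, in_weights, biases):
    """Sigmoid hidden-layer output of an ELM with fixed random parameters."""
    out = []
    for w, b in zip(in_weights, biases):
        z = sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
        out.append(1.0 / (1.0 + math.exp(-z)))
    return out


def error_weight(error):
    """Hypothetical rule (assumption): larger training error, larger weight."""
    return 1.0 + abs(error)


def weighted_rls_update(P, beta, h, t, w):
    """One weighted recursive least-squares step for a single sample.

    P    : symmetric L x L matrix (list of rows), inverse of the
           weighted, regularized Gram matrix seen so far
    beta : current output weights (length L)
    h    : hidden-layer output of the new sample (length L)
    t    : scalar target of the new sample
    w    : non-negative weight of the new sample
    """
    L = len(h)
    Ph = [sum(P[i][j] * h[j] for j in range(L)) for i in range(L)]
    denom = 1.0 + w * sum(h[i] * Ph[i] for i in range(L))
    err = t - sum(h[i] * beta[i] for i in range(L))
    new_beta = [beta[i] + w * Ph[i] * err / denom for i in range(L)]
    # Sherman-Morrison rank-one update; relies on P being symmetric.
    new_P = [[P[i][j] - w * Ph[i] * Ph[j] / denom for j in range(L)]
             for i in range(L)]
    return new_P, new_beta


def train_stream(samples, n_hidden=10, ridge=1e-3, seed=0):
    """Feed (x, t) pairs one by one; returns the trained ELM parameters."""
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    in_weights = [[rng.uniform(-1, 1) for _ in range(n_in)]
                  for _ in range(n_hidden)]
    biases = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    # Assumption: start from P = I / ridge and beta = 0 instead of a
    # batch-trained initial model.
    P = [[(1.0 / ridge if i == j else 0.0) for j in range(n_hidden)]
         for i in range(n_hidden)]
    beta = [0.0] * n_hidden
    for x, t in samples:
        h = hidden_output(x, in_weights, biases)
        err = t - sum(h_i * b_i for h_i, b_i in zip(h, beta))
        P, beta = weighted_rls_update(P, beta, h, t, error_weight(err))
    return in_weights, biases, beta
```

For example, `train_stream([([0.0, 1.0], 1.0), ([1.0, 0.0], -1.0)], n_hidden=4)` returns the random input weights, the biases, and four output weights. Processing one sample per step avoids any matrix inversion, at the cost of ignoring the chunk-wise updates that OS-ELM also allows.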