
Predicting hard disk failures on cloud computing platforms using SMART under the COG-OS framework (cited by: 3)

Prediction on hard disk failure of cloud computing framework by using SMART on COG-OS framework
Abstract: To address the unreliability of hard disks on cloud computing platforms, this paper proposes predicting failed hard disks from Self-Monitoring, Analysis and Reporting Technology (SMART) logs, based on the Classification using lOcal clusterinG with Over-Sampling (COG-OS) framework. First, the samples of healthy hard disks are partitioned into multiple disjoint subsets with the DBSCAN or K-means clustering algorithm. These subsets are then combined with the samples of failed hard disks, and the Synthetic Minority Over-sampling TEchnique (SMOTE) is applied so that the overall sample set tends toward balance. Finally, the LIBSVM classification algorithm is used to predict failed hard disks. With parameters tuned, the prediction performance of COG-OS was compared with that of SMOTE + Support Vector Machine (SVM), and the experimental results show that the method is feasible. When K-means is used to partition the healthy-disk samples and LIBSVM with a Radial Basis Function (RBF) kernel is used to predict failed disks, COG-OS improves on SMOTE + SVM in both recall for failed hard disks and overall performance.
Source: Journal of Computer Applications (《计算机应用》), indexed in CSCD and the Peking University Core Journals list, 2014, No. 1, pp. 31-35, 188 (6 pages in total)
Funding: National 863 Program of China (2011AA01A202)
Keywords: Classification using lOcal clusterinG with Over-Sampling (COG-OS) framework; Self-Monitoring, Analysis and Reporting Technology (SMART); K-means; Synthetic Minority Over-sampling TEchnique (SMOTE); LIBSVM; Support Vector Machine (SVM)
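
The data-preparation steps named in the abstract (clustering the healthy-disk samples, then oversampling the failed-disk samples) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy two-feature data, the choice to give each healthy cluster its own class label, and the rule of oversampling the failed class up to the size of the largest healthy cluster are all assumptions made for illustration. The final classification step (LIBSVM in the paper) is left out.

```python
import math
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns one cluster index per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by Euclidean distance.
        assign = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old center if a cluster went empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def smote(minority, n_new, k_neighbors=3, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted((p for p in minority if p is not base),
                            key=lambda p: math.dist(base, p))[:k_neighbors]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


def build_training_set(healthy, failed, k, seed=0):
    """Assumed combination rule: healthy clusters get labels 1..k, failed
    drives get label 0 and are oversampled to the largest cluster's size."""
    assign = kmeans(healthy, k, seed=seed)
    X = list(healthy) + list(failed)
    y = [a + 1 for a in assign] + [0] * len(failed)
    largest = max(assign.count(c) for c in range(k))
    extra = smote(failed, max(0, largest - len(failed)), seed=seed)
    return X + extra, y + [0] * len(extra)


# Hypothetical toy data standing in for two normalised SMART attributes.
rng = random.Random(42)
healthy = [[rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)]
           for cx, cy in [(0, 0), (5, 5)] for _ in range(50)]
failed = [[rng.gauss(2.5, 0.3), rng.gauss(2.5, 0.3)] for _ in range(6)]

X, y = build_training_set(healthy, failed, k=2)
print("samples:", len(X), "failed-class samples after SMOTE:", y.count(0))
```

The resulting `X` and `y` could then be passed to any multi-class SVM trainer. Splitting the majority class into clusters means the classifier faces several moderately sized classes rather than one dominant one, which is the intuition behind combining local clustering with oversampling.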

