Improving Scalability of Cloud Monitoring Through PCA-Based Clustering of Virtual Machines (cited by: 3)

Abstract: Cloud computing has recently emerged as a leading paradigm to allow customers to run their applications in virtualized large-scale data centers. Existing solutions for monitoring and management of these infrastructures consider virtual machines (VMs) as independent entities with their own characteristics. However, these approaches suffer from scalability issues due to the increasing number of VMs in modern cloud data centers. We claim that scalability issues can be addressed by leveraging the similarity among VMs in terms of resource usage patterns. In this paper we propose an automated methodology to cluster VMs starting from the usage of multiple resources, assuming no knowledge of the services executed on them. The innovative contribution of the proposed methodology is the use of the statistical technique known as principal component analysis (PCA) to automatically select the most relevant information to cluster similar VMs. We apply the methodology to two case studies, a virtualized testbed and a real enterprise data center. In both case studies, the automatic data selection based on PCA allows us to achieve high performance, with a percentage of correctly clustered VMs between 80% and 100% even for short time series (1 day) of monitored data. Furthermore, we estimate the potential reduction in the amount of collected data to demonstrate how our proposal may address the scalability issues related to monitoring and management in cloud computing data centers.

Authors: Claudia Canali, Riccardo Lancellotti

Affiliations: IEEE, Department of Information Engineering, ACM

Source: Journal of Computer Science & Technology (计算机科学技术学报, English edition), 2014, Issue 1, pp. 38-52 (15 pages); indexed in SCIE, EI, CSCD

Keywords: cloud computing, resource monitoring, principal component analysis, k-means clustering

Classification: TP317 [Automation and Computer Technology: Computer Software and Theory]
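
The pipeline outlined in the abstract (reduce multi-resource usage data with PCA, then group similar VMs with k-means) can be illustrated with a minimal sketch. This is not the authors' implementation: this record does not specify how the paper builds its feature vectors, so the per-VM features, the synthetic data, the 0.9 explained-variance threshold, and the helper names `pca_reduce`, `kmeans`, and `purity` are all assumptions made for illustration.

```python
import numpy as np

def pca_reduce(X, var_threshold=0.9):
    """Project rows of X onto the fewest principal components whose
    cumulative explained variance reaches var_threshold (assumed value)."""
    Xc = X - X.mean(axis=0)                      # center only: all metrics share one unit here
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = S**2 / np.sum(S**2)              # variance ratio per component
    k = int(np.searchsorted(np.cumsum(explained), var_threshold)) + 1
    k = min(k, len(S))                           # guard against rounding at threshold 1.0
    return Xc @ Vt[:k].T, explained[:k]

def kmeans(Z, k, n_iter=50):
    """Plain Lloyd's algorithm with a deterministic farthest-point start."""
    centers = [Z[0]]
    for _ in range(1, k):
        d = np.min([np.sum((Z - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(Z[np.argmax(d)])          # next center: point farthest from chosen ones
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            Z[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def purity(labels, truth):
    """Fraction of VMs that fall in the majority class of their cluster."""
    hits = sum(np.bincount(truth[labels == j]).max() for j in np.unique(labels))
    return hits / len(truth)

# Synthetic stand-in for monitored data (hypothetical): 40 VMs, 4 utilization
# metrics in [0, 1], two service classes that differ on the first three metrics.
rng = np.random.default_rng(0)
profiles = np.array([[0.8, 0.2, 0.1, 0.3],
                     [0.2, 0.7, 0.6, 0.3]])
truth = np.repeat([0, 1], 20)
X = profiles[truth] + rng.normal(0.0, 0.03, size=(40, 4))

Z, explained = pca_reduce(X)      # PCA keeps only the informative directions
labels = kmeans(Z, k=2)           # cluster VMs in the reduced space
print("components kept:", Z.shape[1])
print("purity:", purity(labels, truth))
```

In this toy setting the fourth metric carries no class information, so PCA discards it along with the noise directions before clustering. Real monitoring data would need a feature-extraction step and a choice of the number of clusters, neither of which is covered here.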

