Funding: Supported by the National High Technology Research and Development Programme of China (No. 2007AA01Z401) and the National Natural Science Foundation of China (No. 90718003, 60973027).
Abstract: To predict failures in distributed systems without manual intervention, a novel approach to failure feature analysis and extraction is proposed. Whereas traditional methods focus on building heuristic rules or models, the autonomic prediction approach analyzes the nonlinear correlation of failure features by recognizing failure patterns. Failure data are sorted according to this nonlinear correlation, and a failure signature is proposed for autonomic prediction. In addition, a manifold learning algorithm, supervised locally linear embedding, is applied for feature extraction. Experimental results based on runtime monitoring of failure metrics indicate that the proposed method performs better in both correlation recognition precision and feature extraction quality, and can therefore be used to design efficient autonomic failure prediction for distributed systems.
Funding: Supported by the National Natural Science Foundation of China (No. 60603044), the China Postdoctoral Science Foundation (No. 20070411179), and the Program for Changjiang Scholars and Innovative Research Team in University of China (No. IRT0652).
Abstract: Storing and querying XML (eXtensible Markup Language) data in relational form makes it possible to exploit the services offered by modern relational database management systems (RDBMSs). Because of the structural complexity of XML, many equivalent relational mapping schemes exist for the same XML data and queries. In this paper, we propose the adaptive XML to relational mapping (AX2RM) system, which treats finding the optimal XML to relational (X2R) mapping as four separate but correlated procedures: logical database design, data scale estimation, workload transformation, and physical database design. We view the whole process as an autonomic computing problem and formalize the adaptive X2R mapping problem. The search space of each procedure is investigated individually, and five approaches for finding the optimal mapping are studied. We propose an integrated approach with greedy pruning (IT-GP), which views the mapping procedures as a whole and exploits heuristic rules in each procedure to prune impossible mappings as early as possible. Evaluation of these approaches shows the validity and high efficiency of IT-GP.
Funding: This work was supported by the Hi-Tech Research and Development Program of China (2007AA01Z401) and the National Natural Science Foundation of China (90718003, 60973027).
Abstract: This article investigates autonomic failure prediction in large-scale distributed systems, using nonlinear dimensionality reduction to extract failure features automatically. Most existing methods for failure prediction focus on building prediction models or heuristic rules by discovering failure patterns, but the feature extraction that precedes failure pattern recognition is rarely considered, owing to the increasing complexity of modern distributed systems. In this work, a novel performance-centric approach to automating failure prediction is proposed based on manifold learning (ML). The ML algorithm named supervised locally linear embedding (SLLE) is applied for feature extraction. To generalize the dimensionality reduction mapping, a nonlinear mapping approximation and optimization solution is also proposed. For the experiments, a file transfer test bed with fault injection is developed that gathers multilevel performance metrics transparently. Based on runtime monitoring of these metrics, the SLLE method can automatically predict more than 50% of the central processing unit (CPU) and memory failures, and around 70% of the network failures.
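The abstract does not say how SLLE uses the failure labels, so the following is only a minimal sketch of the supervised step as it is commonly formulated for SLLE, not the authors' implementation. It assumes that the pairwise distance between samples with different class labels is inflated by `alpha * max_distance` before each sample's nearest neighbours are chosen, after which ordinary locally linear embedding would proceed. The function names, the toy metric values, and the labels `"ok"` and `"failure"` are all hypothetical.

```python
import math


def supervised_distances(points, labels, alpha):
    """Pairwise Euclidean distances, inflated for pairs with different labels.

    Assumed SLLE rule: d'(i, j) = d(i, j) + alpha * max(d) if labels differ.
    alpha = 0 gives plain (unsupervised) LLE distances.
    """
    n = len(points)
    dist = [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]
    d_max = max(max(row) for row in dist)
    return [
        [dist[i][j] + (alpha * d_max if labels[i] != labels[j] else 0.0)
         for j in range(n)]
        for i in range(n)
    ]


def nearest_neighbors(dist, k):
    """Indices of the k nearest neighbours of every sample (self excluded)."""
    n = len(dist)
    return [
        sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])[:k]
        for i in range(n)
    ]


# Hypothetical 2-D performance metrics with known outcomes.
points = [(0.0, 0.0), (1.0, 0.0), (1.2, 0.0), (5.0, 0.0)]
labels = ["ok", "failure", "ok", "failure"]

unsupervised = nearest_neighbors(supervised_distances(points, labels, 0.0), k=1)
supervised = nearest_neighbors(supervised_distances(points, labels, 1.0), k=1)
print(unsupervised)  # neighbours chosen by geometry alone
print(supervised)    # neighbours pulled toward samples of the same class
```

With `alpha = 1.0`, every sample in this toy data picks its neighbour from its own class, which is the effect that lets the later embedding separate failure from non-failure samples.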
Funding: Supported in part by the National Science Foundation of USA under Grant Nos. CNS-0844983 (CAREER Award) and CNS-1217979, and the National Natural Science Foundation of China under Grant No. 61328203.
Abstract: Modern datacenter servers hosting popular Internet services face significant and multifaceted challenges in performance and power control. User-perceived performance results from the interaction of complex workloads with a very complex underlying system. The highly dynamic and bursty workloads of Internet services fluctuate over multiple time scales, which has a significant impact on the processing and power demands of datacenter servers. High-density servers apply virtualization technology for capacity planning and system manageability. Such virtualized computer systems are increasingly large and complex. This paper surveys representative approaches to autonomic performance and power control on virtualized servers, which control the quality of service provided by virtualized resources, improve the energy efficiency of the underlying system, and relieve human operators of the burden of complex system management. It then presents three self-adaptive resource management techniques, based on machine learning and control, for percentile-based response time assurance, non-intrusive energy-efficient performance isolation, and joint performance and power guarantees on virtualized servers. The techniques were implemented and evaluated in a testbed of virtualized servers hosting benchmark applications. Finally, two research trends are identified and discussed for sustainable cloud computing in green datacenters.
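As an illustration of what a percentile-based response time target means, the sketch below computes a percentile with the nearest-rank method and checks it against a target. It is not the controller described in the paper. The sample values, the 200 ms target, and the function names are hypothetical.

```python
import math


def percentile_nearest_rank(samples, p):
    """Return the p-th percentile (0 < p <= 100) using the nearest-rank method."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def meets_target(samples, p, target_ms):
    """True if the p-th percentile response time is within the target."""
    return percentile_nearest_rank(samples, p) <= target_ms


# Hypothetical response times (ms) from one monitoring interval.
response_ms = [120, 95, 110, 480, 105, 130, 100, 115, 98, 125]

print(sum(response_ms) / len(response_ms))    # mean response time
print(percentile_nearest_rank(response_ms, 90))
print(percentile_nearest_rank(response_ms, 95))
print(meets_target(response_ms, 95, 200))
```

In this sample the 90th percentile is within 200 ms but the 95th is not, because a single slow request dominates the tail. A percentile target exposes that tail latency, while the mean hides it.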
Funding: Project (No. 2011-0002534) supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education, Science and Technology.
Abstract: Since Service-Oriented Architecture (SOA) reveals the black box nature of services, heterogeneity, service dynamism, and service evolvability, managing services is known to be a challenging problem. Autonomic computing (AC) is a way of designing systems that can manage themselves without direct human intervention. Hence, applying the key disciplines of AC to service management is appealing. A key task of service management is to identify probable causes for detected symptoms and to devise actuation methods that can remedy those causes. In SOA, there are a number of target elements for service remedies, and a number of causes can be associated with each target element. However, there is not yet a comprehensive, widely accepted taxonomy of causes. The lack of a cause taxonomy limits the possibility of remedying problems in an autonomic way. In this paper, we first present a meta-model, extract all target elements for service fault management, and present a computing model for autonomously managing service faults. Then we define a fault taxonomy for each target element and the inter-relationships among the elements. Finally, we show a prototype implementation using the cause taxonomy and conduct experiments with the prototype to validate its applicability and effectiveness.