Funding: Supported by grants from the Natural Science Foundation of China (Project No. 61375045), the Joint Astronomical Fund of the National Natural Science Foundation of China and the Chinese Academy of Sciences (Project No. U1531242), and the Beijing Natural Science Foundation (4142030).
Abstract: Over the past two decades, software aging has been studied by both the academic and industrial communities. Many scholars have used analytical methods or time series to model the software aging process, while machine learning has been shown to be a very promising technique for forecasting the software state (normal or aging). In this paper, we propose a method that offers practical guidance for forecasting software aging with machine learning algorithms. First, we collected data from a running commercial web server and preprocessed them. Second, a feature selection algorithm was applied to find a subset of the model parameters. Third, a time series model was used to predict the values of the selected parameters in advance. Fourth, several machine learning algorithms were used to model the software aging process and to predict software aging. Fifth, we used sensitivity analysis to determine how strongly the outcomes change when the input variables change. Finally, we applied our method to an IIS web server. Analysis of the experimental results shows that the proposed method can predict software aging in the early stage of the system development life cycle.
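The forecast-then-classify steps described in this abstract can be illustrated with a minimal sketch. The abstract does not name the time series model or the machine learning algorithms, so the code below swaps in deliberately simple stand-ins: a least-squares linear trend for the forecasting step and a fixed threshold for the normal/aging decision. The metric (free memory in MB), the sample values, the threshold, and all function names are hypothetical and are not taken from the paper.

```python
from statistics import mean


def fit_linear_trend(values):
    """Fit a least-squares line through (0, v0), (1, v1), ...

    Stand-in for the paper's unspecified time series model.
    Returns (slope, intercept).
    """
    xs = range(len(values))
    x_bar, y_bar = mean(xs), mean(values)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def forecast(values, steps_ahead):
    """Extrapolate the fitted trend steps_ahead samples past the last one."""
    slope, intercept = fit_linear_trend(values)
    return slope * (len(values) - 1 + steps_ahead) + intercept


def classify_state(predicted_free_memory_mb, aging_threshold_mb):
    """Stand-in for the paper's classifier: a plain threshold rule."""
    if predicted_free_memory_mb < aging_threshold_mb:
        return "aging"
    return "normal"


# Hypothetical monitoring samples (free memory in MB), not data from the paper.
free_memory_mb = [1000.0, 990.0, 980.0, 970.0, 960.0]
predicted = forecast(free_memory_mb, steps_ahead=10)
state = classify_state(predicted, aging_threshold_mb=900.0)
print(predicted, state)
```

With these invented samples the trend loses 10 MB per step, so the value forecast ten steps ahead falls below the assumed threshold and the state is labelled as aging. A real study would replace both stand-ins with the models selected in the paper.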
Abstract: In this paper, we present Real-Time Flow Filter (RTFF), a system that adopts a middle ground between coarse-grained volume anomaly detection and deep packet inspection. RTFF was designed to scale to the high-volume data feeds that are common in large Tier-1 ISP networks and to provide rich, timely information on observed attacks. It is a software solution designed to run on off-the-shelf hardware platforms, and it incorporates a scalable data processing architecture along with lightweight analysis algorithms that make it suitable for deployment in large networks. RTFF also uses state-of-the-art machine learning algorithms to construct attack models that can be used to detect as well as predict attacks.
Funding: The National Key Basic Research Program of China (Grant: 2013CB834204); the National Natural Science Foundation of China (Grant: 61300242, 61772291); the Tianjin Research Program of Application Foundation and Advanced Technology (Grant: 15JCQNJC41500, 17JCZDJC30500); the Open Project Foundation of Information Security Evaluation Center of Civil Aviation, Civil Aviation University of China (Grant: CAAC-ISECCA-201701, CAAC-ISECCA-201702).
Abstract: Machine learning is now widely used as a core component of malware detection systems. Machine learning algorithms are designed under the assumption that all datasets follow the same underlying data distribution, but the real-world malware data distribution is not stable and changes over time. By exploiting knowledge of the machine learning algorithm and of the malware data concept drift problem, we present a novel learning-evasive botnet architecture and a stealthy and secure C&C mechanism. Based on the email communication channel, we construct a stealthy email-based P2P-like botnet that exploits the excellent reputation of email servers and the huge amount of benign email communication in the same channel. The experimental results show that it is difficult for a horizontal correlation learning algorithm to separate malicious email traffic from normal email traffic with enough confidence based on volume features and time-related features. We discuss malware data concept drift and possible defense strategies.
Funding: The FIST project of DST, Government of India, which sponsored the work on web engineering and cloud-based computing.
Abstract: System analysts often use software fault prediction models to identify fault-prone modules during the design phase of the software development life cycle. The models help predict faulty modules based on the software metrics that are input to them. In this study, we consider 20 types of metrics to develop a model using an extreme learning machine associated with various kernel methods. We evaluate the effectiveness of the model using a proposed framework based on the cost and efficiency of the testing phases. The evaluation is carried out through case studies of 30 object-oriented software systems. Experimental results demonstrate that applying a fault prediction model is suitable for projects in which the percentage of faulty classes is below a certain threshold, which depends on the efficiency of fault identification (low: 47.28%; median: 39.24%; high: 25.72%). We consider nine feature selection techniques to remove irrelevant metrics and to select the best set of source code metrics for fault prediction.
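The suitability rule stated in this abstract can be written out as a small worked example. The three threshold values come from the abstract; the function name, the strict reading of "below", and the sample project are assumptions made only for illustration.

```python
# Thresholds on the percentage of faulty classes, as reported in the abstract,
# keyed by the efficiency of fault identification.
FAULTY_CLASS_THRESHOLDS = {"low": 47.28, "median": 39.24, "high": 25.72}


def fault_prediction_suitable(faulty_classes, total_classes, efficiency):
    """Return True if the share of faulty classes is strictly below the
    threshold for the given efficiency level (hypothetical helper)."""
    faulty_percentage = 100.0 * faulty_classes / total_classes
    return faulty_percentage < FAULTY_CLASS_THRESHOLDS[efficiency]


# Hypothetical project: 30 faulty classes out of 100 (30% faulty).
results = {
    level: fault_prediction_suitable(30, 100, level)
    for level in FAULTY_CLASS_THRESHOLDS
}
print(results)
```

For this invented project, 30% lies below the low and median thresholds but above the high-efficiency threshold, so under this reading fault prediction would be judged suitable only in the first two cases.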