Nowadays, machine learning is widely used in malware detection system as a core component. The machine learning algorithm is designed under the assumption that all datasets follow the same underlying data distribution...Nowadays, machine learning is widely used in malware detection system as a core component. The machine learning algorithm is designed under the assumption that all datasets follow the same underlying data distribution. But the real-world malware data distribution is not stable and changes with time. By exploiting the knowledge of the machine learning algorithm and malware data concept drift problem, we show a novel learning evasive botnet architecture and a stealthy and secure C&C mechanism. Based on the email communication channel, we construct a stealthy email-based P2 P-like botnet that exploit the excellent reputation of email servers and a huge amount of benign email communication in the same channel. The experiment results show horizontal correlation learning algorithm is difficult to separate malicious email traffic from normal email traffic based on the volume features and time-related features with enough confidence. We discuss the malware data concept drift and possible defense strategies.展开更多
基金the National Key Basic Research Program of China (Grant: 2013CB834204)the National Natural Science Foundation of China (Grant: 61300242, 61772291)+1 种基金the Tianjin Research Program of Application Foundation and Advanced Technology (Grant: 15JCQNJC41500, 17JCZDJC30500)the Open Project Foundation of Information Security Evaluation Center of Civil Aviation, Civil Aviation University of China (Grant: CAAC-ISECCA- 201701, CAAC-ISECCA-201702)
文摘Nowadays, machine learning is widely used in malware detection system as a core component. The machine learning algorithm is designed under the assumption that all datasets follow the same underlying data distribution. But the real-world malware data distribution is not stable and changes with time. By exploiting the knowledge of the machine learning algorithm and malware data concept drift problem, we show a novel learning evasive botnet architecture and a stealthy and secure C&C mechanism. Based on the email communication channel, we construct a stealthy email-based P2 P-like botnet that exploit the excellent reputation of email servers and a huge amount of benign email communication in the same channel. The experiment results show horizontal correlation learning algorithm is difficult to separate malicious email traffic from normal email traffic based on the volume features and time-related features with enough confidence. We discuss the malware data concept drift and possible defense strategies.