Abstract
A botnet is a network in which a large number of hosts are infected with bot malware through one or more propagation methods, so that a controller exercises one-to-many control over the infected hosts through command-and-control (C2) servers. Attackers operating a botnet usually use multiple domain names to connect to the C2 server in order to control the victim hosts. These domain names are generally hard-coded in the malware so that the attacker can conveniently change them. To evade blocking, the domain names are usually generated by domain generation algorithms (DGAs). Common machine-learning approaches to detecting DGA domain names suffer from insufficient samples and poor generality. To address these problems, this paper builds on a study of a large number of DGAs to further improve the malicious (black) and benign (white) sample sets, and uses text analysis combined with GaussianHMM, LSTM, and BernoulliNB models to extract features with general discriminative power, constructing a general-purpose ensemble learning method for DGA detection with a low-risk structure.
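To make the two ideas in the abstract concrete, the following is a minimal sketch in Python and not the authors' implementation. It shows a toy, seed-and-date-driven domain generator and a few simple text features of the kind a detector might compute from a domain label. The function names (`toy_dga`, `char_entropy`, `vowel_ratio`, `text_features`), the seed string, and the choice of features are all assumptions made for illustration; the paper's actual features come from its GaussianHMM, LSTM, and BernoulliNB models, which are not reproduced here.

```python
import hashlib
import math
from collections import Counter
from datetime import date


def toy_dga(seed, day, count=5, length=12, tld=".com"):
    """Hypothetical DGA: derive `count` pseudo-random domains from a seed and a date.

    Malware and attacker can both run this with the same inputs and obtain
    the same list, so no fixed list of domains has to be shipped.
    """
    domains = []
    for i in range(count):
        material = f"{seed}|{day.isoformat()}|{i}".encode("utf-8")
        digest = hashlib.sha256(material).digest()  # 32 bytes
        # Map the first `length` digest bytes onto the letters a-z.
        label = "".join(chr(ord("a") + b % 26) for b in digest[:length])
        domains.append(label + tld)
    return domains


def char_entropy(label):
    """Shannon entropy (bits per character) of the characters in `label`."""
    counts = Counter(label)
    total = len(label)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def vowel_ratio(label):
    """Fraction of characters in `label` that are vowels."""
    return sum(ch in "aeiou" for ch in label) / len(label)


def text_features(domain):
    """Illustrative (assumed) text features for the leftmost label of a domain."""
    label = domain.split(".")[0]
    return {
        "length": len(label),
        "entropy": char_entropy(label),
        "vowel_ratio": vowel_ratio(label),
    }


if __name__ == "__main__":
    for d in toy_dga("demo-seed", date(2019, 9, 1)):
        print(d, text_features(d))
    print("google.com", text_features("google.com"))
```

Algorithmically generated labels of this kind tend to have higher character entropy and a less word-like vowel pattern than human-chosen names. Hand-built features like these are only a rough stand-in for the model-derived features the abstract describes.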
Authors
Liu Haojie (刘浩杰)
Huangfu Daoyi (皇甫道一)
Li Yan (李岩)
Wang Tao (王涛)
Suning Technology Group, Nanjing, Jiangsu 210000, China
Source
Cyberspace Security (《网络空间安全》)
2019, No. 9, pp. 26-32 (7 pages)
Keywords
botnet
malicious domain name
domain generation algorithm
ensemble learning