Abstract
[Objective] This paper reviews research progress on language models for data-driven, proactive threat hunting, and offers a technical outlook for the detection and tracing of advanced threats. [Methods] Drawing on recent academic and industrial advances in cybersecurity, it surveys related work at several levels: constructing evaluation metrics for threat hunting; fusing multi-modal, multi-dimensional, and multi-source data; mitigating and analyzing dependency explosion; and modeling multi-modal threat hunting languages. [Results] Guided by the key requirements of threat hunting and related technology trends, the paper summarizes the current state and research trends of data-driven threat hunting and threat hunting language models along dimensions such as supported data types, pattern types, modeling methods, and real-time capability. [Conclusions] For detection and forensics scenarios involving highly adversarial threats such as APTs, it is necessary, on the one hand, to build a fusion data infrastructure for multi-source heterogeneous data and to solve the dependency explosion problem. On the other hand, standardized and flexible language models still need to be explored to support the unified analysis of multi-modal, multi-source, and multi-dimensional data.
Authors
ZHANG Runzi (张润滋); KANG Bin (康彬)
(NSFOCUS Information Technology Co., Ltd., Beijing 100089, China; Unit 96941 of PLA, Beijing 100085, China)
Source
Frontiers of Data & Computing (《数据与计算发展前沿》)
CSCD
2022, Issue 5, pp. 98-107 (10 pages)
Funding
China Postdoctoral Science Foundation (Grant No. 2020M670181).
Keywords
threat hunting
security operations
advanced persistent threats