Abstract
[Research purpose] The massive terrorist-activity data in the Global Terrorism Database (GTD) contain features that influence whether a terrorist attack achieves its objective. Building an early-warning model for terrorist attacks with machine learning methods can provide decision support for counter-terrorism early warning. [Research method] Predicting the risk that an attack achieves its objective reveals which features matter most for early warning. The 135-dimensional GTD features undergo feature filtering, normalization, one-hot encoding, chi-square testing, and PCA dimensionality reduction. Four machine learning algorithms, including LightGBM, are then tested and evaluated. Based on the LightGBM feature importances, controlled-variable experiments are repeated to identify the key features and the sudden-increase-point feature. [Research results] Across the evaluation metrics, LightGBM outperforms the other machine learning algorithms. With the sample classes balanced 1:1, its averages over 910 experiments are an accuracy of 0.7986, a recall of 0.7852, an F1 score of 0.7832, and a running time of 3.57 s. LightGBM can effectively improve classification performance on GTD data, and the sudden-increase-point feature "attacktype" together with the top fourteen ranked features should be treated as key features to support early-warning decisions.
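The 1:1 class balancing and the reported metrics (accuracy, recall, F1) can be illustrated with a minimal standard-library sketch. It is not the authors' code: the function names `undersample_1_to_1` and `binary_metrics`, the use of random undersampling to reach the 1:1 ratio, and the toy labels (1 taken to mean the attack achieved its objective) are all assumptions made for illustration.

```python
import random
from collections import Counter


def undersample_1_to_1(rows, labels, seed=0):
    """Randomly drop majority-class rows until both classes have equal size.

    Assumption: random undersampling is used here only as one plausible way
    to obtain a 1:1 class ratio; the abstract does not name the method.
    """
    rng = random.Random(seed)
    by_class = {0: [], 1: []}
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    n = min(len(by_class[0]), len(by_class[1]))
    balanced = []
    for label, members in by_class.items():
        for row in rng.sample(members, n):
            balanced.append((row, label))
    rng.shuffle(balanced)
    return [r for r, _ in balanced], [y for _, y in balanced]


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy data (hypothetical): 7 positive and 3 negative samples.
rows = [[i] for i in range(10)]
labels = [1] * 7 + [0] * 3
bal_rows, bal_labels = undersample_1_to_1(rows, labels)
class_counts = Counter(bal_labels)  # 3 samples per class after balancing

# Hypothetical predictions on six balanced samples.
metrics = binary_metrics([1, 1, 1, 0, 0, 0], [1, 1, 0, 0, 0, 1])
```

In this toy case the predictions give two true positives, two true negatives, one false positive, and one false negative, so accuracy, precision, recall, and F1 all equal 2/3. Values averaged over many repeated runs, as in the abstract, need not satisfy the single-run F1 formula exactly.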
Authors
Chen Chen (陈晨)
Li Yongnan (李勇男)
Wang Mingjian (王铭戬)
Affiliations: School of National Security, People's Public Security University of China, Beijing 100038; Investigation College, People's Public Security University of China, Beijing 100038
Source
Journal of Intelligence (《情报杂志》)
CSSCI
Peking University Core Journals (北大核心)
2022, Issue 6, pp. 21-28, 98 (9 pages in total)
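Funding
Beijing Social Science Fund project "Research on Counter-Terrorism Risk Feature Identification and Emergency Prevention Mechanisms in Beijing in the Context of Big Data" (No. 21GLC064)
People's Public Security University of China Fundamental Research Funds key project "Research on Counter-Terrorism Intelligence Association Analysis Methods Based on Infrequent Pattern Mining" (No. 2021JKF211)
2021 special project for basic theoretical research in public security disciplines "Basic Theoretical Research on Counter-Terrorism Risk Early Warning in the Context of Big Data" (No. 2021XKZX07)
Student research project of the Engineering Research Center of the Ministry of Education for Public Safety Risk Prevention and Control, People's Public Security University of China, "Research on Model Construction and Data Mining Based on the Global Terrorism Database" (No. GCZX202101-1).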