Abstract
Open source intelligence is an important part of the all-source intelligence system. With the development of mobile communication technology and the Internet, the large volume of open source data has become an increasingly important source of intelligence. To address the large volume, diverse data types, and strict timeliness requirements of open source intelligence data, this paper proposes an open source intelligence processing framework based on the Hadoop big data component ecosystem. The framework is divided into four layers: infrastructure, data collection, data processing, and open source intelligence application. Based on this framework, the existing open source intelligence feature extraction algorithm based on TF-IDF weights is improved, realizing automatic feature extraction from massive open source intelligence data. Data experiments verify the effectiveness of the algorithm: compared with the existing open source intelligence feature extraction algorithm, the improved algorithm is markedly more efficient when processing large-scale open source intelligence data.
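For readers unfamiliar with the TF-IDF weights mentioned in the abstract, the following is a minimal single-machine sketch of the textbook weighting, with term frequency multiplied by the logarithm of inverse document frequency. It is not the paper's improved algorithm, and it does not use Hadoop; the function names `tf_idf` and `top_features` and the toy documents are assumptions made purely for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Textbook weighting (an assumption, not the paper's variant):
    weight = (count / document length) * ln(N / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter()
    for doc in docs:
        df.update(set(doc))

    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def top_features(doc_weights, k=3):
    """Pick the k highest-weighted terms; ties are broken alphabetically."""
    ranked = sorted(doc_weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


# Toy, made-up documents (already tokenized) used only to exercise the code.
docs = [
    ["fleet", "exercise", "port", "fleet"],
    ["port", "weather", "report"],
    ["fleet", "budget", "report"],
]
weights = tf_idf(docs)
features = [top_features(w, k=2) for w in weights]
```

In a Hadoop setting, the per-document counting and the document-frequency aggregation would typically be split into map and reduce steps, but how the paper does this is not stated in the abstract.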
Author
张猛
ZHANG Meng (Equipment Finance Division of Naval Equipment Integrated Planning Bureau, Beijing 100089)
Source
《舰船电子工程》
2020, No. 9, pp. 36-40 (5 pages in total)
Ship Electronic Engineering