Abstract
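Accurately extracting web-browsing evidence from a computer can help trace leaked information. Doing so requires first classifying the web-browsing text data, then mining frequent itemsets from the classified data, and finally assembling and extracting the browsing text trace records that satisfy the threshold requirements. Traditional methods complete the extraction by computing the distance between two points and directly comparing sums of squares. They cannot classify web-browsing text data or mine its frequent itemsets, and therefore cannot obtain the thresholds needed to extract the evidence. This paper proposes a data-mining-based method for accurately extracting web-browsing evidence from a computer. First, the naive Bayes algorithm is used to predict and classify the web-browsing text data: for each training sample, the probability of belonging to each category is computed, and after the predictions are compared the sample is assigned to the category with the higher probability, which completes the classification of the text data. Then, a frequent sequential pattern mining algorithm analyzes the time-series structure of the classified text data. Applying confidence and support threshold rules, it finds the frequent sequences in the text time series, which form the browsing text trace records that satisfy the thresholds and are then extracted. Simulation results show that combining the data mining algorithm with computer trace extraction technology effectively improves the efficiency of extracting web-browsing evidence.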
In this paper, we propose a data-mining-based method for accurately extracting network browsing evidence from a computer. First, the naive Bayes algorithm is used to predict and classify the network browsing text data: the probability of each training sample belonging to each category is calculated, and after the prediction results are compared the sample is assigned to the category with the higher probability, which achieves classification of the text data. Then, a frequent sequence pattern mining algorithm is used to analyze the time series structure of the classified text data, and frequent sequences are found in the time series using threshold rules on confidence and support. These sequences form the network browsing text trace records that satisfy the threshold requirements and are extracted. Simulation results show that integrating the text data mining algorithm with computer trace extraction technology improves the efficiency of extracting network browsing evidence.
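The two steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the sketch assumes a multinomial naive Bayes classifier with Laplace smoothing over tokenized browsing text, and it simplifies frequent sequence mining to ordered pairs (a visited before b in the same session) filtered by support and confidence thresholds. All function names, the toy training data, and the threshold values are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(samples):
    """Count classes and per-class tokens from (tokens, label) pairs."""
    class_counts = Counter(label for _, label in samples)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in samples:
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def classify(model, tokens):
    """Return the label with the highest log-posterior (Laplace smoothing)."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        score = math.log(n / total)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for t in tokens:
            score += math.log((word_counts[label][t] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def frequent_pairs(sessions, min_support, min_confidence):
    """Find ordered pairs (a, b), a before b in a session, that pass both thresholds.

    support    = sessions containing a ... b / all sessions
    confidence = sessions containing a ... b / sessions containing a
    """
    n = len(sessions)
    item_sessions = Counter()
    pair_sessions = Counter()
    for session in sessions:
        item_sessions.update(set(session))
        pairs = set()
        for i, a in enumerate(session):
            for b in session[i + 1:]:
                pairs.add((a, b))
        pair_sessions.update(pairs)  # each pair counted once per session
    result = {}
    for (a, b), count in pair_sessions.items():
        support = count / n
        confidence = count / item_sessions[a]
        if support >= min_support and confidence >= min_confidence:
            result[(a, b)] = (support, confidence)
    return result


# Hypothetical toy data: tokenized page text with category labels.
training = [
    (["login", "mail", "inbox"], "mail"),
    (["mail", "attachment", "send"], "mail"),
    (["search", "news", "article"], "news"),
    (["news", "headline", "article"], "news"),
]
model = train_naive_bayes(training)
predicted = classify(model, ["mail", "send"])

# Hypothetical browsing sessions, each an ordered list of page categories.
sessions = [
    ["mail", "upload", "news"],
    ["mail", "upload"],
    ["news", "mail", "upload"],
    ["news", "search"],
]
patterns = frequent_pairs(sessions, min_support=0.5, min_confidence=0.8)
```

In this toy run, only patterns that clear both thresholds are kept as candidate trace records. A full sequential pattern miner would extend the same support counting to sequences longer than two items.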
Source
Computer Simulation (《计算机仿真》)
PKU Core Journal (北大核心)
2017, Issue 7, pp. 240-243 (4 pages)
Keywords
Network browsing evidence
Data mining algorithm
Trace extraction