Content Analysis Technique and Application on Digital Forensics (基于电子取证数据的内容分析技术和应用)
Abstract: In digital forensics data, chat content accounts for the largest share of the data volume, and analyzing that content is both the priority and the main difficulty. Templates, semantic analysis, and an HMM-Viterbi model are used to extract important information from the content. Case-related keywords are mined by computing text feature values and by using deep learning to compute semantic distance. The TextRank algorithm is then used to extract content keywords and to generate automatic summaries. Together, these techniques let investigators quickly grasp the main points and key information in large volumes of chat content, improving working efficiency.
Authors: ZENG Chao (曾超), LIU Xiao-yu (刘晓宇), LIN Yi-bin (林艺滨), WEN Ruo-hui (温若辉) (Xiamen Meiya Pico Information Co., Ltd., Xiamen 361008, China; Cyber Security Department, Beijing 100006, China)
Source: Computer Science (《计算机科学》), CSCD, Peking University Core Journal, 2016, Issue B12, pp. 228-230 (3 pages)
Keywords: Digital forensics, Semantic analysis, HMM-Viterbi, TextRank, Word cloud
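The abstract names TextRank as the method for extracting content keywords but does not show how it is applied. The following is a minimal sketch of the general TextRank idea only, not the authors' implementation: words that co-occur within a small window are linked in an undirected graph, and PageRank-style updates score each word. The function name `textrank_keywords`, its parameters, and the sample tokens are all hypothetical, and the input is assumed to be already segmented into words (Chinese chat text would first need a word segmenter).

```python
from collections import defaultdict


def textrank_keywords(tokens, window=3, damping=0.85, iterations=50, top_k=5):
    """Rank tokens by PageRank-style scores on a co-occurrence graph.

    Hypothetical helper for illustration; assumes `tokens` is a list of
    already-segmented, already-filtered words.
    """
    # Link two distinct tokens if they appear within `window` positions.
    neighbors = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + window]:
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)

    # Start every node at 1.0 and apply the damped update repeatedly.
    scores = {word: 1.0 for word in neighbors}
    for _ in range(iterations):
        updated = {}
        for word in neighbors:
            # Each neighbor shares its score equally among its own links.
            rank = sum(scores[n] / len(neighbors[n]) for n in neighbors[word])
            updated[word] = (1 - damping) + damping * rank
        scores = updated

    # Highest score first; ties are broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:top_k]]


if __name__ == "__main__":
    sample = ["transfer", "money", "account", "money", "bank",
              "money", "card", "money", "cash"]
    print(textrank_keywords(sample, window=2, top_k=3))
```

In this toy input, "money" is adjacent to every other word, so it receives the highest score. A real pipeline would typically also filter stop words and restrict candidates by part of speech before building the graph.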