Chinese Text Deception Detection Based on Ensemble Learning (基于集成学习的中文文本欺骗检测研究)

Cited by: 7
Abstract: Deception detection is an important research topic in information security. Existing research shows that one third of interpersonal communication involves potential deception, and large volumes of deceptive messages circulate across all kinds of communication media. When deception threatens people's lives, the survival of enterprises, or the stability of a country, failing to detect it can cause incalculable losses. In massive Web data, non-deceptive texts far outnumber deceptive ones, so existing methods detect deception neither accurately nor efficiently, and an automated method that helps flag potentially deceptive messages is needed. This paper proposes an ensemble-learning deception detection method for such imbalanced data. First, an improved bisecting k-means partitioning method decomposes the training sample set. Next, an independent classifier is trained on each pair of positive and negative sample subsets, and each classifier computes a class output value for the test sample. Finally, a min-max modular approach that incorporates the classification accuracy of the individual classifiers integrates the individual decisions. Experimental results verify the effectiveness of the method.
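The combination step described in the abstract can be illustrated with a minimal sketch of the standard min-max modular rule: one classifier per (positive subset, negative subset) pair, a MIN over the negative subsets, then a MAX over the positive subsets. Everything concrete here is an assumption made for illustration, not the paper's implementation: the subsets are given by hand rather than produced by the improved bisecting k-means, a nearest-centroid scorer stands in for the SVM base classifier, the accuracy weighting mentioned in the abstract is not reproduced, and all function names and the toy data are hypothetical.

```python
def centroid(points):
    """Mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))

def euclid(a, b):
    """Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def train_pair(pos_subset, neg_subset):
    """Toy base classifier for one (positive, negative) subset pair.

    Assumption: a nearest-centroid scorer replaces the SVM.
    The returned score is > 0 when x is closer to the positive centroid.
    """
    cp, cn = centroid(pos_subset), centroid(neg_subset)
    return lambda x: euclid(x, cn) - euclid(x, cp)

def train_m3(pos_subsets, neg_subsets):
    """Train one classifier per pair; row i holds positive subset i."""
    return [[train_pair(p, n) for n in neg_subsets] for p in pos_subsets]

def m3_score(models, x):
    """MIN over negative subsets within each row, then MAX over rows."""
    return max(min(m(x) for m in row) for row in models)

def predict(models, x):
    """Return 1 (deceptive) if the combined score is positive, else 0."""
    return 1 if m3_score(models, x) > 0 else 0

# Hypothetical imbalanced toy data: one small deceptive (positive) subset,
# and a larger non-deceptive (negative) class split into two subsets.
pos_subsets = [[(0.0, 0.0), (1.0, 0.0)]]
neg_subsets = [
    [(5.0, 5.0), (6.0, 5.0)],
    [(-5.0, 5.0), (-6.0, 5.0)],
]
models = train_m3(pos_subsets, neg_subsets)

print(predict(models, (0.5, 0.2)))   # near the positive subset
print(predict(models, (5.0, 4.0)))   # near the first negative subset
print(predict(models, (-5.0, 4.0)))  # near the second negative subset
```

The MIN step means a sample is accepted as positive only if every classifier in a row agrees, which is what lets the large negative class be split into small, balanced subproblems without any single subset dominating the decision.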
Source: Journal of Computer Research and Development (《计算机研究与发展》), indexed in EI, CSCD, and the Peking University Core Journals list; 2015, Issue 5, pp. 1005-1013 (9 pages)
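Funding: National Natural Science Foundation of China (61005053, 61100138, 61373082, 61322211); National High Technology Research and Development Program of China (863 Program) (2015AA015407); Program for New Century Excellent Talents (20121401110013); Shanxi Province Research Project for Returned Overseas Scholars (2013-022); Shanxi Province Higher Education Science and Technology Innovation Project (2015104); Open Project of the Information Security Evaluation Center, Civil Aviation University of China (CAAC-ISECCA-201402)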
Keywords: deception; deception detection; ensemble learning; cutting samples; min-max modular support vector machine (M3-SVM)