Abstract
Deception detection is an important research topic in information security. Existing studies show that one third of interpersonal communication involves potential deception, and large numbers of deceptive messages circulate through all kinds of communication media. When deception threatens people's lives, the survival of enterprises, or the stability of a country, overlooking it can cause incalculable losses. In massive Web data, deceptive texts are usually far fewer than non-deceptive texts, so existing methods detect deceptive messages neither accurately nor efficiently, and an automated method that can efficiently flag possibly deceptive messages is urgently needed. To address massive, class-imbalanced Web information, this paper proposes a deception detection method based on ensemble learning. First, an improved bisecting k-means partitioning method decomposes the training sample set. Next, an independent classifier is trained on each pair of positive and negative sample subsets. Each classifier then computes a class output value for the test sample. Finally, a min-max modular approach that incorporates the classification accuracy of the individual classifiers integrates the individual decisions. Experimental results verify the effectiveness of the method.
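The min-max combination step described in the abstract can be illustrated with a minimal sketch. It assumes the positive (deceptive) and negative (non-deceptive) training sets have already been split into subsets, and it uses a nearest-centroid scorer as a stand-in for the SVM base classifier. The helper names (`train_module`, `train_m3`, `m3_score`, `m3_predict`) and the toy data are hypothetical, and the accuracy-based weighting mentioned in the abstract is not reproduced, since the abstract does not specify it.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]
Scorer = Callable[[Vector], float]


def centroid(points: Sequence[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of points."""
    n, dim = len(points), len(points[0])
    return [sum(p[d] for p in points) / n for d in range(dim)]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_module(pos: Sequence[Vector], neg: Sequence[Vector]) -> Scorer:
    """Stand-in base classifier (nearest centroid), NOT the paper's SVM.

    The returned score is positive when x is closer to the positive centroid.
    """
    c_pos, c_neg = centroid(pos), centroid(neg)

    def score(x: Vector) -> float:
        return sq_dist(x, c_neg) - sq_dist(x, c_pos)

    return score


def train_m3(pos_subsets, neg_subsets) -> List[List[Scorer]]:
    """Train one module per (positive subset, negative subset) pair."""
    return [[train_module(p, n) for n in neg_subsets] for p in pos_subsets]


def m3_score(modules: List[List[Scorer]], x: Vector) -> float:
    """Min-max rule: MIN over modules sharing a positive subset, then MAX."""
    return max(min(m(x) for m in row) for row in modules)


def m3_predict(modules: List[List[Scorer]], x: Vector) -> int:
    """Return +1 (deceptive) or -1 (non-deceptive)."""
    return 1 if m3_score(modules, x) > 0 else -1


# Hypothetical toy data: two positive clusters and two negative clusters.
pos_subsets = [[(0.0, 0.0), (0.0, 1.0)], [(10.0, 10.0), (10.0, 11.0)]]
neg_subsets = [[(5.0, 5.0), (5.0, 6.0)], [(5.0, 0.0), (6.0, 0.0)]]
modules = train_m3(pos_subsets, neg_subsets)

for x in [(0.0, 0.5), (5.0, 5.5), (10.0, 10.5)]:
    print(x, m3_predict(modules, x))
```

The MIN step requires a sample to be separated from every negative subset before a given positive subset claims it, and the MAX step accepts the sample if any positive subset does so. This is why the decomposition lets each module train on a small, roughly balanced pair even when the full data set is heavily imbalanced.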
Source
《计算机研究与发展》
EI
CSCD
Peking University Core Journals
2015, Issue 5, pp. 1005-1013 (9 pages)
Journal of Computer Research and Development
Funding
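National Natural Science Foundation of China (61005053, 61100138, 61373082, 61322211)
National High Technology Research and Development Program of China (863 Program) (2015AA015407)
Program for New Century Excellent Talents in University (20121401110013)
Shanxi Province Research Project for Returned Overseas Scholars (2013-022)
Shanxi Province Higher Education Science and Technology Innovation Project (2015104)
Open Project Foundation of the Information Security Evaluation Center, Civil Aviation University of China (CAAC-ISECCA-201402)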
Keywords
deception
deception detection
ensemble learning
sample partitioning
min-max modular support vector machine (M3-SVM)