Abstract
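Optimized recognition of false enterprise information helps prevent the business losses and large volumes of non-performing assets that false information can cause. Recognition should take the probability that false information occurs as the index, and use the information security threshold and the similarity threshold of the false-information template vector set as recognition indicators. Traditional methods instead rely on expert experience: they judge whether a representative item is false and then assign that verdict to every item in its cluster. They cannot accurately compute the probability that false information occurs, nor obtain the information security threshold or the false-information template vector set, so their recognition error is large. This paper proposes an optimized method for recognizing false enterprise information based on improved K-means. The method uses nonlinear feature extraction to search the information kernel space for projection directions that can discriminate between information features, and obtains different feature samples. It introduces a kernel function to extract the nonlinear mutual-information features of enterprise information, which remedies the inability of current recognition methods to extract feature samples of false information. It then builds a linear coefficient matrix of the information features, classifies the feature sample space with K-means, and computes Euclidean distances to obtain the between-class distance matrix of sub-cluster centers and the neighboring cluster region of each cluster center. Taking the probability that false information occurs as the index, and using the information security threshold and the similarity threshold of the false-information template vector set as recognition indicators, it completes the recognition of false enterprise information. Simulation results show that the proposed method achieves high recognition accuracy and provides an effective basis for recognizing false enterprise information.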
This paper proposes an optimized method for recognizing false enterprise information based on improved K-means. Using nonlinear feature extraction, the method searches the information kernel space for projection directions that can discriminate between information features and obtains different feature samples. It then introduces a kernel function to extract nonlinear mutual-information features, which addresses the inability of current methods to extract feature samples of false information. Next, it builds a linear coefficient matrix of the information features, classifies the feature sample space with K-means, and computes Euclidean distances. From these it derives the between-class distance matrix of sub-cluster centers and the neighboring cluster region of each cluster center, taking the probability that false information occurs as the index. Finally, the information security threshold and the similarity threshold of the false-information template vector set are used as recognition indicators to complete the identification. Simulation results show that the method achieves high recognition accuracy and provides an effective basis for recognizing false enterprise information.
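The clustering and thresholding steps named in the abstract can be illustrated with a minimal sketch. It is not the paper's improved K-means: it uses plain Lloyd's K-means with Euclidean distance, computes the matrix of distances between cluster centers, and flags a feature vector as false when its cosine similarity to any template vector reaches a threshold. The function names, the use of cosine similarity, and the caller-supplied initial centers are all assumptions made for illustration. The kernel feature extraction and the information security threshold are omitted because the abstract does not define them.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, init_centers, iters=20):
    """Plain Lloyd's K-means (assumption: not the paper's improved variant).

    Initial centers are supplied by the caller so the result is deterministic.
    """
    centers = [list(c) for c in init_centers]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(len(centers)), key=lambda k: euclidean(p, centers[k]))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels

def center_distance_matrix(centers):
    """Pairwise Euclidean distances between cluster centers."""
    return [[euclidean(a, b) for b in centers] for a in centers]

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def is_false(vec, templates, sim_threshold):
    """Hypothetical decision rule: flag vec when it is similar enough
    to at least one false-information template vector."""
    return max(cosine(vec, t) for t in templates) >= sim_threshold

if __name__ == "__main__":
    pts = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
    centers, labels = kmeans(pts, [[0, 0], [10, 10]])
    print(labels)
    print(center_distance_matrix(centers))
    print(is_false([0.9, 0.1], [[1.0, 0.0]], 0.95))
```

Supplying the initial centers keeps the sketch reproducible. A real implementation would need an initialization strategy and the kernel-space features described in the abstract.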
Source
《计算机仿真》
PKU Core Journal (北大核心)
2017, Issue 5, pp. 313-316 (4 pages)
Computer Simulation
Keywords
Enterprise
False information
Recognition
Kernel function