Abstract
Data drawn from multiple sources often conflict during integration, and eliminating these conflicts supports information sharing and improves user satisfaction. Existing conflict elimination methods suffer from low precision and recall. To address this, a conflict elimination method for the integration of multi-source unbalanced noisy data, based on K-radius subgraphs, is proposed. First, network interception is used for data collection: the router and a traffic identification workstation are placed on the same physical network segment, so that every multi-source data packet passing through the router also passes through the workstation. The K-radius subgraph of each collected data item is then used to describe its context. By comparing K-radius subgraphs, the similarity of the data is combined with the similarity of the data context to judge whether the data conflict. If there is no conflict, the data are saved according to certain rules; if there is a conflict, the conflicting data are eliminated. Experimental results show that the proposed method achieves high precision and recall and performs well overall, and the authors suggest it can support further development in this field.
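The conflict judgment described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's similarity measures, thresholds, or decision rule, so everything below is an assumption made for illustration: the graph is a plain adjacency dictionary, the K-radius subgraph is taken to be the set of nodes within k hops of a data node, both similarities are Jaccard similarities, and two records are flagged as conflicting when their contexts are similar but their values are not. The names k_radius_subgraph, jaccard, and is_conflict are hypothetical.

```python
from collections import deque


def k_radius_subgraph(graph, center, k):
    """Return the set of nodes within k hops of `center` (breadth-first search).

    `graph` is an adjacency dict: node -> iterable of neighbour nodes.
    """
    depth = {center: 0}
    queue = deque([center])
    while queue:
        node = queue.popleft()
        if depth[node] == k:
            continue  # do not expand beyond radius k
        for neighbour in graph.get(node, ()):
            if neighbour not in depth:
                depth[neighbour] = depth[node] + 1
                queue.append(neighbour)
    return set(depth)


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def is_conflict(graph_a, node_a, value_a, graph_b, node_b, value_b,
                k=1, context_threshold=0.5, data_threshold=0.5):
    """Assumed rule (not taken from the paper): flag a conflict when the
    two records share a similar context but carry dissimilar values.

    `value_a` and `value_b` are sets of attribute tokens.
    """
    # Context = K-radius neighbourhood, excluding the record node itself.
    context_a = k_radius_subgraph(graph_a, node_a, k) - {node_a}
    context_b = k_radius_subgraph(graph_b, node_b, k) - {node_b}
    context_similarity = jaccard(context_a, context_b)
    data_similarity = jaccard(value_a, value_b)
    return (context_similarity >= context_threshold
            and data_similarity < data_threshold)
```

For example, two records from different sources that are both linked to the same neighbouring nodes but carry different attribute values would be flagged, while records with unrelated neighbourhoods would not. The thresholds of 0.5 are arbitrary placeholders.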
Authors
XIANG Jun (向军)
ZHENG Qian (郑钤)
College of Information Engineering, Hubei University for Nationalities, Enshi, Hubei 445000, China
Source
Computer Simulation (《计算机仿真》)
PKU Core Journal (北大核心)
2019, Issue 10, pp. 443-447 (5 pages)
Keywords
Multi-source (多源)
Noise data (噪声数据)
Integration (集成)
Conflict elimination (冲突消除)