Abstract
In data governance at government big data centers, different government scenarios follow different management strategies, so business processes often record the same attribute of an entity under different data coding structures. This is a common difficulty in multi-source government data fusion. Unlike conventional methods, this paper introduces contingency correlation analysis (列联相关分析), a statistical method, to fuse heterogeneous legal-person registration attributes across business scenarios and to establish an accurate mapping between them. The practice applies statistical methods to multi-source heterogeneous government data fusion; it solves the practical problem quickly and at low cost, and offers a useful reference for other data fusion problems.
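The idea of using a contingency table to align two coding schemes can be illustrated with a minimal sketch. This is not the paper's implementation: the sample codes, the function names `cramers_v` and `code_mapping`, and the choice of Cramér's V as the association measure are all assumptions made for illustration. The sketch assumes each legal-person entity appears in both sources, so every record yields a pair (code in source A, code in source B).

```python
import math
from collections import Counter


def cramers_v(pairs):
    """Cramér's V for a list of (code_a, code_b) pairs (0 = no association, 1 = perfect)."""
    table = Counter(pairs)            # contingency table: co-occurrence counts
    n = sum(table.values())
    row, col = Counter(), Counter()   # marginal totals per code
    for (a, b), count in table.items():
        row[a] += count
        col[b] += count
    chi2 = 0.0
    for a in row:
        for b in col:
            expected = row[a] * col[b] / n
            observed = table.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row), len(col)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0


def code_mapping(pairs):
    """Map each source-A code to the source-B code it co-occurs with most often."""
    table = Counter(pairs)
    best = {}
    for (a, b), count in table.items():
        if a not in best or count > best[a][1]:
            best[a] = (b, count)
    return {a: b for a, (b, _) in best.items()}


# Hypothetical sample: the same entities coded under two registration schemes.
sample_pairs = [
    ("1100", "A"), ("1100", "A"), ("1100", "A"),
    ("1200", "B"), ("1200", "B"), ("1200", "A"),  # one inconsistent record
    ("2100", "C"), ("2100", "C"),
]

if __name__ == "__main__":
    print("association strength:", round(cramers_v(sample_pairs), 3))
    print("proposed mapping:", code_mapping(sample_pairs))
```

A high association value suggests the two attributes encode the same underlying information, in which case the dominant cell in each row of the table is a reasonable candidate mapping; low-count or ambiguous rows would still need manual review.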
Source
《数据挖掘》
2022, No. 1, pp. 80-89 (10 pages)
Hans Journal of Data Mining