Abstract
Objective: To match field survey data with records in the communicable disease network direct reporting system during a national survey of the quality of notifiable communicable disease reporting by medical institutions, using probabilistic record linkage to match information from the two sources.
Methods: A modified Fellegi-Sunter probabilistic record linkage method was used. Coefficients were assigned to the matching fields and a similarity score was calculated for each pair of records; a pair whose score exceeded a given threshold (cut-off value) was treated as a match. The automatic matching results were checked manually, and the manual check served as the gold standard for evaluating them.
Results: A total of 2,153 original records obtained in the field survey were matched against 97,271 communicable disease report cards in the network direct reporting system using stratified, multi-dimensional probabilistic matching. With a total score of 25 as the threshold, the automatic matching results were compared with manual judgment. The sensitivity of automatic matching was 98.96% (95% CI: 98.39%-99.36%), the specificity was 94.92% (95% CI: 91.29%-97.35%), the total concordance rate was 98.51% (95% CI: 97.91%-98.98%), the Kappa value was 0.9250 and the area under the ROC curve was 0.9979.
Conclusion: The stratified, multi-dimensional probabilistic matching method successfully matched the original field survey data with the network reporting system data. The matching results were highly consistent with the actual situation, the method markedly improved work efficiency, and it provides a simple analytical tool for similar work in the future.
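The scoring and evaluation steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the field names, the per-field weights and the sample records are all hypothetical, since the abstract does not list the actual matching coefficients. Only the cut-off of 25 comes from the text, and treating a score equal to the cut-off as a match is an assumption. Stratification (comparing records only within a stratum) is left out for brevity.

```python
# Hypothetical agreement weights per field; the paper's actual coefficients
# are not given in the abstract.
FIELD_WEIGHTS = {"name": 10, "sex": 3, "birth_date": 8, "disease": 7, "onset_date": 5}
CUTOFF = 25  # total-score threshold reported in the abstract


def similarity_score(survey_rec, card, weights=FIELD_WEIGHTS):
    """Add up the weights of the fields on which both records agree exactly."""
    score = 0
    for field, weight in weights.items():
        a, b = survey_rec.get(field), card.get(field)
        if a is not None and a == b:
            score += weight
    return score


def best_match(survey_rec, cards, cutoff=CUTOFF):
    """Return (card, score) for the top-scoring card, or None below the cut-off.

    Assumption: a score equal to the cut-off counts as a match.
    """
    if not cards:
        return None
    top = max(cards, key=lambda c: similarity_score(survey_rec, c))
    score = similarity_score(survey_rec, top)
    return (top, score) if score >= cutoff else None


def evaluate(tp, fp, fn, tn):
    """Sensitivity, specificity, agreement and Cohen's kappa from a 2x2 table.

    tp/fp/fn/tn count automatic decisions against the manual gold standard.
    """
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "agreement": observed,
        "kappa": (observed - expected) / (1 - expected),
    }


# Made-up sample records, for illustration only.
survey = {"name": "Zhang San", "sex": "M", "birth_date": "1980-01-02",
          "disease": "hepatitis B", "onset_date": "2013-05-01"}
cards = [
    {"name": "Zhang San", "sex": "M", "birth_date": "1980-01-02",
     "disease": "hepatitis B", "onset_date": "2013-05-03"},
    {"name": "Li Si", "sex": "M", "birth_date": "1975-07-09",
     "disease": "hepatitis B", "onset_date": "2013-05-01"},
]

print(best_match(survey, cards))
print(evaluate(tp=90, fp=5, fn=10, tn=95))  # toy counts, not the study's data
```

Exact-equality comparison is the simplest possible agreement rule. A fuller Fellegi-Sunter setup would typically derive the weights from estimated agreement probabilities among true matches and non-matches, and might allow partial agreement on fields such as names or dates.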
Source
Disease Surveillance (《疾病监测》)
CAS
2015, Issue 9, pp. 792-795 (4 pages)
Funding
National Major Special Project for the Prevention and Control of Infectious Diseases (No. 2013ZX10004218-06-006)
Keywords
Probabilistic record linkage method
Communicable disease reporting information
Accuracy