Funding: K.A. and J.L. were supported by a grant from the Benioff Center for Microbiome Medicine. This research used resources of the Oak Ridge Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC05-00OR22725. This manuscript has been coauthored by UT-Battelle, LLC under contract no. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a nonexclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan, last accessed September 16, 2020). Work at Oak Ridge and Lawrence Berkeley National Laboratories was supported by the DOE Office of Science through the National Virtual Biotechnology Laboratory, a consortium of DOE national laboratories focused on response to COVID-19, with funding provided by the Coronavirus CARES Act, and was facilitated by previous breakthroughs obtained through the Laboratory Directed Research and Development Programs of the Lawrence Berkeley and Oak Ridge National Laboratories. M.P.J. was supported by a grant from the Laboratory Directed Research and Development (LDRD) Program of Lawrence Berkeley National Laboratory under U.S. Department of Energy Contract No. DE-AC02-05CH11231. Oak Ridge National Laboratory would also like to acknowledge funding from the U.S. National Science Foundation (EF-2133763).
Abstract: Objectives: We aim to estimate geographic variability in total numbers of infections and infection fatality ratios (IFR; the number of deaths caused by an infection per 1,000 infected people) when the availability and quality of data on disease burden are limited during an epidemic. Methods: We develop a noncentral hypergeometric framework that accounts for differential probabilities of positive tests and reflects the fact that symptomatic people are more likely to seek testing. We demonstrate the robustness, accuracy, and precision of this framework, and apply it to the United States (U.S.) COVID-19 pandemic to estimate county-level SARS-CoV-2 IFRs. Results: The estimators for the numbers of infections and IFRs showed high accuracy and precision; for instance, when applied to simulated validation data sets, across counties, Pearson correlation coefficients between estimator means and true values were 0.996 and 0.928, respectively, and they showed strong robustness to model misspecification. Applying the county-level estimators to the real, unsimulated COVID-19 data spanning April 1, 2020 to September 30, 2020 from across the U.S., we found that IFRs varied from 0 to 44.69, with a standard deviation of 3.55 and a median of 2.14. Conclusions: The proposed estimation framework can be used to identify geographic variation in IFRs across settings.
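
To illustrate why a noncentral hypergeometric model suits testing data in which infected people are more likely to be tested, the following is a minimal sketch, not the authors' estimator. It assumes Fisher's variant of the distribution (the abstract does not say which variant is used), and all function names, the toy population sizes, and the odds value are hypothetical. It also shows the IFR definition given in the abstract, deaths per 1,000 infected people.

```python
from math import comb


def fisher_nchg_pmf(x, n_infected, n_uninfected, n_tested, odds):
    """P(x positives among n_tested) under Fisher's noncentral hypergeometric.

    `odds` is the assumed odds ratio of an infected person being tested
    relative to an uninfected person (odds = 1 gives the ordinary
    hypergeometric distribution).
    """
    lo = max(0, n_tested - n_uninfected)
    hi = min(n_tested, n_infected)
    if not lo <= x <= hi:
        return 0.0

    def weight(k):
        # Unnormalised probability of k positives among the tested.
        return comb(n_infected, k) * comb(n_uninfected, n_tested - k) * odds ** k

    total = sum(weight(k) for k in range(lo, hi + 1))
    return weight(x) / total


def expected_positives(n_infected, n_uninfected, n_tested, odds):
    """Mean number of positive tests under the same model."""
    lo = max(0, n_tested - n_uninfected)
    hi = min(n_tested, n_infected)
    return sum(
        k * fisher_nchg_pmf(k, n_infected, n_uninfected, n_tested, odds)
        for k in range(lo, hi + 1)
    )


def ifr_per_1000(deaths, infections):
    """Infection fatality ratio: deaths per 1,000 infected people."""
    return 1000.0 * deaths / infections


# Toy county (hypothetical numbers): 200 people, 20 infected, 30 tests.
unbiased = expected_positives(20, 180, 30, odds=1.0)
biased = expected_positives(20, 180, 30, odds=5.0)
print(f"expected positives, equal testing odds:    {unbiased:.2f}")
print(f"expected positives, infected 5x the odds:  {biased:.2f}")
```

When the odds exceed 1, the expected number of positives is higher than random sampling would give, so scaling raw test positivity up to the whole population would overstate infections and therefore understate the IFR. Accounting for that differential is the motivation the abstract gives for the framework.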