261428 Chronic disease prevalence estimates from electronic medical records: Applying capture-recapture to clinical data

Tuesday, October 30, 2012

Carol Conell, PhD, Sociology; MA Mathematics, Division of Research, Kaiser Permanente Northern California, Oakland, CA
Background: Prevalence estimates increasingly use electronic medical records (EMR), but the dominant, registry-based approach underestimates the relative prevalence of conditions that are hard to diagnose or prone to treatment delay, e.g., substance use disorders (UD) and asthma. Capture-recapture (C-R) can improve estimates by using the relationship between multiple indicators of disease status to estimate missed as well as identified cases. However, most public health C-R applications have employed sporadic indicators, making it difficult to evaluate the reliability of the estimates.

Methods: A systematic approach to applying C-R to time series data derived from the EMR is used to estimate UD prevalence among 31,861 members of a membership-based health plan who responded to a healthcare survey. Plausible assumptions about continuity of care and the relationship of diagnosis to treatment are tested using 2-period time series for diagnosis and treatment. The resulting parsimonious 4-indicator model is used to estimate prevalence, and bootstrap confidence intervals are calculated. Survey data and an independent screening study are used to evaluate the C-R estimates.

Results: C-R estimates an ascertainment-corrected prevalence of 5.8 (5.7-9.6) times the number registered medically, which agrees with the intensive screening estimate. Estimated prevalence per 100 is 89 for subjects reporting UD on the survey and 7-8 for others.

Conclusions: Repeated measures readily available with EMR yield stable and parsimonious C-R based prevalence estimates of UD. The ascertainment-corrected estimates can improve planning by clarifying the true relative prevalence of more and less readily ascertained diseases.
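
As a rough illustration of the capture-recapture idea, the sketch below applies a two-source estimator (Chapman's correction of the Lincoln-Petersen formula) to diagnosis and treatment indicators and attaches a percentile bootstrap interval. It is a deliberate simplification and not the 4-indicator model described in the abstract: it assumes the two indicators are independent, which is the kind of assumption the study tests. The function names and the counts in the usage note are hypothetical.

```python
import random


def chapman_estimate(n_dx_only, n_tx_only, n_both):
    """Two-source capture-recapture estimate of total cases (Chapman's formula).

    Assumes the diagnosis and treatment indicators are independent.
    """
    n_dx = n_dx_only + n_both  # everyone flagged by a diagnosis
    n_tx = n_tx_only + n_both  # everyone flagged by treatment
    return (n_dx + 1) * (n_tx + 1) / (n_both + 1) - 1


def bootstrap_ci(n_dx_only, n_tx_only, n_both, reps=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval, resampling the observed cases."""
    rng = random.Random(seed)
    observed = ["dx"] * n_dx_only + ["tx"] * n_tx_only + ["both"] * n_both
    estimates = []
    for _ in range(reps):
        # Resample identified cases with replacement and re-estimate.
        sample = rng.choices(observed, k=len(observed))
        estimates.append(
            chapman_estimate(
                sample.count("dx"), sample.count("tx"), sample.count("both")
            )
        )
    estimates.sort()
    lower = estimates[int(reps * alpha / 2)]
    upper = estimates[int(reps * (1 - alpha / 2)) - 1]
    return lower, upper


def ascertainment_ratio(n_dx_only, n_tx_only, n_both):
    """Estimated total cases divided by the number actually identified."""
    identified = n_dx_only + n_tx_only + n_both
    return chapman_estimate(n_dx_only, n_tx_only, n_both) / identified
```

With made-up counts such as `chapman_estimate(60, 40, 20)`, the estimate exceeds the 120 identified cases because the small overlap between the two indicators implies that many cases were missed by both. The ratio returned by `ascertainment_ratio` is analogous in spirit to the "times the number registered medically" figure in the Results, though the abstract's figure comes from a richer model.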

Learning Areas:
Biostatistics, economics

Learning Objectives:
Identify limitations of registry-based estimates of chronic disease prevalence. Discuss how capture-recapture can be used to estimate chronic disease prevalence from clinical data. Assess the suitability of capture-recapture approaches for specific diseases. Design appropriate capture-recapture models for chronic diseases.

Keywords: Data/Surveillance, Epidemiology

Presenting author's disclosure statement:

Qualified on the content I am responsible for because: I have been the principal of multiple federally funded grants focused on developing methods for using routinely collected organizational and clinical data to estimate the prevalence of alcohol- and drug-related problems. My scientific interests include developing systematic methods for estimating chronic disease prevalence using healthcare data, clarifying the non-statistical sources of error in disease prevalence, and developing a theory of health care data as evidence comparable to the theory of survey data.
Any relevant financial relationships? No

I agree to comply with the American Public Health Association Conflict of Interest and Commercial Support Guidelines, and to disclose to the participants any off-label or experimental uses of a commercial product or service discussed in my presentation.

Back to: 4178.0: Statistical Poster Session