Session
Machine Learning and HIV Treatment Cascade: Findings from the Big Data Analytics HIV/AIDS Health Utilization Project in South Carolina.
APHA's 2020 VIRTUAL Annual Meeting and Expo (Oct. 24 - 28)
Abstract
Machine learning approaches to understanding predictors of comorbidity among people living with HIV in electronic health record data
Methods: The study population comprised people living with HIV (PLWH) who were diagnosed between January 2005 and December 2016 and living in South Carolina (SC); records were extracted through the electronic reporting system of the SC Department of Health and Environmental Control. Comorbidity severity was measured by the dichotomized age-adjusted Charlson Comorbidity Index (CCI) score (high: CCI >4; low: CCI ≤4), calculated from the weighted sum of the presence of 19 health conditions. Nineteen risk predictors were used to predict comorbidity severity with least absolute shrinkage and selection operator (LASSO) regression and classification and regression tree (CART) analysis; the data were split into two sets, with 80% for training and 20% for validation.
Results: Among 5989 patients, the median CCI score was 4 (range: 2-22). Both models demonstrated good predictive accuracy, with an AUC of 0.76 (95% CI: 0.75-0.77) for LASSO and 0.72 (95% CI: 0.69-0.75) for CART. Top predictors in both models included older age, a higher percentage of retention in care and viral suppression, MSM or IDU, and a longer duration of days with low CD4.
Discussion: Machine learning methods could identify the most important predictors of comorbidity among PLWH with high accuracy. The results may enhance the understanding of comorbidity and provide data-based evidence for future care management of PLWH.
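The outcome definition and the 80/20 split described in the methods can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: the function names, the random seed, the simple (unstratified) shuffle, and the toy scores are all assumptions, and the age-adjusted CCI score is taken as already computed.

```python
import random

def dichotomize_cci(score, cutoff=4):
    """Return 1 for high comorbidity (score > cutoff), else 0 (score <= cutoff)."""
    return 1 if score > cutoff else 0

def train_validation_split(records, train_fraction=0.8, seed=0):
    """Shuffle a copy of the records and split it into training and validation lists.

    Assumption: a plain random shuffle; the abstract does not say whether
    the split was stratified.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_train = int(round(train_fraction * len(shuffled)))
    return shuffled[:n_train], shuffled[n_train:]

# Hypothetical toy scores, not study data.
toy_scores = [2, 3, 4, 5, 7, 4, 9, 2, 6, 3]
toy_labels = [dichotomize_cci(s) for s in toy_scores]
train, valid = train_validation_split(list(zip(toy_scores, toy_labels)))
```

A score of exactly 4 falls in the low group, matching the "CCI ≤4" definition.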
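The abstract does not state how the AUC confidence intervals were obtained. As a hedged sketch, one common approach is a rank-based AUC with a percentile bootstrap interval; the function names, the number of resamples, and the toy data below are assumptions for illustration only.

```python
import random

def auc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method, not the authors')."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Hypothetical toy data, not study data.
toy_labels = [0, 0, 1, 0, 1, 1, 0, 1]
toy_scores = [0.1, 0.4, 0.35, 0.2, 0.8, 0.7, 0.5, 0.9]
point_estimate = auc(toy_labels, toy_scores)  # 0.875
```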
Chronic disease management and prevention; Epidemiology; Public health or related public policy; Social and behavioral sciences
Abstract
Contextual factors with county-level retention in care status among people living with HIV in South Carolina from 2005 to 2016
Biostatistics, economics; Chronic disease management and prevention; Planning of health education strategies, interventions, and programs; Social and behavioral sciences
Abstract
Using big data and machine learning to predict missed opportunities for HIV diagnosis in South Carolina
Early HIV diagnosis is key to ending the HIV epidemic. Big data and machine learning can help develop better predictive tools for targeted HIV testing. We developed and validated a prediction model to identify predictors of missed opportunities for HIV testing.
Methods:
The SC enhanced HIV/AIDS Reporting System and records from a statewide all-payer health care (HC) database were linked. The analysis included individuals diagnosed with HIV in SC from 01/2013 to 12/2016 and all of their HC visits from 2005 to HIV diagnosis. Late testers (LT) were defined as those with an initial CD4 count <200 cells/mm³. For LT, all HC visits within eight years before HIV diagnosis were included as missed opportunities. For non-LT, visits occurring within three years before diagnosis were included. We applied least absolute shrinkage and selection operator (LASSO) regression and classification and regression tree (CART) analysis to identify independent predictors of missed opportunities for HIV diagnosis.
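The late-tester definition and the two look-back windows can be sketched as below. This is an illustration under stated assumptions: the function names are hypothetical, the window is treated as inclusive of its start date and exclusive of the diagnosis date, and the dates are made up rather than drawn from the study data.

```python
from datetime import date

# Look-back window in years: 8 for late testers, 3 for everyone else.
LOOKBACK_YEARS = {True: 8, False: 3}

def is_late_tester(initial_cd4):
    """Late tester: initial CD4 count below 200 cells/mm3."""
    return initial_cd4 < 200

def years_before(d, years):
    """Return the date `years` years before d (Feb 29 falls back to Feb 28)."""
    try:
        return d.replace(year=d.year - years)
    except ValueError:
        return d.replace(year=d.year - years, day=28)

def missed_opportunity_visits(visit_dates, diagnosis_date, initial_cd4):
    """Keep visits that fall inside the look-back window before diagnosis.

    Assumption: window start is inclusive and the diagnosis date is excluded.
    """
    window = LOOKBACK_YEARS[is_late_tester(initial_cd4)]
    start = years_before(diagnosis_date, window)
    return [v for v in visit_dates if start <= v < diagnosis_date]

# Hypothetical visit history, not study data.
visits = [date(2008, 5, 1), date(2012, 3, 10), date(2014, 6, 1), date(2015, 1, 1)]
diagnosis = date(2015, 1, 1)
late = missed_opportunity_visits(visits, diagnosis, initial_cd4=150)
not_late = missed_opportunity_visits(visits, diagnosis, initial_cd4=450)
```

With this toy history, the late tester keeps three visits and the non-late tester keeps two, because the 2008 visit falls outside the three-year window.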
Results:
A total of 2693 individuals newly diagnosed with HIV were identified, of whom 743 (27.6%) were LT. Overall, 1987 (73.4%) had at least one HC visit prior to their HIV diagnosis; the mean number of visits was 6.2. Predictors in both models were age, gender, race/ethnicity, transmission mode, LT, rural/urban residence, year of diagnosis, and sexually transmitted infection. In both the CART and LASSO procedures, the most important variables were race, LT, and gender; the AUC was 0.58 (95% CI: 0.53-0.63) for CART and 0.69 (95% CI: 0.67-0.71) for LASSO.
Conclusion:
Prediction models using machine learning techniques can identify predictors of “missed opportunities” for HIV diagnosis. These techniques will allow more precise targeting of HIV testing efforts.
Epidemiology; Planning of health education strategies, interventions, and programs; Public health or related research
Abstract
Machine learning modeling framework to predict treatment linkage to care in patients newly diagnosed with HIV in Mecklenburg County, NC
Objectives: We aimed to use novel machine learning (ML) models to identify and quantify important risk factors associated with delayed linkage to care (LtC) among patients newly diagnosed with HIV.
Methods: Deidentified 2013-2017 Mecklenburg County surveillance data (eHARS) were requested. Univariate analyses were used to quantify associations between delayed LtC (i.e., LtC >30 days after diagnosis) and the demographic, epidemiological, geographic, and clinical factors of people with HIV. ML models, including a random forest model, were then developed and validated in R 3.5.0 on the same data to predict the risk of delayed LtC for individuals.
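The outcome used here, delayed LtC defined as linkage more than 30 days after diagnosis, can be expressed as a small labeling function. This is a sketch only: the function name is hypothetical, and treating a patient with no recorded linkage date as delayed is an assumption that the abstract does not confirm.

```python
from datetime import date

def delayed_ltc(diagnosis_date, linkage_date, threshold_days=30):
    """Return True when linkage to care happened more than threshold_days after diagnosis.

    Assumption: a missing linkage date (None) is counted as delayed.
    """
    if linkage_date is None:
        return True
    return (linkage_date - diagnosis_date).days > threshold_days

# Hypothetical dates, not surveillance data.
on_time = delayed_ltc(date(2015, 3, 1), date(2015, 3, 31))  # exactly 30 days
late = delayed_ltc(date(2015, 3, 1), date(2015, 4, 1))      # 31 days
```

Linkage at exactly 30 days is not counted as delayed, which follows the ">30 days" wording.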
Results: The type of HIV-diagnosing facility significantly influenced time to LtC; a first diagnosis in a hospital was associated with the shortest time to LtC. Patients with lower CD4 counts (<200 cells/mm³) were twice as likely to be linked to care within 30 days as those with higher CD4 counts. The random forest model achieved high accuracy (>80% without CD4 data and >95% with CD4 data) in predicting individual risk of delayed LtC.
Conclusions: This study combines the advantages of interpretable hypothesis-driven methods and state-of-the-art ML methods to achieve a more comprehensive understanding of the challenges behind LtC delays. The findings can inform personalized recommendations that help individual patients better understand their own care continuum. They can also help public health teams identify high-risk communities across Mecklenburg County.
Assessment of individual and community needs for health education; Clinical medicine applied in public health; Epidemiology; Protection of the public in relation to communicable diseases including prevention or control; Public health or related research
Abstract
Application of machine learning techniques in classification of HIV medical care status for people living with HIV in South Carolina.
Methods: Linked data from the SC enhanced HIV/AIDS Reporting System and a statewide all-payer database, comprising 233482 observations, were split into three sets (training [40%], validation [30%], and test [30%]). We examined associations between the binary target “care status” (in care vs. not in care) and 43 inputs (explanatory variables). We compared multiple classification algorithms, including deep neural networks, automated neural networks, decision trees, and regression. We focused on three main goals: future case prediction, hidden input selection, and complexity optimization. We compared models on classification performance using standard machine learning measures and receiver operating characteristic (ROC) curves.
Results: Preliminary analyses identified the combination of inputs most predictive of not being in care (tobacco use, heterosexual, Black, and age). Conversely, inputs most predictive of being in care included age, prior-year care status, schizophrenia, CD4, and transmission risk. ROC performance was best for neural networks, ensembles, and gradient boosting. Each model required trade-offs; neural networks (sensitivity: 60.7%; misclassification: 33.7%) and ensembles (sensitivity: 60.6%; misclassification: 35.2%) were the best classifiers of care status in the validation set.
Conclusion: These algorithmic applications of neural networks and other machine learning techniques hold significant promise for predicting the future HIV care status of PLWH.
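The two reported measures, sensitivity and misclassification rate, can be computed from true and predicted labels as in the sketch below. Which class the authors treated as positive is not stated, so the `positive` argument, the function name, and the toy labels are assumptions for illustration.

```python
def classification_summary(y_true, y_pred, positive=1):
    """Return (sensitivity, misclassification rate) for binary labels.

    Sensitivity = true positives / all actual positives.
    Misclassification rate = wrong predictions / all predictions.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and the same length")
    pairs = list(zip(y_true, y_pred))
    true_pos = sum(1 for t, p in pairs if t == positive and p == positive)
    false_neg = sum(1 for t, p in pairs if t == positive and p != positive)
    if true_pos + false_neg == 0:
        raise ValueError("no actual positives, sensitivity is undefined")
    errors = sum(1 for t, p in pairs if t != p)
    return true_pos / (true_pos + false_neg), errors / len(pairs)

# Hypothetical labels, not study data.
toy_true = [1, 1, 1, 0, 0, 0, 1, 0]
toy_pred = [1, 0, 1, 0, 1, 0, 1, 0]
sensitivity, misclassification = classification_summary(toy_true, toy_pred)
```

The two measures can move independently, which is one way to read the trade-offs mentioned above: a model can miss many positives while still having a moderate overall error rate.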
Epidemiology; Public health or related research