Abstract

Data management in substance use disorder treatment research: Implications from data harmonization of NIDA-funded randomized controlled trials

Ryoko Susukida, PhD1, Masoumeh Aminesmaeili, MD, MPH1, Evan Mayo-Wilson, MPA, DPhil2 and Ramin Mojtabi, MD, PhD1
(1)Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, (2)Indiana University School of Public Health-Bloomington, Bloomington, IN

APHA's 2020 VIRTUAL Annual Meeting and Expo (Oct. 24 - 28)

Background: Secondary analysis of data from completed randomized controlled trials (RCTs) is a critical and efficient way to maximize the potential benefit from past research. De-identified primary data from completed RCTs have become increasingly available in recent years; however, the lack of standardized data products is a major barrier to further use of these valuable data. Harmonizing data structures, variables, and codebooks across RCTs would facilitate secondary data analysis, including meta-analysis and comparative effectiveness studies. We describe an initiative to harmonize de-identified primary data from substance use treatment RCTs funded by the National Institute on Drug Abuse (NIDA) and available on the NIDA Data Share website.

Methods: Harmonized datasets with standardized data structures, variable names, labels, and definitions, together with harmonized codebooks, were developed for 35 completed RCTs. Common data domains were identified so that each data file could be grouped by its subject area. Variables within an instrument were harmonized when two or more RCTs used that instrument. The structures of the harmonized data were determined based on feedback from clinical trialists and substance use treatment research experts.
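The harmonization rule described in the Methods can be illustrated with a minimal sketch. It is not the project's actual code: the trial names, instrument names, variable names, and the concordance table below are all hypothetical placeholders, and the two helper functions are assumptions made only to show the idea of (a) selecting instruments used by two or more RCTs and (b) renaming trial-specific variables to a shared name.

```python
from collections import defaultdict

# Hypothetical inventory: which instruments each trial collected.
trial_instruments = {
    "trial_A": {"instrument_1", "instrument_2"},
    "trial_B": {"instrument_1"},
    "trial_C": {"instrument_2", "instrument_3"},
}

# Hypothetical concordance table:
# (trial, original variable name) -> harmonized variable name.
concordance = {
    ("trial_A", "inst1_q1"): "instrument_1_item01",
    ("trial_B", "I1Q1"): "instrument_1_item01",
}


def shared_instruments(inventory, min_trials=2):
    """Return instruments used by at least `min_trials` trials."""
    trials_by_instrument = defaultdict(set)
    for trial, instruments in inventory.items():
        for instrument in instruments:
            trials_by_instrument[instrument].add(trial)
    return {
        name: sorted(trials)
        for name, trials in trials_by_instrument.items()
        if len(trials) >= min_trials
    }


def harmonize_record(trial, record, table):
    """Rename variables found in the concordance table; keep the rest as-is."""
    return {table.get((trial, key), key): value for key, value in record.items()}


shared = shared_instruments(trial_instruments)
row = harmonize_record("trial_B", {"I1Q1": 3, "site": "01"}, concordance)
```

Here `shared` keeps only the instruments that appear in two or more of the hypothetical trials, and `row` shows a single record after its trial-specific variable name has been mapped to the shared one.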

Results: We have created a harmonized database of variables across 35 RCTs, with a built-in label and a brief definition for each variable. All data files for each RCT have been consistently categorized into eight data domains (adherence, adverse events, demographics, enrollment, physical health measures, mental-behavioral-cognitive health measures, self-reported substance use measures, and biologic substance use measures). Harmonized codebooks and instrument/variable concordance tables have also been developed to make it easier to identify instruments and variables of interest.

Conclusions: The harmonized data of RCTs of substance use treatments can potentially promote future secondary data analysis of completed RCTs and provide guidance for future RCTs in substance use treatment research.

Clinical medicine applied in public health; Public health or related research