APHA Scientific Session and Event Listing

Evaluating data error using data cleaning methodology in SAS: A public health informatics application of data cleaning

Fayomi Martin, MPH1, Ramon Ismaila, MPH1, and Akinfaderin Fadekemi, MPH2. (1) Research Sector, Statscorp Analysts, 1260 Lilac Arbor Road, Dacula, GA 30019, 770-906-9006, mfayomi@statscorp.com, (2) 1959 A2 Tunis Street, Wuse, Zone 6, Abuja, FCT, Nigeria

Purpose: To evaluate the level of data entry error introduced by a manual data entry process. We examined the questionnaire flow and the types of entry error associated with its skip instructions. Methods: Data collected during a questionnaire survey on HIV/AIDS and behavioral health in Nigeria were entered into an Excel spreadsheet with the guidance of a code book. The Excel data were exported to a SAS dataset and run against skip pattern and data range integrity checks to assess the level of data entry/coding error before data analysis could commence. SAS procedure code was used to check the responses coded for each variable in the dataset. Results: 98% of the variables had an invalid coded response due to human error. For every variable with an invalid response, an out-of-range response was identified. Discussion: Data cleaning, although not a new concept, is very important before proper data analysis commences. This is even more significant when manual data entry is the basis for inputting survey responses. By using procedure code in SAS software, we developed the best data cleaning methodology for rural areas where web-based data systems do not exist.
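
The two checks named in the Methods (a data range integrity check and a skip pattern check) can be illustrated with a minimal sketch. This is not the authors' SAS code: it is written in Python for illustration only, and the variable names, code book values, skip rule, and sample records below are all hypothetical assumptions, not taken from the survey.

```python
# Illustrative sketch only; not the authors' SAS procedures.
# Hypothetical code book: the set of valid codes for each variable.
CODEBOOK = {
    "sex": {1, 2},
    "ever_tested": {1, 2, 9},   # assumed: 1 = yes, 2 = no, 9 = don't know
    "test_result": {1, 2, 9},
}

# Hypothetical skip rule: (gate variable, value that opens the gate,
# dependent variable). If the gate is not open, the dependent
# question should have been skipped and left blank.
SKIP_RULES = [
    ("ever_tested", 1, "test_result"),
]


def range_errors(record):
    """Return variables whose non-missing value is not in the code book."""
    return [
        var
        for var, valid in CODEBOOK.items()
        if record.get(var) is not None and record[var] not in valid
    ]


def skip_errors(record):
    """Return dependent variables answered although they should be skipped."""
    errors = []
    for gate, gate_value, dependent in SKIP_RULES:
        answered = record.get(dependent) is not None
        if record.get(gate) != gate_value and answered:
            errors.append(dependent)
    return errors


# Made-up records standing in for rows exported from the spreadsheet.
records = [
    {"id": 1, "sex": 1, "ever_tested": 1, "test_result": 2},
    {"id": 2, "sex": 3, "ever_tested": 2, "test_result": None},
    {"id": 3, "sex": 2, "ever_tested": 2, "test_result": 1},
]

# One entry per record, listing the variables flagged by each check.
report = {
    r["id"]: {"range": range_errors(r), "skip": skip_errors(r)}
    for r in records
}
```

In this sketch, record 2 is flagged by the range check (a code for `sex` that is not in the assumed code book) and record 3 by the skip check (a `test_result` entered although the gate question was answered "no").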

Learning Objectives:

Keywords: Health Information

Presenting author's disclosure statement:

Not Answered

HIIT Poster Session

The 134th Annual Meeting & Exposition (November 4-8, 2006) of APHA