160666 Geomasking algorithms to protect confidentiality of sexually transmitted infections in spatial epidemiology

Monday, November 5, 2007

Molly Fitch, Department of Epidemiology, School of Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC
William Allshouse, Department of Environmental Sciences and Engineering, School of Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC
Marc Serre, Department of Environmental Sciences and Engineering, School of Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC
Kristen Hampton, School of Medicine-Division of Infectious Diseases, University of North Carolina at Chapel Hill, Chapel Hill, NC
Dionne Gesink Law, Department of Microbiology, Montana State University, Bozeman, MT
Peter A. Leone, MD, School of Medicine-Division of Infectious Diseases, University of North Carolina at Chapel Hill, Chapel Hill, NC
William C. Miller, MD, PhD, MPH, School of Medicine and Department of Epidemiology, School of Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC

Background: As GIS has become more influential in public health research, the demand for higher spatial resolution of sensitive health data has increased. However, higher resolution can encroach on patient confidentiality. Currently available methods trade off spatial resolution against confidentiality.

Objective/Purpose: To mask the address locations of individuals with sexually transmitted infections, we created a geomasking technique that we call "donut geomasking." In donut geomasking, each geocoded address is relocated in a random direction by at least a minimum distance but less than a maximum distance, while remaining within its original census block group. This method should be especially effective at protecting the locations of cases in rural areas.
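The relocation step described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. The function name `donut_geomask`, the optional `in_block_group` predicate (a stand-in for a real point-in-polygon test against the census block group boundary), and the retry limit are all hypothetical. The abstract does not say how the displacement distance is drawn, so the sketch assumes it is sampled uniformly over the area of the donut, with coordinates in a projected (planar) system.

```python
import math
import random


def donut_geomask(x, y, r_min, r_max, in_block_group=None, rng=None, max_tries=1000):
    """Relocate (x, y) by a random distance between r_min and r_max.

    Assumptions (not stated in the abstract): planar coordinates, a
    displacement drawn uniformly over the donut's area, and rejection
    sampling to keep the point inside its block group.
    """
    if not 0 <= r_min < r_max:
        raise ValueError("require 0 <= r_min < r_max")
    rng = rng or random.Random()
    for _ in range(max_tries):
        # Random direction.
        theta = rng.uniform(0.0, 2.0 * math.pi)
        # Sampling r^2 uniformly makes the point uniform over the donut's area.
        r = math.sqrt(rng.uniform(r_min ** 2, r_max ** 2))
        new_x = x + r * math.cos(theta)
        new_y = y + r * math.sin(theta)
        # Hypothetical predicate: True if the point lies in the original block group.
        if in_block_group is None or in_block_group(new_x, new_y):
            return new_x, new_y
    raise RuntimeError("no valid location found inside the block group")
```

Setting `r_min` to 0 reduces the sketch to simple random perturbation within a circle, which is why a nonzero inner radius is what distinguishes the donut mask.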

Methods: Households were simulated to create block groups with differing population densities. We compared the donut geomask with the established methods of aggregation and simple random perturbation to assess performance and protection of confidentiality. Three metrics were used to evaluate the methods: (1) the average displacement divided by the distance to neighbors, (2) the percentage of cases in which “minimum confidentiality” was violated, and (3) the change in spatial resolution. For the donut geomask, the inner radius was varied to contain 0-10 households and the outer radius to contain 50-100 households.
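Defining the radii by household counts means the physical distances depend on local density. The abstract does not say how counts were converted to distances. One simple possibility, assuming households are spread uniformly at a known density, is to choose the radius of the circle whose area is expected to contain k households. The helper name `radius_for_households` and the density values below are illustrative only.

```python
import math


def radius_for_households(k, households_per_km2):
    """Radius (km) of a circle expected to contain k households.

    Assumes a uniform household density, so that
    k = density * pi * r^2, i.e. r = sqrt(k / (pi * density)).
    """
    if k < 0 or households_per_km2 <= 0:
        raise ValueError("require k >= 0 and a positive density")
    return math.sqrt(k / (math.pi * households_per_km2))


# Illustrative densities only: the same household counts give much
# larger radii in a sparse (rural) area than in a dense (urban) one.
rural_inner = radius_for_households(10, households_per_km2=5)
urban_inner = radius_for_households(10, households_per_km2=500)
```

Under this assumption the radius scales with the inverse square root of density, so a 100-fold drop in density enlarges the donut's radii 10-fold.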

Results: Random displacement methods preserved the spatial relationships between points much better than aggregation did. The donut geomask violated confidentiality 1-6% of the time, compared with 10-50% for simple random perturbation, depending on the minimum and maximum protection levels chosen.

Discussion/Conclusions: Aggregating spatial epidemiological data should be avoided when possible, because aggregation destroys the ability to identify outbreaks at the microscale level. Simple random perturbation avoids this problem, but if the data are highly sensitive, the donut method should be considered optimal for geomasking.

Learning Objectives:
1. The participant should be able to understand why preserving confidentiality is important, especially when patient addresses are available for geocoding.
2. The participant should be able to describe why it is often undesirable to simply aggregate spatial data to a politically defined region.
3. The participant should be able to explain why different measures should be taken into account when geocoding in rural areas compared to urban areas.

Keywords: Geocoding, HIV/AIDS

Presenting author's disclosure statement:

Any relevant financial relationships? No

I agree to comply with the American Public Health Association Conflict of Interest and Commercial Support Guidelines, and to disclose to the participants any off-label or experimental uses of a commercial product or service discussed in my presentation.