223466 Ranking Important Risk Factors Which Affect United States Syphilis Rates Based on Support Vector Machine

Tuesday, November 9, 2010

Chau-Kuang Chen, EdD, Institutional Research, Meharry Medical College, Nashville, TN
Amirah Abdullah, BS, School of Graduate Studies and Research, Meharry Medical College, Nashville, TN
Syphilis is a sexually transmitted disease caused by bacteria, and it remains a public health concern in the United States. In 1991, the percentage of persons living with syphilis in the U.S. was at an all-time low of 19%, but that number had increased substantially to 40.8% by 2007 (CDC Wonder, 2008). Previous studies have linked syphilis to such variables as education, sexual activity, homosexuality, and single family homes. The purpose of this study is to rank the important risk factors that contribute to the syphilis incidence rate in the United States. Data were collected on the 50 U.S. states from 1984 to 2007 and analyzed using a machine learning algorithm, the Support Vector Machine (SVM). An SVM maps data from the input space into a higher-dimensional feature space and seeks an optimal hyperplane that separates the data into classes. Here it is used to rank, by level of importance, the risk factors that have an impact on syphilis. Four risk factors were entered into the model: state median income, state unemployment rate, state poverty rate, and state alcohol consumption. Our findings show that the risk factors for syphilis incidence in the U.S. rank, in order of importance, as state median income, state poverty rate, state alcohol consumption, and state unemployment rate. This finding is important because it highlights populations in which socioeconomic risk factors are present. Prevention efforts should be directed towards each state in the U.S. in order to reduce the rate of syphilis infections.
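
The ranking idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which kernel, software, or importance measure was used. The sketch assumes a linear SVM trained by Pegasos-style stochastic subgradient descent, a two-class outcome (for example, high versus low incidence), and a ranking by the absolute size of the learned weights on standardized inputs. The data are synthetic, and the feature names ("strong", "medium", "weak", "noise") and all function names are hypothetical.

```python
import random

# Hypothetical feature names and synthetic ground-truth weights (not study data).
FEATURES = ["strong", "medium", "weak", "noise"]
TRUE_WEIGHTS = [3.0, 1.5, 0.7, 0.0]


def make_synthetic_data(n=400, seed=0):
    """Draw Gaussian features; the class label is the sign of a noisy linear score."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in FEATURES]
        score = sum(w * v for w, v in zip(TRUE_WEIGHTS, x)) + rng.gauss(0.0, 0.5)
        X.append(x)
        y.append(1 if score > 0 else -1)
    return X, y


def standardize(rows):
    """Scale each column to zero mean and unit variance so weights are comparable."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [(sum((r[j] - means[j]) ** 2 for r in rows) / n) ** 0.5 or 1.0
            for j in range(d)]
    return [[(r[j] - means[j]) / stds[j] for j in range(d)] for r in rows]


def train_linear_svm(X, y, lam=0.01, epochs=50, seed=0):
    """Minimize the regularized hinge loss with Pegasos-style subgradient steps."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    order = list(range(len(X)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            w = [(1.0 - eta * lam) * wj for wj in w]  # shrink (regularization)
            if margin < 1.0:  # point violates the margin: move toward it
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


def rank_features(names, w):
    """Order features by absolute weight, largest first."""
    return sorted(zip(names, (abs(v) for v in w)), key=lambda p: p[1], reverse=True)


X_raw, y = make_synthetic_data()
weights = train_linear_svm(standardize(X_raw), y)
ranking = rank_features(FEATURES, weights)
for name, importance in ranking:
    print(f"{name}: {importance:.3f}")
```

Ranking by absolute weight is only meaningful for a linear decision function on standardized inputs. With a nonlinear kernel, which is what the mapping into a higher-dimensional feature space implies, a model-agnostic measure such as permutation importance would be a more natural choice.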

Learning Areas:
Biostatistics, economics
Public health or related education

Learning Objectives:
1) To familiarize the audience with the machine learning algorithm Support Vector Machine Classifier
2) To rank the important risk factors that affect syphilis rates in the United States
3) To demonstrate the practical use of the Support Vector Machine Classifier

Keywords: STD, Statistics

Presenting author's disclosure statement:

Qualified on the content I am responsible for because: I conducted the literature review and data analysis under the supervision of Dr. Chau-Kuang Chen.
Any relevant financial relationships? No

I agree to comply with the American Public Health Association Conflict of Interest and Commercial Support Guidelines, and to disclose to the participants any off-label or experimental uses of a commercial product or service discussed in my presentation.

Back to: 4154.0: Poster Session