216699 Identifying subsets of complex patients with cluster analysis

Monday, November 8, 2010

Sophia Newcomer, MPH, Institute for Health Research, Kaiser Permanente Colorado, Denver, CO
Elizabeth Bayliss, MD, MSPH, Institute for Health Research, Kaiser Permanente Colorado, Denver, CO
John Steiner, MD, MSPH, Institute for Health Research, Kaiser Permanente Colorado, Denver, CO
Background: Most investigations into causes of high utilization or hospitalization have been conducted using regression analyses to identify individual predictors of the outcome of interest. Subsequent efforts at care management often focus on these single factors in hopes of reducing adverse health outcomes. However, most patients are not characterized by single clinical characteristics, but rather by complex constellations of medical conditions and biopsychosocial features.

Objective: To demonstrate how data mining procedures, such as agglomerative hierarchical clustering, can be used to identify clinically relevant groups of high-cost patients with similar multimorbidities within a large health maintenance organization (HMO) setting.

Methods: We identified Kaiser Permanente Colorado HMO members in the top 20% of system-wide patient costs for two consecutive years (2006 and 2007). The study cohort was then limited to members with at least two common chronic conditions (n=13,312). A proximity matrix of Jaccard's coefficients was created to represent the similarity between each pair of cohort members. Ward's minimum variance method was used for the clustering analysis. The number of clinically relevant clusters was unknown prior to the analysis. Clinical reviews of the clustering solutions, along with evaluation of relevant statistics, were used to identify the most parsimonious clustering solution.

Results: Eleven clinically relevant clusters were identified. One common theme was mental health conditions co-occurring with other chronic conditions. Two clusters – one characterized by younger adults with obesity and mental health issues, and one characterized by patients with chronic pain and mental health issues – were identified as potential groups for enhanced care management.

Conclusions: Data mining procedures such as cluster analysis can be used to identify distinct groups of patients with similar comorbid conditions. Such methods can be leveraged for targeted care management interventions designed to improve patient outcomes and lower healthcare costs.
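
The clustering steps named in the Methods (a Jaccard proximity matrix on binary condition indicators, followed by Ward's minimum variance method) can be illustrated with a minimal sketch. The abstract does not state which software was used, so this is not the authors' implementation. The function names, the four condition columns, and the six toy patients are hypothetical. The sketch applies Ward's Lance-Williams update to squared Jaccard distances, and its naive merge loop is only practical for small inputs; a cohort of 13,312 members would call for an optimized statistical package.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Return 1 - |A and B| / |A or B| for two equal-length 0/1 vectors."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    # Two all-zero vectors are treated as identical (assumption; the cohort
    # in the abstract has at least two conditions per member).
    return 0.0 if either == 0 else 1.0 - both / either


def ward_clusters(rows, n_clusters):
    """Naive agglomerative clustering using Ward's Lance-Williams update."""
    # Each cluster id maps to the row indices it currently contains.
    members = {i: [i] for i in range(len(rows))}
    # Squared pairwise distances, keyed by the pair of cluster ids.
    dist = {
        frozenset((i, j)): jaccard_distance(rows[i], rows[j]) ** 2
        for i, j in combinations(members, 2)
    }
    next_id = len(rows)
    while len(members) > n_clusters:
        # Merge the pair of clusters with the smallest Ward distance.
        pair = min(dist, key=dist.get)
        i, j = sorted(pair)
        d_ij = dist.pop(pair)
        ni, nj = len(members[i]), len(members[j])
        merged = members.pop(i) + members.pop(j)
        # Update distances from every remaining cluster k to the new cluster.
        for k in members:
            nk = len(members[k])
            d_ik = dist.pop(frozenset((i, k)))
            d_jk = dist.pop(frozenset((j, k)))
            dist[frozenset((next_id, k))] = (
                (ni + nk) * d_ik + (nj + nk) * d_jk - nk * d_ij
            ) / (ni + nj + nk)
        members[next_id] = merged
        next_id += 1
    return sorted(sorted(m) for m in members.values())


# Hypothetical toy cohort. Columns (illustrative only):
# diabetes, hypertension, depression, chronic pain.
rows = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
]

clusters = ward_clusters(rows, 2)
print(clusters)  # [[0, 1, 2], [3, 4, 5]]
```

In this toy example the two-cluster solution separates patients whose profiles centre on diabetes and hypertension from those centred on depression and chronic pain. In the study itself the number of clusters was not fixed in advance; it was chosen through clinical review and evaluation of relevant statistics.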

Learning Areas:
Chronic disease management and prevention
Communication and informatics

Learning Objectives:
Explain how data mining procedures such as cluster analysis can be applied for identification of relatively homogeneous groups within a large dataset
Design a cluster analysis for a large cohort using binary variables, such as presence of a chronic condition (Yes/No)
Discuss how data mining applications can be a valuable tool within health informatics for designing targeted clinical interventions

Keywords: Health Information Systems, Chronic Diseases

Presenting author's disclosure statement:

Qualified on the content I am responsible for because: I am a biostatistician with Kaiser Permanente Colorado's Institute for Health Research.
Any relevant financial relationships? No

I agree to comply with the American Public Health Association Conflict of Interest and Commercial Support Guidelines, and to disclose to the participants any off-label or experimental uses of a commercial product or service discussed in my presentation.