176940 Sample size and power for continuous and categorical variables

Tuesday, October 28, 2008

T. Robert Harris, PhD, Biostatistics, UT School of Public Health, Dallas, TX
When relating one or more predictors to a dependent variable, data analysts sometimes categorize continuous variables. The effect of such categorization on power and sample size was examined by simulation in several situations: the case where the correct model is in fact linear dependence of outcomes on continuous predictors; the case where the relationship is correctly modeled by a categorical predictor; and situations that fall between these poles. Additionally, linear and logistic regressions using continuous and categorized versions of the same dependent variable are compared. The comparisons depend on effect size and, in multiple regression, on correlations among the independent variables.
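
A minimal simulation sketch of the first situation (a truly linear relationship with one continuous predictor) is given below. It is illustrative only and is not the author's simulation code: the function names, the median split, the effect size beta = 0.25, the sample size n = 100, and the use of a normal critical value in place of the exact t quantile are all assumptions made for this example.

```python
import math
import random
from statistics import NormalDist, median


def pearson_r(xs, ys):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def slope_is_significant(xs, ys, crit):
    """Two-sided test of zero slope in simple linear regression.

    Uses t = r * sqrt((n - 2) / (1 - r^2)) and compares |t| with a normal
    critical value (an assumed large-sample approximation to the t quantile).
    """
    r = pearson_r(xs, ys)
    if abs(r) >= 1.0:
        return True
    t = r * math.sqrt((len(xs) - 2) / (1.0 - r * r))
    return abs(t) > crit


def estimate_power(n=100, beta=0.25, reps=1000, alpha=0.05, seed=1):
    """Estimate power for the continuous and the median-split predictor.

    Data are generated as y = beta * x + e with x and e standard normal
    (hypothetical settings chosen only for illustration).
    Returns (power_continuous, power_dichotomized).
    """
    rng = random.Random(seed)
    crit = NormalDist().inv_cdf(1 - alpha / 2)
    hits_cont = hits_cat = 0
    for _ in range(reps):
        x = [rng.gauss(0, 1) for _ in range(n)]
        y = [beta * xi + rng.gauss(0, 1) for xi in x]
        cut = median(x)
        x_cat = [1.0 if xi > cut else 0.0 for xi in x]  # median split
        hits_cont += slope_is_significant(x, y, crit)
        hits_cat += slope_is_significant(x_cat, y, crit)
    return hits_cont / reps, hits_cat / reps


if __name__ == "__main__":
    p_cont, p_cat = estimate_power()
    print(f"continuous predictor:   estimated power = {p_cont:.3f}")
    print(f"dichotomized predictor: estimated power = {p_cat:.3f}")
```

Regressing the outcome on the 0/1 indicator is equivalent to a pooled-variance comparison of two group means, so the two estimates compare a linear-trend analysis with a high-versus-low group comparison on the same simulated data. Under these assumed settings the continuous analysis should show the higher estimated power; other effect sizes, cut points, or correlated predictors would need their own runs.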

Differences in power, and in the sample size needed to detect a given effect with given power, are substantial: threefold or more in some cases.

The existence of such differences is not unexpected, in view of the statistical theory of efficient estimators and most powerful tests. However, their magnitude may be surprising. It can therefore be important to have sound reasons for deciding whether and how to categorize.

Learning Objectives:
1. Recognize the effect of model specification on statistical power and required sample size.
2. Include alternative specifications as a consideration when planning studies.
3. Evaluate decisions about whether and how to categorize variables when analyzing data.

Keywords: Statistics, Data Collection

Presenting author's disclosure statement:

Qualified on the content I am responsible for because: I have a Ph.D. and 18 years of work experience in statistics, the subject of this abstract.
Any relevant financial relationships? No

I agree to comply with the American Public Health Association Conflict of Interest and Commercial Support Guidelines, and to disclose to the participants any off-label or experimental uses of a commercial product or service discussed in my presentation.