Abstract
Continuous covariates in logistic regression models are often divided into categories to avoid potential non-linearity, especially when the covariates do not follow a normal distribution. However, categorization may lead to a considerable loss of power, depending on the shape of the covariate distribution. We therefore investigate the impact of covariate distribution characteristics on power in logistic regression models when continuous covariates are converted to categorical covariates. We consider uniform, bell-shaped, right-skewed, and left-skewed distributions and assume that the relationship between the original continuous covariate and the logit of the outcome is linear. Continuous covariates are categorized into quantile groups (split at the median, quartiles, or quintiles). Statistical power and regression coefficients are estimated by simulation for both the continuous and the categorized covariates. When continuous covariates were converted to categorical covariates, power decreased for every covariate distribution shape. In particular, an increase in the number of categories led to a decrease in power. However, the ranking of power among the four distributions was not changed by categorization. Although power decreases because of categorization, the impact of covariate distribution characteristics on power in logistic regression models may not be changed by categorization.
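The simulation design summarized above can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: power is measured as the rejection rate of a likelihood-ratio test at the 5% level, the categorized covariate enters the model as unordered group indicators, the four shapes are represented by hypothetical beta-family stand-ins (`SHAPES`), and the values of `beta0`, `beta1`, `n`, and `reps` are arbitrary.

```python
import math
import random
from statistics import quantiles

# Upper 5% chi-square critical values, keyed by degrees of freedom.
CHI2_CRIT_05 = {1: 3.841, 3: 7.815, 4: 9.488}


def sigmoid(eta):
    # Clamp the linear predictor so exp() cannot overflow.
    return 1.0 / (1.0 + math.exp(-max(-30.0, min(30.0, eta))))


def binom_loglik(events, n, p):
    # Log-likelihood of `events` successes in n trials; 0 at a boundary MLE.
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return events * math.log(p) + (n - events) * math.log(1.0 - p)


def categorize(x, k):
    # Group index 0..k-1 from the k-1 sample quantile cut points.
    cuts = quantiles(x, n=k)
    return [sum(v > c for c in cuts) for v in x]


def lrt_categorical(groups, y, k):
    # With only group indicators, fitted probabilities are group proportions.
    ll_null = binom_loglik(sum(y), len(y), sum(y) / len(y))
    ll_alt = 0.0
    for g in range(k):
        yg = [yi for gi, yi in zip(groups, y) if gi == g]
        if yg:
            ll_alt += binom_loglik(sum(yg), len(yg), sum(yg) / len(yg))
    return 2.0 * (ll_alt - ll_null)


def lrt_continuous(x, y, iters=25):
    # Newton-Raphson fit of logit(p) = b0 + b1 * x, then LRT against b1 = 0.
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = sigmoid(b0 + b1 * xi)
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det <= 1e-12:
            break
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    ll_alt = 0.0
    for xi, yi in zip(x, y):
        p = sigmoid(b0 + b1 * xi)
        ll_alt += math.log(p) if yi else math.log(1.0 - p)
    ll_null = binom_loglik(sum(y), len(y), sum(y) / len(y))
    return 2.0 * (ll_alt - ll_null)


def simulate_power(draw_x, beta0, beta1, n, reps, seed=0):
    # Rejection rate at the 5% level for the continuous covariate and for
    # its median (k=2), quartile (k=4), and quintile (k=5) categorizations.
    rng = random.Random(seed)
    hits = {"continuous": 0, 2: 0, 4: 0, 5: 0}
    for _ in range(reps):
        x = [draw_x(rng) for _ in range(n)]
        y = [1 if rng.random() < sigmoid(beta0 + beta1 * xi) else 0 for xi in x]
        hits["continuous"] += lrt_continuous(x, y) > CHI2_CRIT_05[1]
        for k in (2, 4, 5):
            stat = lrt_categorical(categorize(x, k), y, k)
            hits[k] += stat > CHI2_CRIT_05[k - 1]
    return {key: count / reps for key, count in hits.items()}


# Hypothetical stand-ins for the four distribution shapes (assumption).
SHAPES = {
    "uniform": lambda rng: rng.uniform(0.0, 1.0),
    "bell-shaped": lambda rng: rng.betavariate(5.0, 5.0),
    "right-skewed": lambda rng: rng.betavariate(2.0, 5.0),
    "left-skewed": lambda rng: rng.betavariate(5.0, 2.0),
}

if __name__ == "__main__":
    # Illustrative parameter values only; not taken from the study.
    for name, draw in SHAPES.items():
        print(name, simulate_power(draw, -1.0, 2.0, n=200, reps=100))
```

Each call returns the estimated power for the continuous covariate and for the two-, four-, and five-group categorizations, so the results can be compared within a shape and ranked across shapes. A different test (for example, a Wald test) or an ordinal coding of the groups would be equally reasonable choices and could give different numbers.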