Is using a questionnaire score (EuroQol's EQ-5D) with a bimodal distribution as the outcome in linear regression a problem?

multinomial-distribution, normal distribution, normality-assumption, rating, regression

There is currently a debate about whether the EQ-5D score, which has a ceiling problem and a bimodal distribution, can be used as the outcome in a linear regression model.

Background

The score is very simple and frequently used to assess patients' health-related quality of life. It consists of five questions, each with 3 possible answers (there is a newer version with 5 answers, but it is less used).

The score is commonly used in national registries as a patient reported outcome measure (PROM) and is very convenient because the questions are easy to answer and the completeness is therefore good.

Continuous score

The score is created using a "tariff" in which each unique combination of the 5 answers translates into a continuous-like value, subject to the limitations mentioned above. I'm not sure how the tariff values are decided, but each combination of answers maps to one value. For instance, if you have answered best health on all five questions you get a code of 11111, which gives the maximum of 1.000. If you've answered best on the first 2 questions and worst on the last 3, you have a code of 11333 and get a score of -0.066. The score is country-adjusted and ranges from -0.594 to 1.000 in my Swedish tariff.
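The lookup described above can be sketched as a table from a five-digit state code to a tariff value. This is only an illustration: the table holds just the two values quoted in the question (a full three-level tariff would need one entry for each of the 3^5 = 243 possible states), and the names `TARIFF`, `eq5d_state` and `eq5d_index` are hypothetical.

```python
# Hypothetical, partial tariff: only the two values quoted in the question.
# A full three-level tariff would contain 3**5 = 243 entries.
TARIFF = {
    "11111": 1.000,   # best answer on all five questions
    "11333": -0.066,  # best on the first two, worst on the last three
}


def eq5d_state(answers):
    """Build the five-digit state code from five answers, each 1, 2 or 3."""
    if len(answers) != 5 or any(a not in (1, 2, 3) for a in answers):
        raise ValueError("expected five answers, each 1, 2 or 3")
    return "".join(str(a) for a in answers)


def eq5d_index(answers, tariff=TARIFF):
    """Look up the index value; raises KeyError for states not in the table."""
    return tariff[eq5d_state(answers)]


print(eq5d_index([1, 1, 1, 1, 1]))
print(eq5d_index([1, 1, 3, 3, 3]))
```

Because the index can only take the values present in the table, the resulting variable is discrete with a fixed maximum, which is where the ceiling comes from.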

The Paretian calculation

In most orthopaedic studies we have a preoperative score and a postoperative score. By comparing the two sets of answers, as the Paretian Classification of Health Change suggests, we get four possible outcomes: no change, worse, improved, or mixed change. Mixed means that one category improved while another one deteriorated. As I understand it, the Paretian outcome is best analyzed using a multinomial logistic model.
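The four-way classification described above can be sketched as a comparison of the answers question by question. This is a minimal sketch, assuming answers are coded so that a lower level is better (1 = best, 3 = worst, as in the 11111 and 11333 examples); the function name `paretian_change` is hypothetical.

```python
def paretian_change(pre, post):
    """Classify the change between two five-answer profiles.

    Assumes a lower level means better health (1 = best).
    """
    if len(pre) != 5 or len(post) != 5:
        raise ValueError("expected five answers before and after")
    improved = any(after < before for before, after in zip(pre, post))
    worsened = any(after > before for before, after in zip(pre, post))
    if improved and worsened:
        return "mixed"
    if improved:
        return "improved"
    if worsened:
        return "worse"
    return "no change"


print(paretian_change([2, 2, 2, 2, 2], [1, 2, 2, 2, 2]))
print(paretian_change([1, 2, 2, 2, 2], [2, 1, 2, 2, 2]))
```

Note that the classification ignores how large each change is and which question changed, so it discards information that the tariff score keeps.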

My questions

  • With large datasets of > 10 000 patients, does it matter that the score is not normally distributed, and is the Paretian way of analyzing the score better?
  • Scores like this are very frequently used today – what are the limitations?

Update

After considering all these wise arguments and discussing them more closely with our statistician, I got some interesting input:

  1. In large samples the central limit theorem will kick in as long as the sample isn't heavily skewed
  2. If the score itself has a flaw (as the EQ-5D score does), it might not be right to expect a normal distribution, because the bimodality is not due to a subgroup but due to a feature of the score (I think this is a different way of putting what @whuber wrote: "… The residuals will closely reflect that error distribution")
  3. Normality of the sample mainly helps in calculating the p-value/confidence interval, and this could be circumvented by using bootstrapping
  4. Using ordinal regression and leaving out the mixed group, we can validate the results from the linear regression, i.e. show that the predictors behave similarly in a "non-parametric" analysis
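The bootstrapping idea in point 3 can be sketched with a percentile bootstrap for a regression slope. Everything below is an assumption for illustration: the data are simulated (two clusters plus a ceiling at 1.0, loosely mimicking the shape of the score, not real EQ-5D data), the predictor is a made-up age variable, and `ols_slope` and `bootstrap_slope_ci` are hypothetical helper names.

```python
import random


def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def bootstrap_slope_ci(x, y, n_boot=1000, alpha=0.05, seed=0):
    """Approximate percentile bootstrap interval, resampling (x, y) pairs."""
    rng = random.Random(seed)
    n = len(x)
    slopes = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        slopes.append(ols_slope([x[i] for i in idx], [y[i] for i in idx]))
    slopes.sort()
    lower = slopes[int((alpha / 2) * n_boot)]
    upper = slopes[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Simulated data (an assumption, not real EQ-5D scores).
rng = random.Random(1)
age = [rng.uniform(40, 90) for _ in range(500)]
score = []
for a in age:
    centre = 0.8 if rng.random() < 0.6 else 0.2  # two clusters
    score.append(min(1.0, centre - 0.004 * (a - 65) + rng.gauss(0, 0.1)))

estimate = ols_slope(age, score)
lower, upper = bootstrap_slope_ci(age, score)
print(estimate, lower, upper)
```

Resampling whole (x, y) pairs avoids relying on a normal error distribution for the interval, although it does not address whether a straight-line mean model suits a bounded score.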

Best Answer

First, categorizing continuous variables is generally a bad idea; Royston, Altman and Sauerbrei wrote a good article on why dichotomizing is bad, and the same arguments apply to more categories. Altman wrote an article on categorizing variables, but only the abstract is freely available, and I have not read the whole article.

Second, linear regression does not assume that the dependent variable is normally distributed, only that the errors are, and these are checked through the residuals from the model. So, before you can see if your model violates the assumptions, you need to run it and look at the results.
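The distinction between the outcome and the residuals can be illustrated with a small simulation. This is a sketch on made-up data, assuming a single binary predictor that happens to account for the two modes; the variable names are hypothetical and nothing here is EQ-5D data.

```python
import random
import statistics

rng = random.Random(42)

# Simulated data (an assumption): a binary predictor shifts the outcome,
# so the outcome is bimodal while the errors are normal.
group = [rng.randint(0, 1) for _ in range(2000)]
y = [0.2 + 0.6 * g + rng.gauss(0, 0.08) for g in group]

# Ordinary least squares of y on group, computed by hand.
n = len(y)
mean_g, mean_y = sum(group) / n, sum(y) / n
slope = (sum((g - mean_g) * (yi - mean_y) for g, yi in zip(group, y))
         / sum((g - mean_g) ** 2 for g in group))
intercept = mean_y - slope * mean_g
residuals = [yi - (intercept + slope * g) for g, yi in zip(group, y)]

# The outcome has a large spread because of its two modes;
# the residuals are far tighter and centred on zero.
print(statistics.stdev(y), statistics.stdev(residuals))
```

If no predictor in the model accounts for the two modes, as may be the case when the bimodality is a feature of the score itself, the residuals would be expected to stay bimodal, which is why they have to be inspected after fitting.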

Third, if the residuals are not normally distributed, you have several choices:

  1. Multinomial logistic regression with the four categories you list
  2. Ordinal logistic regression with "mixed" excluded
  3. Looking at each category separately
  4. Some sort of robust regression

Before doing any of these, my impulse would be to look at the variables graphically, with density plots and possibly quantile normal plots.
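The coordinates of a quantile normal plot can be computed with the standard library alone and then passed to any plotting tool. This is a minimal sketch on simulated values (not EQ-5D data); `normal_qq_points` and `pearson` are hypothetical helper names, and the correlation of the points is used only as a rough numeric summary of how straight the plot is.

```python
import random
from statistics import NormalDist, mean, stdev


def normal_qq_points(values):
    """Return (theoretical, observed) quantiles for a normal QQ plot."""
    n = len(values)
    observed = sorted(values)
    fitted = NormalDist(mean(values), stdev(values))
    theoretical = [fitted.inv_cdf((i + 0.5) / n) for i in range(n)]
    return theoretical, observed


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


# Simulated values (an assumption): one normal sample, one bimodal sample.
rng = random.Random(7)
normal_like = [rng.gauss(0.5, 0.3) for _ in range(1000)]
bimodal = [rng.choice((0.2, 0.8)) + rng.gauss(0, 0.05) for _ in range(1000)]

r_normal = pearson(*normal_qq_points(normal_like))
r_bimodal = pearson(*normal_qq_points(bimodal))
print(round(r_normal, 3), round(r_bimodal, 3))
```

Points that fall close to a straight line suggest approximate normality; a bimodal sample produces an S-shaped pattern and a visibly lower correlation. The same function can be applied to model residuals rather than raw values.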
