Solved – How to test hypothesis for group differences

hypothesis testing, t-test

I'm analyzing the results of an experiment. Unfortunately, the experiment was very badly designed, but this is what I have.

In this experiment there are two groups of patients: healthy controls and patients with disease X. The sick people came for three visits: during the first visit they took no medication at all, during the second visit they took Type 1 medication, and during the third visit they took Type 2 medication. At each visit, blood samples were taken from the sick people for testing. The control group of healthy subjects came for only one visit, during which they also had blood tests. All blood tests measured the amount of Y protein.
To summarize, I have four kinds of Y protein samples: from healthy people; from sick people who did not receive medication; from sick people on Type 1 medication; and from sick people on Type 2 medication.
I have three hypotheses that I would like to test:

  1. That there is a clear statistical difference between the sick and healthy people across all the visits; that is, in terms of protein amounts, neither type of medication raised the level of protein to that of the healthy people.
  2. That the medication did improve the level of protein in sick people, so that the amount of protein is statistically different between sick people on medication and sick people not on medication.
  3. That there is no statistical difference in the amount of protein measured between the two types of medication.

My question is: how should I design my t-tests to test these three hypotheses (contrasts?) while avoiding overlapping tests (orthogonal?)? I'm puzzled; any help would be appreciated!

Best Answer

The problem is that, as you say, this is a very poorly designed experiment. You have no control group of sick people who didn't get medication; no group of sick people who got Type 1 but not Type 2; and no group who got Type 2 and not Type 1. I think that no amount of statistics will let you reliably test your second and third hypotheses. For example, if you find that their protein levels have changed after they get Type 2 treatment, you will have no way of deciding if the change comes from a delayed effect from Type 1, or just a general natural effect from time. So I won't offer any suggestions for testing those hypotheses as any result will be misleading.

You can test your first hypothesis if and only if you are confident that people do not get better without treatment. You cannot conclude this from your experiment, so you would need to know it from other sources, e.g. clinical experience with this illness showing that people do not get better naturally. I've no idea if this is realistic or not.

Assuming the condition in the above paragraph is correct, I would measure the difference in the sick people's protein levels at the end of the experiment (after they got both treatments) from their protein levels at the beginning (when they turned up sick but before getting any treatment).

You first look for evidence that the protein levels have increased by a positive amount over this period. This would be a one-sided t test, based on the per-patient differences (hopefully improvements) measured above, comparing their mean to zero.
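As a rough illustration of that first step, here is a minimal sketch in Python using only the standard library. The function name `paired_t_statistic` and all the numbers are made up for illustration; they are not from the experiment. It only computes the t statistic and its degrees of freedom. To get a one-sided p-value you would still compare the statistic against a t distribution with those degrees of freedom, using a table or a statistics package.

```python
import math
import statistics


def paired_t_statistic(before, after):
    """Return (t, df) for the per-patient differences after - before, tested against zero."""
    if len(before) != len(after):
        raise ValueError("need one 'before' and one 'after' value per patient")
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two patients")
    sd_diff = statistics.stdev(diffs)  # sample SD, divides by n - 1
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    standard_error = sd_diff / math.sqrt(n)
    return statistics.mean(diffs) / standard_error, n - 1


# Hypothetical protein levels for the same five sick patients (same order in both lists).
baseline_visit = [10.0, 12.0, 9.0, 11.0, 13.0]  # first visit, no medication
final_visit = [12.0, 15.0, 10.0, 14.0, 15.0]    # last visit, after both treatments

paired_t, paired_df = paired_t_statistic(baseline_visit, final_visit)
print(f"t = {paired_t:.3f} on {paired_df} degrees of freedom")
```

Because the hypothesis is one-sided (levels increased), only a large positive t counts as evidence; a large negative t would point the other way.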

The second part of your hypothesis was that the improvement brings the sick people up to the level of the well people. Assume there is no controversy about the fact that the illness reduces protein levels in the first place (as this wasn't one of the hypotheses you wanted to check). In this case, compare the average protein level in the sick group at the end of the experiment with the average protein level in the well group. Again, this is a one-sided t test (assuming protein levels are normally distributed), but this time it compares the two group averages, whereas the test in the previous paragraph compared the average improvement to zero.
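The two-group comparison can be sketched the same way. Note one assumption of mine: I use the Welch (unequal-variance) form of the two-sample t statistic here, since nothing says the two groups have the same spread; the classic pooled-variance version would be an equally valid reading of "t test". The function name `welch_t_statistic` and the data are again hypothetical.

```python
import math
import statistics


def welch_t_statistic(group_a, group_b):
    """Return (t, df) for mean(group_a) - mean(group_b) using Welch's approximation."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two values")
    # Per-group squared standard errors (sample variance divided by group size).
    se2_a = statistics.variance(group_a) / n_a
    se2_b = statistics.variance(group_b) / n_b
    if se2_a + se2_b == 0:
        raise ValueError("both groups have zero variance; t is undefined")
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


# Hypothetical protein levels: sick patients at their last visit vs. healthy controls.
sick_final = [12.0, 15.0, 10.0, 14.0, 15.0]
healthy = [16.0, 18.0, 17.0, 19.0, 15.0, 17.0]

welch_t, welch_df = welch_t_statistic(sick_final, healthy)
print(f"t = {welch_t:.3f} on about {welch_df:.1f} degrees of freedom")
```

With the sick group as the first argument, a clearly negative t suggests the sick group's average is still below the healthy group's. As before, the one-sided p-value comes from the t distribution with the computed degrees of freedom.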

I don't think the set of measurements after treatment 1 but before treatment 2 can tell us anything.

You will find it easier to analyse this in R than Matlab, I think - R has many more statistical functions built in and ready to go for the user. However, if my answer above is right, you only need to do t-tests, which are pretty straightforward. I would advocate some graphical data analysis as well - if only to check for plausibility, outliers, distributions, etc - which will certainly be easier in R.
