[Math] Finding p value matlab

MATLAB, statistics

I'm wondering how to solve the following problem with MATLAB. I tried applying ttest and ttest2 to the data, but my answer is not correct and I can't figure out why. Help would be much appreciated. Thanks!

Microbial activity was measured on 32 samples, giving the following measurements (in mg per sample)

[229, 250, 202, 251, 287, 193, 294, 221, 219, 245, 291, 258, 180, 163, 162, 224, 245, 161, 226, 160, 250, 219, 237, 199, 260, 295, 196, 200, 178, 181, 236, 152]

Then a lime treatment was applied to each of the 32 samples (to balance soil pH), and microbial activity was measured again on the same 32 samples, giving the following measurements (in mg per sample):

[226, 250, 221, 267, 304, 205, 313, 236, 206, 254, 284, 259, 179, 164, 169, 215, 264, 166, 233, 170, 270, 215, 238, 186, 274, 297, 187, 198, 194, 169, 254, 137]

Use Matlab to test for evidence that the lime treatment increased microbial activity, at the 0.1 significance level.

Find a P-value for this test (to 3 decimal places)

Best Answer

I do not have access to Matlab, so I can't help you with syntax for that package.

P-value of one-sided paired t test.

However, assuming the data are nearly normal, I agree with the comment by @Raskolnikov that you need a paired t test. Specifically, a paired test of $H_0: \mu_a - \mu_b = \mu_D = 0$ against the one-sided alternative $H_a: \mu_D > 0,$ where $\mu_a$ and $\mu_b$ are the mean activity after and before the lime treatment. Results from R statistical software (slightly edited for relevance) are as follows:

x.a = c(226, 250, 221, 267, 304, 205, 313, 236, 206, 254, 284, 259, 
        179, 164, 169, 215, 264, 166, 233, 170, 270, 215, 238, 186, 
        274, 297, 187, 198, 194, 169, 254, 137)
x.b = c(229, 250, 202, 251, 287, 193, 294, 221, 219, 245, 291, 258, 
        180, 163, 162, 224, 245, 161, 226, 160, 250, 219, 237, 199, 
        260, 295, 196, 200, 178, 181, 236, 152)
d = x.a-x.b
t.test(d, alternative = "greater")

        One Sample t-test

data:  d
t = 2.2296, df = 31, p-value = 0.01658
alternative hypothesis: true mean is greater than 0
sample estimates:
mean of d 
    4.375 

So the p-value is 0.01658. The test statistic is $T = \frac{\bar D - 0}{S_D/\sqrt{32}} = 2.2296,$ and the (one-sided) p-value 0.01658 is the probability to the right of $T$ under the density curve of Student's t distribution with 31 degrees of freedom. Intermediate computations:

mean(d);  sd(d)
## 4.375      # sample mean of differences
## 11.09999   # sample SD of differences

mean(d)/(sd(d)/sqrt(32))
## 2.229619   # test statistic

1 - pt(2.2296, 31)
## 0.01657741 # p-value

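Because the p-value is below the 0.1 significance level specified in the question, you reject $H_0$ and conclude that there is evidence that the lime treatment increased microbial activity. (You would also reject at the 5% level, but not at the 1% level.) To 3 decimal places, the p-value is 0.017.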
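The question asks for the P-value to 3 decimal places and a decision at the 0.1 level. A minimal R sketch of that last step follows, assuming the data are entered exactly as in the question; the names `alpha`, `res`, `p3`, and `reject` are introduced here for illustration only. It calls `t.test` with `paired = TRUE`, which is equivalent to the one-sample test on the differences shown above.

```r
# Data copied from the question: x.a = after lime, x.b = before lime.
x.a <- c(226, 250, 221, 267, 304, 205, 313, 236, 206, 254, 284, 259,
         179, 164, 169, 215, 264, 166, 233, 170, 270, 215, 238, 186,
         274, 297, 187, 198, 194, 169, 254, 137)
x.b <- c(229, 250, 202, 251, 287, 193, 294, 221, 219, 245, 291, 258,
         180, 163, 162, 224, 245, 161, 226, 160, 250, 219, 237, 199,
         260, 295, 196, 200, 178, 181, 236, 152)

alpha <- 0.1  # significance level stated in the question

# Paired, one-sided t test of H0: mu_D = 0 against Ha: mu_D > 0.
res <- t.test(x.a, x.b, paired = TRUE, alternative = "greater")

p3     <- round(res$p.value, 3)   # P-value to 3 decimal places
reject <- res$p.value < alpha     # TRUE means H0 is rejected at level alpha

print(p3)      # 0.017
print(reject)  # TRUE
```

A likely reason a two-sample test (such as MATLAB's `ttest2`) gives a different answer is that it ignores the pairing of measurements on the same 32 samples.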

In the figure below, the p-value is the area under the curve to the right of the vertical broken line.

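[Figure not reproduced: density curve of Student's t distribution with 31 degrees of freedom, with a vertical broken line at $T = 2.2296$.]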

Notes: (1) There is some indication that the differences may not be normal. They barely fail a Shapiro-Wilk test (p-value $\approx 0.05)$ and a normal probability plot reveals that the sample is more short-tailed than normal. Even so, for a sample size as large as $n = 32,$ the t test should be reliable.
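The normality checks mentioned in Note (1) can be sketched in R as follows. This is only an illustration using the base functions `shapiro.test`, `qqnorm`, and `qqline`; the variable name `sw` is an assumption introduced here.

```r
# Data copied from the question: x.a = after lime, x.b = before lime.
x.a <- c(226, 250, 221, 267, 304, 205, 313, 236, 206, 254, 284, 259,
         179, 164, 169, 215, 264, 166, 233, 170, 270, 215, 238, 186,
         274, 297, 187, 198, 194, 169, 254, 137)
x.b <- c(229, 250, 202, 251, 287, 193, 294, 221, 219, 245, 291, 258,
         180, 163, 162, 224, 245, 161, 226, 160, 250, 219, 237, 199,
         260, 295, 196, 200, 178, 181, 236, 152)
d <- x.a - x.b   # paired differences

# Shapiro-Wilk test of H0: the differences come from a normal population.
sw <- shapiro.test(d)
print(sw$p.value)

# Normal probability (Q-Q) plot; points bending away from the reference
# line at both ends indicate tails that differ from normal.
qqnorm(d)
qqline(d)
```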

(2) The traditional alternative to a t test, a Wilcoxon signed-rank test, runs into some difficulty because of tied observations among the differences. But its one-sided p-value is approximately $0.02,$ so it rejects the null hypothesis that the median difference is 0 (in favor of a positive median difference) at the 10% and 5% levels, but not at the 1% level.
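A hedged sketch of the signed-rank test in Note (2), using R's `wilcox.test`. Setting `exact = FALSE` is a choice made here: because of the ties (and one zero difference, which R drops), an exact p-value is not available, so the normal approximation is requested explicitly. The name `w` is illustrative.

```r
# Data copied from the question: x.a = after lime, x.b = before lime.
x.a <- c(226, 250, 221, 267, 304, 205, 313, 236, 206, 254, 284, 259,
         179, 164, 169, 215, 264, 166, 233, 170, 270, 215, 238, 186,
         274, 297, 187, 198, 194, 169, 254, 137)
x.b <- c(229, 250, 202, 251, 287, 193, 294, 221, 219, 245, 291, 258,
         180, 163, 162, 224, 245, 161, 226, 160, 250, 219, 237, 199,
         260, 295, 196, 200, 178, 181, 236, 152)
d <- x.a - x.b   # paired differences

# One-sided Wilcoxon signed-rank test on the differences.
# exact = FALSE requests the normal approximation (with continuity
# correction by default), which is what R falls back to with ties.
w <- wilcox.test(d, alternative = "greater", exact = FALSE)
print(w$p.value)   # approximately 0.02, as stated in Note (2)
```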

(3) A one-sided simulated permutation test on paired differences gives p-value about 0.017, essentially the same as the paired t test.
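One way such a simulated permutation test could be set up is sketched below. It is not necessarily the procedure used for Note (3): the seed, the number of simulated sign assignments `B`, and the choice of the mean difference as the test statistic are all assumptions made for illustration. The idea is that under $H_0$ each difference is equally likely to be positive or negative, so signs are flipped at random.

```r
# Data copied from the question: x.a = after lime, x.b = before lime.
x.a <- c(226, 250, 221, 267, 304, 205, 313, 236, 206, 254, 284, 259,
         179, 164, 169, 215, 264, 166, 233, 170, 270, 215, 238, 186,
         274, 297, 187, 198, 194, 169, 254, 137)
x.b <- c(229, 250, 202, 251, 287, 193, 294, 221, 219, 245, 291, 258,
         180, 163, 162, 224, 245, 161, 226, 160, 250, 219, 237, 199,
         260, 295, 196, 200, 178, 181, 236, 152)
d <- x.a - x.b   # paired differences
n <- length(d)

set.seed(2024)   # arbitrary seed (assumption), only for reproducibility
B <- 1e5         # number of random sign assignments (assumption)

obs <- mean(d)   # observed mean difference

# Each row of 'signs' is one random assignment of +1/-1 to the n differences.
signs <- matrix(sample(c(-1, 1), B * n, replace = TRUE), nrow = B)
perm  <- as.vector(signs %*% d) / n   # mean difference under each assignment

# One-sided p-value: proportion of simulated means at least as large as observed.
p.perm <- mean(perm >= obs)
print(p.perm)
```

The result varies slightly with the seed and with `B`, so only the first two decimal places or so should be relied on.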