MATLAB: How to use two-tail t-test

ttest

I have two sets of data (x and y) and I want to show that x is bigger than y by using the two-tail t-test. I know that to choose the tail I need to pass 'Tail' with 'right', 'left', or 'both', but I do not understand how to use it to show this.

Best Answer

If your hypothesis is ‘greater than’ or ‘less than’, use a one-tailed test. If your hypothesis is ‘different than’, use a two-tailed test.

To code it with ttest2, for example to test that the mean of ‘x’ is less than the mean of ‘y’:

[h,p,ci,stats] = ttest2(x, y, 'Tail','left');

Examine your data and consult the documentation to determine the correct option.
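
For the question as asked (showing that x is bigger than y), the matching alternative is the right tail. The following is a minimal sketch, assuming the two samples are independent; the data are synthetic and made up purely for illustration:

```matlab
% Synthetic, illustrative data (not from the question)
rng(1);                      % fix the seed so the run is repeatable
x = 10 + randn(50, 1);       % sample assumed to have the larger mean
y =  8 + randn(50, 1);

% H0: mean(x) == mean(y)   H1: mean(x) > mean(y)   (right-tailed test)
[h, p, ci, stats] = ttest2(x, y, 'Tail', 'right');

fprintf('h = %d, p = %.3g, t = %.2f\n', h, p, stats.tstat);
```

With 'Tail','right' the confidence interval for the difference in means is one-sided, so its upper bound is Inf. If the samples are paired, the same 'Tail' option can be passed to ttest instead.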

Related Solutions

MATLAB: Hypothesis testing in matlab

A common mistake in statistics is to work backwards: running a test, or worse, several tests, and only then deciding which one to use. This practice leads to p-hacking, which some people call "bad science", but in reality it is not science at all and is a breach of the scientific method. This has recently led more than 800 academically affiliated statisticians and scientists to sign a commentary in Nature that suggests ending significance testing altogether!

ttest() vs ttest2()

"ttest and ttest2 both give significantly different results, how do you know which test to use?"

First, know what each test is testing by looking at the null hypotheses. Fortunately, the MATLAB documentation for each function states its null hypothesis directly.

For ttest(x,y)

h = ttest(x,y) returns a test decision for the null hypothesis that the data in 
x – y comes from a normal distribution with mean equal to zero and unknown 
variance, using the paired-sample t-test.

A "paired-sample t-test" means that x(n) is associated in some way with y(n). For example, x could be the test results of 100 students on day-1 of a course and y could be the test results of the same exact 100 students in the same order taking the same exam on the last day of a course. Another example: x could be duration of balancing on the left leg while y could be the duration of balancing on the right leg for 100 people with unilateral vestibular hypofunction. In both examples x(n) is associated with y(n).
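
As a rough illustration of the paired case, the sketch below uses made-up exam scores (the variable names and numbers are hypothetical). It also shows that a paired t-test is the same as a one-sample t-test on the element-wise differences:

```matlab
% Hypothetical exam scores for the same 100 students, in the same order
rng(2);
before = 60 + 10*randn(100, 1);          % day-1 scores
after  = before + 5 + 3*randn(100, 1);   % last-day scores for the same students

% Paired-sample t-test: H0 is that mean(after - before) == 0
[h, p] = ttest(after, before);

% Equivalent one-sample t-test on the differences
[hDiff, pDiff] = ttest(after - before);

fprintf('paired: h = %d, p = %.3g; on differences: h = %d, p = %.3g\n', ...
        h, p, hDiff, pDiff);
```

Because the test works on after(n) - before(n), shuffling one of the vectors would destroy the pairing and change the result.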

For ttest2(x,y)

h = ttest2(x,y) returns a test decision for the null hypothesis that the data in 
vectors x and y comes from independent random samples from normal distributions 
with equal means and equal but unknown variances, using the two-sample t-test. 
The alternative hypothesis is that the data in x and y comes from populations
with unequal means. The result h is 1 if the test rejects the null hypothesis at
the 5% significance level, and 0 otherwise.

Here "independent random samples" is key. Unlike in the paired t-test, x(n) has no more of a relationship with y(n) than with y(n+1) or y(n-1). For example, x could be the height of 100 fully grown maple trees at sea level while y could be the height of 100 fully grown maple trees at an elevation of 5000 feet. x(n) and y(n) are both maple trees but they are different trees and have no other relationship.
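
A small sketch of the independent-samples case, using invented tree heights (names and values are hypothetical). Because the samples are unrelated, they do not even need the same length. The 'Vartype','unequal' option drops the equal-variance assumption:

```matlab
% Hypothetical heights (m) of two unrelated groups of trees
rng(3);
seaLevel = 20 + 2*randn(100, 1);   % trees measured at sea level
highElev = 18 + 2*randn(80, 1);    % different trees at elevation

% Two-sample t-test assuming equal variances (the default)
[h, p] = ttest2(seaLevel, highElev);

% Same comparison without assuming equal variances
[hU, pU] = ttest2(seaLevel, highElev, 'Vartype', 'unequal');

fprintf('pooled: h = %d, p = %.3g; unequal variances: h = %d, p = %.3g\n', ...
        h, p, hU, pU);
```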

Going back to your storm data, it doesn't matter that your two vectors of data are the same length. What matters is whether the historic data are related to the future data. Since storms come and go (except on Jupiter) it's unlikely that the historicData(n) is related to the futurePrediction(n) so your samples are independent (but you must make that decision). That would point to using ttest2().

Assumptions and nonparametric tests

Lastly, don't ignore the assumptions. In both cases, it is expected that your data are normally distributed or at least close to normal. The t-test will give you a result under any distribution but it's up to you as a researcher to trust that result based on satisfying the assumptions.
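
One possible way to check the normality assumption is the Lilliefors test (lillietest in the Statistics and Machine Learning Toolbox). This is only a sketch on synthetic data, and a formal test is no substitute for also looking at a histogram or a normal probability plot:

```matlab
% Synthetic samples for illustration only
rng(4);
xNormal = randn(200, 1);        % drawn from a normal distribution
xSkewed = exprnd(1, 200, 1);    % drawn from a strongly skewed distribution

% h = 0: normality is not rejected at the 5% level
% h = 1: normality is rejected at the 5% level
hNormal = lillietest(xNormal);
hSkewed = lillietest(xSkewed);

fprintf('normal sample: h = %d, skewed sample: h = %d\n', hNormal, hSkewed);
```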

[addendum]

Star Strider mentioned a good point that I'd like to expand upon for completeness. If the data do not meet the assumptions for a t-test (mainly, if the distribution is not normal), you can use a nonparametric test that does not rely on an underlying distribution (these tests still have other assumptions!).

For an independent sample ttest **with equal variances**,

Mann-Whitney U-Test (in matlab: ranksum(x,y))

For an independent sample ttest **with unequal variances**

Kolmogorov-Smirnov test (in matlab: kstest2(x,y) for 2-sample and kstest(x) for 1-sample)

For a paired ttest

Wilcoxon Signed-Rank test (in matlab: signrank(x,y))
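
The calls for these tests look roughly as follows, sketched here on synthetic skewed data. Note the output order: ranksum and signrank return the p-value first, while kstest2 returns the decision h first. signrank is the paired (signed-rank) test and needs vectors of equal length:

```matlab
% Synthetic, non-normal data for illustration only
rng(5);
x = exprnd(1, 40, 1);
y = exprnd(1, 40, 1) + 1;

pRankSum  = ranksum(x, y);    % Mann-Whitney U (Wilcoxon rank-sum), independent samples
pSignRank = signrank(x, y);   % Wilcoxon signed-rank, paired samples
[hKS, pKS] = kstest2(x, y);   % two-sample Kolmogorov-Smirnov

fprintf('ranksum p = %.3g, signrank p = %.3g, kstest2 h = %d (p = %.3g)\n', ...
        pRankSum, pSignRank, hKS, pKS);
```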

MATLAB: Dimensions error on paired t-test

Since you’re obviously not overshadowing (that is, hiding the built-in ttest behind your own file or variable of the same name), see if you have a path problem. The way to solve that is to run these two lines:

restoredefaultpath 
rehash toolboxcache

That should solve path problems. If this doesn’t work, contact MathWorks Technical Support.
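
To check whether shadowing or a missing toolbox is involved, something like the following sketch may help. It only inspects the path and the license and changes nothing:

```matlab
% List every ttest that MATLAB can see, in order of precedence.
% A user file listed before the toolbox version would shadow it.
locations = which('ttest', '-all');
disp(locations)

% Check whether a Statistics and Machine Learning Toolbox license is available
hasStats = license('test', 'Statistics_Toolbox');
fprintf('Statistics toolbox license available: %d\n', hasStats);
```

If more than one location is listed, renaming or removing the user file at the top of the list is usually the first thing to try.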
