Using Standard Error Instead of Standard Deviation for t-Tests in R

econometrics, r, standard deviation, standard error, t-test

I have a question about t-tests. I am using the rnorm function in R (example below) to simulate two samples and then perform a t-test on the difference of their means. Normally this works by giving R the sample sizes, means, and standard deviations, but a colleague wants me to plug in the standard error where you would normally put the standard deviation. Is there ever a statistical justification for this?

For reference, the code looks like this:

one <- rnorm(9710, mean = 156958.8, sd = 3679.691)

two <- rnorm(9710, mean = 141639, sd = 2975.4)

t.test(one, two, var.equal = FALSE)

My colleague wants me to put the standard error, instead of the standard deviation, where the code reads "sd = x". I can't find a case where this would be required, but my stats knowledge has some holes, and I want to ensure that I have a strong basis for questioning their reasoning.

Thank you for taking the time to look at my problem!

Best Answer

The t-test 'uses' the standard deviation to compute the t-value, but only after dividing it by the square root of the sample size. In other words, what it really 'uses' is the standard error.
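Concretely, the Welch t-statistic is the difference in sample means divided by the combined standard error, where each group's standard error is its standard deviation divided by the square root of its sample size. Here is a minimal sketch (reusing the simulated vectors from your question; the variable names are just for illustration) that recomputes the statistic by hand and checks it against t.test():

set.seed(1)  # reproducible illustration
one <- rnorm(9710, mean = 156958.8, sd = 3679.691)
two <- rnorm(9710, mean = 141639, sd = 2975.4)

se_one <- sd(one) / sqrt(length(one))  # standard error of the first mean
se_two <- sd(two) / sqrt(length(two))  # standard error of the second mean
t_manual <- (mean(one) - mean(two)) / sqrt(se_one^2 + se_two^2)

t_manual
t.test(one, two, var.equal = FALSE)$statistic  # should match t_manual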

If you supply the standard error to a function that expects the standard deviation, it will divide by the square root of the sample size a second time, so the t-value it calculates will be inappropriately large and the p-value inappropriately small.

It seems to me that you (or your colleague) are confused by the rnorm function, as that is the only place I can see "sd =" in your code. The rnorm function samples randomly from a pseudo-population with a normal distribution with the specified mean and standard deviation. If you supply a standard error value there, the sampled population will be narrower than you intend.
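For example, here is a sketch of what would go wrong (the values below are simply your standard deviations divided by sqrt(9710), i.e. the standard errors of the two means). Feeding them to rnorm as "sd =" shrinks the spread of the simulated data by a factor of roughly sqrt(9710) ≈ 98.5, and the t-statistic from t.test is inflated by about the same factor:

se1 <- 3679.691 / sqrt(9710)  # about 37.3, the SE of the first mean
se2 <- 2975.4 / sqrt(9710)    # about 30.2, the SE of the second mean

too_narrow_one <- rnorm(9710, mean = 156958.8, sd = se1)  # SE wrongly used as sd
too_narrow_two <- rnorm(9710, mean = 141639, sd = se2)

t.test(too_narrow_one, too_narrow_two, var.equal = FALSE)  # t roughly 100x larger than it should be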

You should probably read a little about the SD and the SEM. Try here: Difference between standard error and standard deviation