Solved – No Difference for t-test with Standardized Values

data transformation, hypothesis testing, spss, standardization, t-test

I have a dataset where I am looking at two different flowers with different heights. I have standardized the heights and am running a t-test to see whether there is a difference in mean height between the two flowers.

When I run the t-test on the standardized heights (as opposed to the original heights in the data), I get the same t-statistic and p-value. I'm pretty sure this is what should be happening, but I would like an explanation of why this occurs.

Best Answer

This is what should be happening. When you standardize (z-score) the flower heights, you are just doing a linear transformation of the data.

Imagine I had Group 1 with scores 8, 10, and 12 and Group 2 with scores 4, 6, and 8. The means for Group 1 and Group 2 are 10 and 6, respectively. So the mean difference you would be testing with a t-test is 4 (i.e., 10 - 6).

But what would happen if you subtracted 4 from every score? The Group 1 and Group 2 means are now 6 and 2, respectively, but the mean difference you would be testing is still 4, and the spread within each group is untouched. So the t-test would give exactly the same result.
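Here's a quick sketch in R using those same made-up scores (the vector names g1 and g2 are just for illustration); both calls should return the same t-statistic and p-value:

g1 <- c(8, 10, 12)     # Group 1 scores from the example
g2 <- c(4, 6, 8)       # Group 2 scores from the example
t.test(g1, g2)         # t-test on the original scores
t.test(g1 - 4, g2 - 4) # Same t and p after subtracting 4 from every score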

Now, when you z-score (standardize) a variable, all you are doing is taking each raw score, subtracting the mean of the raw scores, and dividing by the standard deviation of the raw scores. Subtracting the mean shifts every score by the same constant, so, just like above, the group difference is unchanged. Dividing by the standard deviation rescales every score by the same factor, which shrinks the mean difference and the standard error by exactly the same amount, so their ratio, the t-statistic, stays the same. In other words, standardizing is just a linear transformation of the data, which is why the correlation between the raw and standardized scores is 1. Here's some R code showing that:

set.seed(1838) # Setting seed for replicability
raw <- rnorm(100,mean=10,sd=4) # Creating raw scores
z <- scale(raw) # Standardizing scores
cor(raw,z) # Looking at correlation between the two
plot(raw,z) # Plotting values

The correlation is 1, and the plot of raw against standardized scores falls on a perfectly straight line, which is exactly what a linear transformation looks like.
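And since the question is about the t-test itself, here is a similar sketch (with simulated heights and made-up flower labels, purely for illustration) confirming that running the test on raw and on standardized values gives the same t-statistic and p-value:

set.seed(1838)                             # Seed for replicability
height <- c(rnorm(50, mean = 12, sd = 3),  # Simulated heights for flower A
            rnorm(50, mean = 10, sd = 3))  # Simulated heights for flower B
flower <- rep(c("A", "B"), each = 50)      # Group labels
z_height <- as.numeric(scale(height))      # Standardize the pooled heights
t.test(height ~ flower)                    # t-test on the raw heights
t.test(z_height ~ flower)                  # Identical t and p on the standardized heights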