Causality – Why SATE Differs from Treatment Effect in Mean Differences

causality, experiment-design, inference, self-study, treatment-effect

A question from Gelman – Regression & Other Stories. This one has me a bit stumped: I've reread sections of the chapter and I still don't understand why this is the case.

I think the answer is along the lines of heterogeneity in the treatment effect across the sample observations. Some observations get the benefit of the 'full' treatment effect, whereas others do not. This is reflected in the distinct error terms generated for each equation of the data-generating process. This heterogeneity causes the SATE to be lower than the difference in means…

Additionally, if I work out the heterogeneous treatment effect (HTE) and add it to the SATE, I get a number that is close to the treatment effect. Therefore I can say that my difference-in-means estimate is biased upwards, caused by an HTE bias…

The question is below, followed by my code.

'Simulating potential outcomes: In this exercise, you will simulate an intervention study with a pre-determined average treatment effect. The goal is for you to understand the potential outcome framework, and the properties of completely randomized experiments through simulation. The setting for our hypothetical study is a class in which students take two quizzes. After quiz 1 but before quiz 2, the instructor randomly assigns half the class to attend an extra tutoring session. The other half of the class does not receive any additional help. Consider the half of the class that receives tutoring as the treated group. The goal is to estimate the effect of the extra tutoring session on average test scores for the retake of quiz 1. Assume that the stable unit treatment value assumption is satisfied.
(a) Simulating all observed and potentially observed data (omniscient mode). For this section, you are omniscient and thus know the potential outcomes for everyone. Simulate a dataset consistent with the following assumptions.
i. The average treatment effect on all the students, 𝜏, equals 5.
ii. The population size, 𝑁, is 1000.
iii. Scores on quiz 1 approximately follow a normal distribution with a mean of 65 and a standard deviation of 3.
iv. The potential outcomes for quiz 2 should be linearly related to the pre-treatment quiz score.

In particular they should take the form,

𝑦₀ = 𝛽₀ + 𝛽₁𝑥 + 0 + 𝜖₀,

𝑦₁ = 𝛽₀ + 𝛽₁𝑥 + 𝜏 + 𝜖₁,

where the intercept 𝛽₀ = 10 and the slope 𝛽₁ = 1.1. Draw the errors 𝜖₀ and 𝜖₁ independently from normal distributions with mean 0 and standard deviation 1.

(b) Calculating and interpreting average treatment effects (omniscient mode). Answer the following questions based on the data-generating process or using your simulated data.

i. What is your interpretation of 𝜏?

ii. Calculate the sample average treatment effect (SATE) for your simulated dataset.

iii. Why is SATE different from 𝜏?'


```r
library(dplyr)  # provides %>% and tibble()

n = 1000
x = rnorm(n, 65, 3)
z = sample(c(1, 0), n, replace = TRUE)
t = ifelse(z == 1, 5, 0)

y0 = 10 + 1.1*x + rnorm(n, 0, 1)
y1 = 10 + 1.1*x + t + rnorm(n, 0, 1)

# yi is the observed outcome: y1 for treated students, y0 for controls
data = tibble(x, z, y0, y1, yi = ifelse(z == 1, y1, y0))

# SATE
SATE = (data$y1 - data$y0) %>% mean()
# difference in means - ATE
ATE = mean(data[data$z == 1, ]$yi) - mean(data[data$z == 0, ]$yi)

n_t = nrow(data[data$z == 1, ])
n_c = nrow(data[data$z == 0, ])

ATT = mean(data[data$z == 1, ]$y1) - mean(data[data$z == 1, ]$y0)
ATU = mean(data[data$z == 0, ]$y1) - mean(data[data$z == 0, ]$y0)

HTE = (1 - n_t/(n_t + n_c))*(ATT - ATU)

> SATE
[1] 2.471057
> ATE
[1] 5.053033
> SATE + HTE
[1] 5.112988
```

Best Answer

You made an error in your simulation. The difference in means estimator is unbiased for the ATE. The error you made is in simulating t (i.e., $\tau$).

The question states "The average treatment effect on all the students, $\tau$, equals 5". You programmed it so that the ATT is 5 and the ATC (your ATU) is 0. You need to set t = 5 for every student in your simulation, whether or not they are assigned to treatment. Then the SATE and the difference-in-means estimate both line up with $\tau$.
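To see where the 2.47 in the question comes from, here is a short derivation under the question's coding, where the effect added to $y_1$ is $5z_i$ rather than 5. The symbol $p = n_t/n$ (the share of treated students) is introduced here only for illustration.

$$
y_{1i} - y_{0i} = 5 z_i + (\epsilon_{1i} - \epsilon_{0i}),
\qquad
\text{SATE} = \frac{1}{n}\sum_{i=1}^{n} (y_{1i} - y_{0i}) = 5p + \frac{1}{n}\sum_{i=1}^{n} (\epsilon_{1i} - \epsilon_{0i}) \approx 5 \times 0.5 = 2.5
$$

The difference in means only ever uses $y_1$ for treated students (where the 5 was added) and $y_0$ for controls, so it still comes out near 5. As the question's code defines them, $\text{SATE} = p\,\text{ATT} + (1-p)\,\text{ATU}$ and $\text{HTE} = (1-p)(\text{ATT} - \text{ATU})$, so $\text{SATE} + \text{HTE}$ is simply the ATT, which is why that sum also lands near 5.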

set.seed(18)
n = 1000 
x = rnorm(n, 65, 3 )
z = sample(c(1,0), 1000, replace = TRUE)
t = 5

y0 = 10 +1.1*x +rnorm(n, 0, 1)
y1 = 10 +1.1*x + t + rnorm(n, 0, 1)

data = data.frame(x,z,y0,y1, yi = ifelse(z == 1 , y1, y0))

#SATE
(SATE = mean(data$y1 - data$y0))
#> [1] 4.9505
#difference in means (estimate of the ATE)
(ATE = mean(data[data$z == 1, ]$yi) - mean(data[data$z == 0, ]$yi))
#> [1] 4.836275

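They differ only due to sampling error: the SATE departs from $\tau$ because the errors do not average to exactly zero in a finite sample, and the difference in means departs from the SATE because of which students happened to be assigned to treatment. The systematic treatment effect is constant; there is no heterogeneous treatment effect.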
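As a rough check of the sampling-error explanation, the sketch below holds the potential outcomes fixed and re-randomizes the assignment many times. The names `e0`, `e1`, `noise`, and `diff_in_means`, the seed, and the 2000 replications are illustrative choices rather than part of the exercise, and the assignment treats exactly half the class (as the exercise describes), which differs from the coin-flip assignment in the code above.

```r
set.seed(18)  # arbitrary seed, used only for reproducibility
n   <- 1000
tau <- 5

# Potential outcomes for every student (omniscient mode)
x  <- rnorm(n, 65, 3)
e0 <- rnorm(n, 0, 1)
e1 <- rnorm(n, 0, 1)
y0 <- 10 + 1.1 * x + e0
y1 <- 10 + 1.1 * x + tau + e1

# SATE is tau plus the sample mean of the error differences
sate  <- mean(y1 - y0)
noise <- mean(e1 - e0)
all.equal(sate, tau + noise)

# Re-randomize the assignment, keeping the potential outcomes fixed
diff_in_means <- replicate(2000, {
  z <- sample(rep(c(1, 0), each = n / 2))  # exactly half treated
  mean(y1[z == 1]) - mean(y0[z == 0])
})

# Individual estimates scatter around the SATE; their average should sit close to it
c(SATE = sate,
  mean_of_estimates = mean(diff_in_means),
  sd_of_estimates = sd(diff_in_means))
```

Any single difference in means can miss the SATE by a noticeable amount, but the average over many randomizations should sit very close to it, which is what unbiasedness means here.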
