Lee and Lemieux (p. 31, 2009) suggest the researcher to present the graphs while doing Regression discontinuity design analysis (RDD). They suggest the following procedure:
"…for some bandwidth $h$, and for some number of bins $K_0$ and
$K_1$ to the left and right of the cutoff value, respectively, the
idea is to construct bins ($b_k$,$b_{k+1}$], for $k = 1, . . . ,K =
K_0$+$K_1$, where $b_k = c−(K_0−k+1) \cdot h.$"
c=cutoff point or threshold value of assignment variable
h=bandwidth or window width.
…then compare the mean outcomes just to the left and right of the cutoff point…"
..in all cases, we also show the fitted values from a quartic regression model estimated separately on each side of the cutoff point…(p. 34 of the same paper)
My question is how do we program that procedure in Stata
or R
for plotting the graphs of outcome variable against assignment variable (with confidence intervals) for the sharp RDD.. A sample example in Stata
is mentioned here and here (replace rd with rd_obs) and a sample example in R
is here. However, I think both of these didn't implement the step 1. Note, that both have the raw data along with the fitted lines in the plots.
Sample graph without confidence variable [Lee and Lemieux,2009]
Thank you in advance.
Best Answer
Is this much different from doing two local polynomials of degree 2, one for below the threshold and one for above with smooth at $K_i$ points? Here's an example with Stata:
Alternatively, you can just save the lpoly smoothed values and standard errors as variables instead of using
twoway
. Below $x$ is the bin, $s$ is the smoothed mean, $se$ is the standard error, and $ul$ and $ll$ are the upper and lower limits of the 95% Confidence Interval for the smoothed outcome.As you can see, the lines in the first plot are the same as in the second.