Quantile Regression vs OLS for homoscedasticity

least squares, quantile regression, regression

I have a question on the slope coefficient of OLS compared to that for Quantile Regression, when facing homoscedastic error terms. The population model may look like:

$y_i = \beta_0 + \beta_{1}x_i + u_i$

with $u_i$ being iid error terms. Will the estimated slope coefficient $\hat{\beta}_{1}$ converge to the same value $\beta_{1}$ for OLS and for QR at different quantiles, even though the sample estimates $\hat{\beta}_{1}$ may well differ from one another?

Considering the convergence of QR estimators, I know that in the presence of homoscedasticity all the slope parameters for different quantile regressions will converge to the same value (as shown by Koenker 2005: 12). But I am just not sure how the convergence of the OLS coefficient $\beta_{1}$ will compare to that of the median QR (the LAD) coefficient $\beta_{1}(0.5)$ for example. Is there a proof that both will converge to the same value? My intuition tells me this should be the case.

The answer probably lies in the loss functions for OLS and QR. OLS minimizes squared residuals, while QR (for the median) minimizes absolute deviations. Because errors are squared, OLS puts more weight on outliers than QR does. But in the case of homoscedasticity, shouldn't outliers cancel each other out because positive errors are as likely as negative ones, rendering the OLS and median QR slope coefficients equivalent (at least in terms of convergence)?

Update
To check the prediction that under homoscedasticity the slope coefficients for different quantiles coincide, I ran a test in Stata. This only serves to confirm the result of Koenker (2005) mentioned earlier; the original question concerns the convergence of OLS compared with QR. I generated n = 2000 observations in Stata via:

    set obs 2000
    set seed 98034
    generate u = rnormal(0,8)
    generate x = runiform(0,50)
    generate y = 1 + x + u

For this sample I ran quantile regressions for the quantiles (0.10, 0.50, 0.90) and then tested the joint hypothesis that the slope coefficient is identical across the three quantiles, i.e.:

$H_0: \beta_1(0.1)=\beta_1(0.5)=\beta_1(0.9)$

This is the corresponding Stata code:

    sqreg y x, quantile(.1, .5, .9) reps(400)
    test [q10=q50=q90]: x

The null hypothesis $H_0$ could not be rejected at any conventional significance level. Output for the Wald test:

    F(  2,  1998) =    0.79
    Prob > F =    0.4524

This reaffirmed my thoughts, but it does not provide any theoretical guidance on whether this should always be expected.

Best Answer

> Will the estimated slope coefficient $\beta_1$ always be the same for OLS and for QR for different quantiles?

No, of course not, because the empirical loss function being minimized differs in these different cases (OLS vs. QR for different quantiles).

> I am well aware that in the presence of homoscedasticity all the slope parameters for different quantile regressions will be the same and that the QR models will differ only in the intercept.

No, not in finite samples. Here is an example taken from the help files of the quantreg package in R:

    library(quantreg)
    data(stackloss)
    rq(stack.loss ~ stack.x,tau=0.50) #median (l1) regression fit 
                                      # for the stackloss data.
    rq(stack.loss ~ stack.x,tau=0.25) #the 1st quartile

However, asymptotically they will all converge to the same true value.
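
A short sketch of why the limits coincide, under assumptions that go slightly beyond what the question states: $u_i$ is iid and independent of $x_i$, with distribution function $F$, finite variance, and a positive density at the quantile of interest. Conditioning on $x_i$ in the model from the question gives

$E[y_i \mid x_i] = \left(\beta_0 + E[u_i]\right) + \beta_1 x_i$

$Q_\tau(y_i \mid x_i) = \left(\beta_0 + F^{-1}(\tau)\right) + \beta_1 x_i$

OLS is consistent for the coefficients of the first line and QR at quantile $\tau$ for those of the second. The two lines differ only in the intercept, so both slope estimators converge to $\beta_1$. Note that the argument needs no symmetry of the errors: asymmetry only shifts $F^{-1}(0.5)$ away from $E[u_i]$, which affects the intercept but not the slope.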

> But in the case of homoscedasticity, shouldn't outliers cancel each other out because positive errors are as likely as negative ones, rendering OLS and median QR slope coefficient equivalent?

No. First, perfect symmetry of errors is not guaranteed in any finite sample. Second, minimizing the sum of squared residuals and minimizing the sum of absolute residuals will in general give different estimates in a given sample, even when the error distribution is symmetric.
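
The difference between finite-sample estimates and their common limit can be illustrated with a small simulation. The sketch below reuses the data-generating process from the question's Stata code; the sample sizes and the helper name `slopes` are arbitrary choices made here for illustration, not part of the original question or answer.

```r
library(quantreg)  # provides rq()

set.seed(98034)

# Slope estimates from OLS and from quantile regression at
# tau = 0.1, 0.5, 0.9 for one simulated sample of size n.
# Errors are iid N(0, 8^2), so the model is homoscedastic.
slopes <- function(n) {
  x <- runif(n, 0, 50)
  u <- rnorm(n, mean = 0, sd = 8)
  y <- 1 + x + u                      # true slope is 1
  fit_qr <- rq(y ~ x, tau = c(0.1, 0.5, 0.9))
  c(ols = unname(coef(lm(y ~ x))["x"]),
    q10 = coef(fit_qr)["x", 1],
    q50 = coef(fit_qr)["x", 2],
    q90 = coef(fit_qr)["x", 3])
}

# One row per sample size, one column per estimator
res <- t(sapply(c(100, 2000, 20000), slopes))
rownames(res) <- c("n=100", "n=2000", "n=20000")
print(round(res, 4))
```

Within each row the four estimates generally differ from one another, while all four columns should move closer to the true slope of 1 as n grows.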
