Your dependent variable is growth. For economic time-series data, growth is more likely to be a stationary process, which means (among other things) that it has a constant mean. Level data, on the other hand, are usually non-stationary. Since your model is a linear regression, you assume that the true data-generating process is
$$Y_t=\alpha_0+X_{1t}\alpha_1+...+X_{kt}\alpha_k+u_t$$
where $u_t$ is white noise, $Y_t$ is stationary, and the $X_{kt}$ are non-stationary. Stationarity then implies that
$$EY_t=const=\alpha_0+\alpha_1EX_{1t}+...+\alpha_kEX_{kt}$$
Now the means $\mu_k(t):=EX_{kt}$ are functions of time, and for non-stationary processes they change with time. So you are implying that
$$\alpha_0+\alpha_1\mu_1(t)+...+\alpha_k\mu_k(t)=const$$
for some non-constant functions $\mu_k(t)$. This places quite severe restrictions on the non-stationary processes $X_{kt}$. For example, if we have only one independent variable, this restriction becomes
$$\alpha_1\mu_1(t)=const-\alpha_0$$
so either $\mu_1$ is constant or $\alpha_1$ is zero. In the first case this contradicts the presumption that $X_{1t}$ is non-stationary; in the second case the regression model is of no use.
This is why, in general, it is not a good idea to mix levels and growth rates in a regression, unless you are really sure that they are all stationary.
Another problem with time-series regression is that for a certain class of non-stationary processes the regression can be spurious. In that case you cannot trust the least-squares estimates $\hat{\alpha}_k$: in a spurious regression their distribution is not normal and does not tend to normality, so the usual regression statistics do not apply. For example, you can find that $\alpha_k$ is significantly non-zero when it is actually zero. So before running the regression it is always a good idea to test whether your variables are integrated, using some variant of the Dickey-Fuller test. I strongly suspect that the Dow Jones index is an integrated process.
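To make the idea concrete, here is a bare-bones sketch of the Dickey-Fuller regression in Python with NumPy, on simulated data of my own (in practice you would use a proper implementation such as `statsmodels.tsa.stattools.adfuller`, which handles lags and exact critical values; the rough 5% critical value of $-2.86$ for the constant-only case is from the standard DF tables):

```python
import numpy as np

def dickey_fuller_t(y):
    """t-statistic on rho in the regression dy_t = a + rho * y_{t-1} + e_t.

    A t-statistic well below the ~5% Dickey-Fuller critical value of -2.86
    (constant, no trend) is evidence against a unit root, i.e. against the
    series being integrated.
    """
    dy = np.diff(y)
    X = np.column_stack([np.ones(len(dy)), y[:-1]])
    beta, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ beta
    sigma2 = resid @ resid / (len(dy) - 2)          # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)           # OLS covariance matrix
    return beta[1] / np.sqrt(cov[1, 1])             # t-stat on rho

rng = np.random.default_rng(42)
random_walk = np.cumsum(rng.normal(size=500))  # integrated, like a level series
white_noise = rng.normal(size=500)             # stationary, like a growth series

print(dickey_fuller_t(random_walk))  # typically above -2.86: cannot reject a unit root
print(dickey_fuller_t(white_noise))  # far below -2.86: looks stationary
```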
Now, as others have pointed out, heteroscedasticity in an independent regression variable is harmless. Problems arise if the regression errors are heteroscedastic: the least-squares estimates are then still consistent but inefficient, and the standard errors should be adjusted for hypothesis testing.
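As a minimal sketch of that adjustment, on hypothetical data of my own, here is White's heteroscedasticity-consistent (HC0) sandwich estimator computed by hand next to the classical standard errors (in practice something like `cov_type='HC1'` in a regression package does this for you):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
u = rng.normal(size=n) * (0.5 + np.abs(x))   # error variance grows with |x|
y = 1.0 + 2.0 * x + u

XtX_inv = np.linalg.inv(X.T @ X)
beta = XtX_inv @ X.T @ y                     # OLS: still consistent here
e = y - X @ beta

# Classical standard errors (assume homoscedastic errors)
se_ols = np.sqrt(np.diag(XtX_inv * (e @ e / (n - 2))))

# White's HC0 sandwich: (X'X)^{-1} X' diag(e^2) X (X'X)^{-1}
meat = X.T @ (X * (e**2)[:, None])
se_hc0 = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))

print(se_ols, se_hc0)  # the robust slope SE is noticeably larger here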
Take a look at McCullagh and Nelder (1989), Generalized Linear Models, 2nd ed., Section 2.5 (pp. 40-43), on iteratively reweighted least squares.
Let $y$ be the 0/1 outcome and let $\eta = g(\mu)$ be the link function. You never calculate $g(y)$ directly, but work with an adjusted dependent variable
$$z = \hat{\eta}_0 + (y-\hat{\mu}_0) \left(\frac{d\eta}{d\mu}\right)_0$$
where $\hat{\eta}_0$ is the current estimate of the linear predictor, $X\hat{\beta}_0$, and $\hat{\mu}_0 = g^{-1}(\hat{\eta}_0)$. So that avoids the problem with $g(0)$ and $g(1)$ being $\pm \infty$.
For the logit link, $\eta = \ln[\mu / (1-\mu)]$, you'll find that $d\eta/d\mu = 1/[\mu(1-\mu)]$ and so you would have
$$z = \hat{\eta}_0 + \frac{y-\hat{\mu}_0}{\hat{\mu}_0 (1 - \hat{\mu}_0)}$$
You further calculate weights
$$w_0^{-1} = \left(\frac{d\eta}{d\mu}\right)^2_0 v_0$$
where $v_0 = V(\mu_0)$ comes from the mean-variance relationship, which for the binary case is $V(\mu) = \mu(1-\mu)$. For the logit link, since $d\eta/d\mu = 1/[\mu(1-\mu)]$, you end up with weights $w_0 = \mu_0(1-\mu_0)$.
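As a tiny numeric check (toy numbers of my own), the adjusted dependent variable and the weights stay finite even though $y \in \{0, 1\}$, because $\hat{\mu}_0$ is strictly inside $(0, 1)$:

```python
import numpy as np

y = np.array([0.0, 1.0, 1.0, 0.0])
mu0 = np.array([0.25, 0.75, 0.40, 0.60])   # current fitted values, strictly in (0, 1)
eta0 = np.log(mu0 / (1 - mu0))             # current linear predictor, g(mu0)

z = eta0 + (y - mu0) / (mu0 * (1 - mu0))   # adjusted dependent variable: all finite
w = mu0 * (1 - mu0)                        # IRLS weights for the logit link
print(z)
print(w)
```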
A key concern is the starting values. You might look at the R source code to see what it does. I wrote down in a notebook to start with $\tilde{\mu} = 1/4$ if $y = 0$ and $\tilde{\mu} = 3/4$ if $y = 1$, but I didn't include a source.
To spell out the iterative algorithm a bit more, focusing on the logit link:
At the start you do the following:
- Start with initial "fitted" values, say $\hat{\mu}^{(0)}_i = $ 1/4 or 3/4 according to whether $y_i = $ 0 or 1
- Calculate $\hat{\eta}^{(0)}_i = \ln[\hat{\mu}^{(0)}_i/(1-\hat{\mu}^{(0)}_i)]$
- Calculate $z^{(0)}_i = \hat{\eta}^{(0)}_i + [y_i-\hat{\mu}^{(0)}_i]/[\hat{\mu}^{(0)}_i (1 - \hat{\mu}^{(0)}_i)]$
- Calculate the weights $w^{(0)}_i = \hat{\mu}^{(0)}_i (1-\hat{\mu}^{(0)}_i)$
- Regress the $z^{(0)}_i$ on $X$ using weights $w^{(0)}_i$, to get initial estimates $\hat{\beta}^{(0)}$
Then, at each iteration, you do the following:
- Calculate $\hat{\eta}^{(s)}_i = X \hat{\beta}^{(s-1)}$
- Calculate $\hat{\mu}^{(s)}_i = \exp(\hat{\eta}^{(s)}_i)/[1+\exp(\hat{\eta}^{(s)}_i)]$
- Calculate $z^{(s)}_i = \hat{\eta}^{(s)}_i + [y_i-\hat{\mu}^{(s)}_i]/[\hat{\mu}^{(s)}_i (1 - \hat{\mu}^{(s)}_i)]$
- Calculate the weights $w^{(s)}_i = \hat{\mu}^{(s)}_i (1-\hat{\mu}^{(s)}_i)$
- Regress the $z^{(s)}_i$ on $X$ using weights $w^{(s)}_i$, to get revised estimates $\hat{\beta}^{(s)}$
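The algorithm above can be sketched in Python with NumPy (the simulated data and the convergence tolerance are my own choices; the starting values $1/4$ and $3/4$ are the ones mentioned earlier):

```python
import numpy as np

def irls_logistic(X, y, n_iter=25, tol=1e-10):
    """Fit logistic regression by IRLS with the logit link."""
    mu = np.where(y == 1, 0.75, 0.25)        # initial fitted values
    eta = np.log(mu / (1 - mu))              # initial linear predictor
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        w = mu * (1 - mu)                    # IRLS weights
        z = eta + (y - mu) / w               # adjusted dependent variable
        WX = X * w[:, None]
        beta_new = np.linalg.solve(X.T @ WX, X.T @ (w * z))  # weighted LS
        if np.max(np.abs(beta_new - beta)) < tol:
            beta = beta_new
            break
        beta = beta_new
        eta = X @ beta                       # update predictor and fitted values
        mu = 1 / (1 + np.exp(-eta))
    return beta

# Simulated example
rng = np.random.default_rng(0)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=n)])
p = 1 / (1 + np.exp(-(X @ np.array([-0.5, 1.5]))))
y = (rng.random(n) < p).astype(float)
print(irls_logistic(X, y))  # should be close to (-0.5, 1.5)
```

At convergence the score equations $X^\top(y - \hat{\mu}) = 0$ hold, which is a handy way to check the implementation.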
This is all just for regular logistic regression. For the local logistic regression version, there is some discussion in Chapter 4 of Loader (1999) Local regression and likelihood (but frankly, I didn't really follow it).
A Google search for "local logistic regression IRLS" revealed these notes from Patrick Breheny, which say (p. 8):

> The weight given to an observation $i$ in a given iteration of the IRLS algorithm is then a product of the weight coming from the quadratic approximation to the likelihood and the weight coming from the kernel ($w_i = w_{1i} w_{2i}$)
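Putting the two weights together, here is a hypothetical sketch of one local fit at a target point $x_0$: the kernel weights multiply the IRLS weights in the weighted least-squares step, as in the quoted note (the Gaussian kernel, the local-linear predictor, and the fixed iteration count are my choices, not Breheny's):

```python
import numpy as np

def local_logistic(x, y, x0, bandwidth, n_iter=25):
    """Estimate P(y=1 | x=x0) by kernel-weighted (local) logistic IRLS."""
    w_kern = np.exp(-0.5 * ((x - x0) / bandwidth) ** 2)  # w_{2i}: kernel weight
    X = np.column_stack([np.ones_like(x), x - x0])       # local-linear design
    mu = np.where(y == 1, 0.75, 0.25)
    eta = np.log(mu / (1 - mu))
    for _ in range(n_iter):
        w_irls = mu * (1 - mu)                           # w_{1i}: IRLS weight
        w = w_kern * w_irls                              # w_i = w_{1i} w_{2i}
        z = eta + (y - mu) / w_irls
        WX = X * w[:, None]
        beta = np.linalg.solve(X.T @ WX, X.T @ (w * z))
        eta = X @ beta
        mu = 1 / (1 + np.exp(-eta))
    # The intercept is the estimated linear predictor at x0
    return 1 / (1 + np.exp(-beta[0]))

rng = np.random.default_rng(1)
x = rng.uniform(-2, 2, size=400)
p = 1 / (1 + np.exp(-np.sin(2 * x)))     # non-linear true probability curve
y = (rng.random(400) < p).astype(float)
print(local_logistic(x, y, x0=0.5, bandwidth=0.5))
```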
Consider not doing stepwise regression, which is a good way to almost ensure biased results:
Malek, M. H. and Coburn, D. E. B. J. W. (2007). On the inappropriateness of stepwise regression analysis for model building and testing. European Journal of Applied Physiology, 101(2):263–264.
Steyerberg, E. W., Eijkemans, M. J., and Habbema, J. D. F. (1999). Stepwise selection in small data sets: a simulation study of bias in logistic regression analysis. Journal of Clinical Epidemiology, 52(10):935–942.
Whittingham, M., Stephens, P., Bradbury, R., and Freckleton, R. (2006). Why do we still use stepwise modelling in ecology and behaviour? Journal of Animal Ecology, 75(5):1182–1189.