I would like to ask a question. I am using a logit model to investigate whether certain variables raise the risk of currency crises. I have yearly data from 1980 for many countries (an unbalanced panel). The dependent variable is a dummy equal to 1 if a currency crisis started (according to my definition) and 0 otherwise. The explanatory variables are motivated by theory, such as current account/GDP, net foreign assets (NFA)/GDP, loans/GDP and so on. All are lagged by one period. I am using robust standard errors, which should be consistent under heteroskedasticity. However, some variables, for example loans/GDP or NFA/GDP, are not stationary (according to a panel test). Does this matter? I have not seen any paper that tests for stationarity before estimating a logit/probit. It also seems intuitive to me that it does not matter. If I am testing whether a variable increases the risk of a crisis, it should not be a problem that this variable is rising permanently. On the contrary, a rising variable steadily raises the risk of a crisis, and when it reaches some unsustainable level, the crisis occurs. Could you please tell me whether I am right?
Solved – Does non-stationarity in logit/probit matter
logistic, probit, stationarity, time series
Related Solutions
I think a better way to see the marginal effect of a given variable, say $X_j$, is to produce a scatter plot with the predicted probability on the vertical axis and $X_j$ on the horizontal axis. This is the most "layman" way I can think of to show how influential a given variable is. No maths, just pictures. If you have a lot of data points, then a boxplot or a scatterplot smoother may help to show where most of the data is (as opposed to just a cloud of points).
Not sure how "Layman" the next section is, but you may find it useful.
If we look at the marginal effect, call it $m_j$, noting that $g(p)=\sum_kX_k\beta_k$, we get
$$m_j=\frac{\partial p}{\partial X_j}=\frac{\beta_j}{g'\left[g^{-1}(X^T\beta)\right]}=\frac{\beta_j}{g'(p)}$$
So the marginal effect depends on the estimated probability and the gradient of the link function in addition to the beta. The dividing by $g'(p)$, comes from the chain rule for differentiation, and the fact that $\frac{\partial g^{-1}(z)}{\partial z}=\frac{1}{g'\left[g^{-1}(z)\right]}$. This can be shown by differentiating both sides of the obviously true equation $z=g\left[g^{-1}(z)\right]$. We also have that $g^{-1}(X^T\beta)=p$ by definition. For a logit model, we have $g(p)=\log(p)-\log(1-p)\implies g'(p)=\frac{1}{p}+\frac{1}{1-p}=\frac{1}{p(1-p)}$, and the marginal effect is:
$$m_j^{logit}=\beta_jp(1-p)$$
What does this mean? Well, $p(1-p)$ is zero at $p=0$ and at $p=1$, and it reaches its maximum value of $0.25$ at $p=0.5$. So the marginal effect is greatest when the probability is near $0.5$, and smallest when $p$ is near $0$ or near $1$. However, $p(1-p)$ still depends on $X_j$, so the marginal effects are complicated. In fact, because the marginal effect depends on $p$, you will get a different marginal effect for different values of $X_k,\;k\neq j$. This is possibly one good reason to just do that simple scatter plot: you don't need to choose which values of the covariates to use.
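The formula $m_j^{logit}=\beta_jp(1-p)$ can be checked numerically. The following is a minimal sketch, not part of the original answer: the coefficient vector `beta` and covariate vector `x` are made-up values chosen only for illustration, and the analytic marginal effect is compared with a central finite difference of the predicted probability.

```python
import math

def logistic(z):
    """Inverse of the logit link: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def predicted_prob(x, beta):
    """p = g^{-1}(x'beta) for the logit link."""
    return logistic(sum(b * v for b, v in zip(beta, x)))

def logit_marginal_effect(beta_j, p):
    """Analytic marginal effect: beta_j * p * (1 - p)."""
    return beta_j * p * (1.0 - p)

# Hypothetical values (assumptions for illustration only).
beta = [-1.0, 0.8, -0.5]   # intercept, coefficient on X_1, coefficient on X_2
x = [1.0, 0.5, 2.0]        # constant term, X_1, X_2
j = 1                      # index of the variable of interest

p = predicted_prob(x, beta)
analytic = logit_marginal_effect(beta[j], p)

# Central finite difference of p with respect to X_j.
h = 1e-6
x_up = list(x)
x_dn = list(x)
x_up[j] += h
x_dn[j] -= h
numeric = (predicted_prob(x_up, beta) - predicted_prob(x_dn, beta)) / (2 * h)

print(f"p = {p:.4f}, analytic = {analytic:.6f}, numeric = {numeric:.6f}")
```

Changing the other entries of `x` changes `p`, and with it the marginal effect of $X_j$, which is the dependence on $X_k,\;k\neq j$ described above.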
For a probit model, we have $g(p)=\Phi^{-1}(p)\implies g'(p)=\frac{1}{\phi\left[\Phi^{-1}(p)\right]}$ where $\Phi(.)$ is standard normal CDF, and $\phi(.)$ is standard normal pdf. So we get:
$$m_j^{probit}=\beta_j\phi\left[\Phi^{-1}(p)\right]$$
Note that this has most of the properties of the $m_j^{logit}$ marginal effect discussed earlier, and the same is true of any link function which is symmetric about $0.5$ (and sane, of course, e.g. $g(p)=\tan(\frac{\pi}{2}[2p-1])$). The dependence on $p$ is more complicated, but it still has the general "hump" shape (highest point at $0.5$, lowest at $0$ and $1$). The link function changes the maximum height (e.g. the probit maximum is $\frac{1}{\sqrt{2\pi}}\approx 0.4$, the logit maximum is $0.25$) and how quickly the marginal effect tapers towards zero.
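To see the two "hump" shapes side by side, here is a small sketch that evaluates the factor multiplying $\beta_j$ under each link over a grid of probabilities. It uses only the Python standard library (`statistics.NormalDist`); the function names and the grid are illustrative choices, not something from the original answer.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def logit_factor(p):
    """1 / g'(p) for the logit link: p * (1 - p)."""
    return p * (1.0 - p)

def probit_factor(p):
    """1 / g'(p) for the probit link: phi(Phi^{-1}(p)); requires 0 < p < 1."""
    return STD_NORMAL.pdf(STD_NORMAL.inv_cdf(p))

# Grid of interior probabilities 0.01, 0.02, ..., 0.99.
grid = [i / 100 for i in range(1, 100)]

# Location of the peak of each hump on the grid.
logit_peak = max(grid, key=logit_factor)
probit_peak = max(grid, key=probit_factor)

for p in (0.05, 0.25, 0.5, 0.75, 0.95):
    print(f"p = {p:.2f}  logit: {logit_factor(p):.4f}  probit: {probit_factor(p):.4f}")
```

Both factors peak at $p=0.5$ and are symmetric around it; multiplying by the estimated $\beta_j$ gives the marginal effect under the corresponding model.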
First of all, OLS is an estimation technique, not a model. I will assume you have a linear regression model that you would like to estimate using OLS.
Regarding non-stationarity, it is not covered under the OLS assumptions, so OLS estimates will no longer be BLUE if your data are non-stationary. In short, you do not want that. Also, it does not make sense to have a stationary variable explained by a random walk, or vice versa. A stationary process will revert to its mean while an integrated process may wander off and away, hence the two are no match for each other. This situation is known as an unbalanced regression. (Although having variables of different orders of integration in the same regression equation can make sense when there is cointegration.)
Regarding seasonality, it is also a form of non-stationarity and you should model it explicitly. When ignored, seasonality may result in undesirable outcomes and misinterpretations. For example, you may find a statistically significant relationship between two variables when the only common underlying relationship between them is seasonality; think about modelling how weather depends on ice-cream sales.
You should care about specifying the model correctly (or, more realistically, as well as you can) first and then choosing an estimation method. Perhaps your dependent variable is stationary while GDP is not stationary; then you normally cannot model how the first depends on the second; but perhaps it makes sense to ask how your dependent variable depends on changes in GDP (the first differences of GDP). Also, if you have seasonality, include some terms to account for it or adjust the data for seasonality before putting the variables into the model.
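As an illustration of what "first differences" means here, the sketch below builds a simulated random walk (a stand-in for a non-stationary level series such as GDP; the data are artificial and the seed, length, and names are arbitrary assumptions) and differences it. For a pure random walk, the differenced series is just the sequence of innovations, which is stationary by construction.

```python
import random

def first_difference(series):
    """Return the changes x[t] - x[t-1] for t = 1, ..., len(series) - 1."""
    return [curr - prev for prev, curr in zip(series, series[1:])]

# Simulated data (assumption for illustration): i.i.d. standard normal shocks.
rng = random.Random(0)
innovations = [rng.gauss(0.0, 1.0) for _ in range(200)]

# Random walk: each level is the previous level plus a new shock.
level = []
total = 0.0
for shock in innovations:
    total += shock
    level.append(total)

# Differencing removes the accumulated past and recovers the shocks.
changes = first_difference(level)

print(len(level), len(changes))
```

Note that differencing costs one observation, and the regression then answers a different question: how the dependent variable relates to changes in the regressor rather than to its level.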
Also, keep in mind that you are working with a pretty small sample (40 observations). The asymptotic properties of your estimators may not be that relevant yet; there is little room for constructing a rich model. I'm not sure if you can do much about it, but that's a different topic.
Best Answer
Whatever model you are using, the fundamentals of econometric theory should be checked and respected. Researchers like to show off very sophisticated models, but often, more or less deliberately, they forget the fundamentals of econometrics, and the results become rather ridiculous. Econometrics is no more than estimating the mean and variance of your parameters, but if the mean, variance and covariance of your variables change over time, suitable tools and analysis are required. In my opinion, probit/logit models with non-stationary data make no sense, because you are trying to fit the right-hand side of your equation (which is non-stationary) to the left-hand side, which is a binary variable. The time dynamics of your independent variables must be coherent with those of the dependent variable. If some of your regressors are non-stationary, you are misspecifying the relation; the combination of your regressors must itself be stationary. So I believe you probably have to do a two-step regression. In the first step you find a stationary relation among your variables; in the second you put this relation into your probit/logit model and estimate only one parameter.
Obviously, in the first step you must have at least two integrated variables (in the cointegration case) or at least two variables with the same type of trend. If this is not the case, you have an omitted-variables problem.
The alternative to all this is to change the scope of your analysis and transform all your regressors into stationary ones.