Even for logistic regression with a dichotomous DV, there is no exact equivalent of $R^2$ (proportion of variance explained) nor any consensus on which approximation is best. Here is Paul Allison's explanation. However, the version Allison likes best is that of Tjur:
But there’s another $R^2$, recently proposed by Tjur (2009), that I’m
inclined to prefer over McFadden’s. It has a lot of intuitive appeal,
its upper bound is 1.0, and it’s closely related to $R^2$ definitions for
linear models. It’s also easy to calculate.
The definition is very simple. For each of the two categories of the
dependent variable, calculate the mean of the predicted probabilities
of an event. Then, take the difference between those two means. That’s
it!
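Tjur's definition translates almost directly into code. A minimal sketch, assuming `y` holds the observed 0/1 outcomes and `p` the predicted probabilities from some already-fitted logistic regression (both simulated stand-ins here):

```python
import numpy as np

# Hypothetical data: y is a 0/1 outcome, p stands in for the fitted
# probabilities from any logistic regression.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
p = 1 / (1 + np.exp(-(0.5 + 1.5 * x)))   # stand-in for model-fitted probabilities
y = rng.binomial(1, p)

# Tjur's R^2: mean predicted probability among the y=1 cases
# minus the mean predicted probability among the y=0 cases.
tjur_r2 = p[y == 1].mean() - p[y == 0].mean()
print(tjur_r2)
```

With probabilities from a well-fitting model, the `y == 1` group mean is higher than the `y == 0` group mean, so the statistic lands between 0 and 1.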
Unfortunately, with more than two categories there will be more than one difference; perhaps some statistic based on these differences could be used. Allison, however, is not optimistic about this:
Another potential complaint is that the Tjur $R^2$ cannot be easily
generalized to ordinal or nominal logistic regression. For McFadden
and Cox-Snell, the generalization is straightforward.
Of those two, Allison now prefers McFadden:
Here are the details. Logistic regression is, of course, estimated by
maximizing the likelihood function. Let $L_0$ be the value of the
likelihood function for a model with no predictors, and let $L_M$ be the
likelihood for the model being estimated. McFadden’s $R^2$ is defined as
$R^2_{McF} = 1 - \ln(L_M) / \ln(L_0)$
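The two log-likelihoods are easy to compute by hand for a binary outcome. A minimal sketch, assuming `y` is 0/1 and `p_hat` stands in for fitted probabilities from a logistic regression (simulated here for illustration):

```python
import numpy as np

# Hypothetical data standing in for a fitted logistic regression.
rng = np.random.default_rng(1)
x = rng.normal(size=500)
p_hat = 1 / (1 + np.exp(-(-0.3 + 1.2 * x)))  # stand-in for model-fitted probabilities
y = rng.binomial(1, p_hat)

# ln(L_M): log-likelihood of the fitted model.
ll_model = np.sum(y * np.log(p_hat) + (1 - y) * np.log(1 - p_hat))

# ln(L_0): log-likelihood of the intercept-only model,
# where every predicted probability equals mean(y).
p0 = y.mean()
ll_null = np.sum(y * np.log(p0) + (1 - y) * np.log(1 - p0))

mcfadden_r2 = 1 - ll_model / ll_null
print(mcfadden_r2)
```

Because the null model is nested in the fitted model, $\ln(L_M) \geq \ln(L_0)$ at the MLE, so the statistic falls in $[0, 1)$.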
I don't think software availability or implementation is the reason GEE is less popular. Almost all the software I am aware of and have worked with for a while (Stata, R, SPSS) can fit GEE models. I wonder whether there is any package that can fit mixed models but cannot (or can only with difficulty) fit GEE.
As explained in a bit more detail in "Mixed Model Versus GEE estimates and which to use", I don't think it is the interpretation that matters either.
Some authors (Hubbard AE et al., 2010) even argue that the mixed model suffers from "unverifiable assumptions on the data-generating distribution".
The literature is full of the idea that "GEE allows robust inference even if a choice of correlation model is wrong or misspecified", whereas for the mixed model the "SE [are] not robust to model misspecification".
I personally suspect GEE is more computationally demanding than a mixed model, whether done by hand or in software; that may be a factor.
Best Answer
Five years late, but yes (kind of). Zheng proposed two $R^2$ analogues for GEE in 2000 (citation at bottom of answer).
Option 1
For your ordinal logistic model, assume that there is an underlying continuous latent variable that, when thresholds are applied, results in your observed ordinal $Y$. (Also assume that your software allows you to access that latent variable.)
Run a second GEE predicting that latent variable with the same predictors used in the ordinal model. From there, you can use Zheng's marginal $R^2$:
$R_{marginal}^2 = 1 - \frac{\sum_{c=1}^C \sum_{i=1}^N (Y_{ic} - \widehat{Y}_{ic})^2}{\sum_{c=1}^C \sum_{i=1}^N (Y_{ic} - \bar{Y})^2}$
where the numerator is the sum, over clusters ($c = 1, 2, \dots, C$) and observations ($i = 1, 2, \dots, N$), of the squared differences between $Y$ (your latent variable) and the fitted values from this second GEE, and the denominator is the sum of the squared differences between $Y$ and the marginal mean of $Y$.
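The computation is a clustered version of the familiar $1 - SS_{res}/SS_{tot}$. A minimal sketch, assuming `Y` is a $C \times N$ array of the latent responses and `Y_hat` the fitted values from the second GEE (both simulated stand-ins here, since the latent variable is rarely directly observable):

```python
import numpy as np

# Illustrative stand-ins: C clusters, N observations per cluster.
rng = np.random.default_rng(2)
C, N = 10, 20
x = rng.normal(size=(C, N))
Y = 2.0 + 1.5 * x + rng.normal(size=(C, N))  # stand-in for the latent responses
Y_hat = 2.0 + 1.5 * x                        # stand-in for the second GEE's fitted values

# Zheng's marginal R^2: both sums run over clusters c and observations i.
ss_res = np.sum((Y - Y_hat) ** 2)            # deviations from the fitted values
ss_tot = np.sum((Y - Y.mean()) ** 2)         # deviations from the marginal mean of Y
r2_marginal = 1 - ss_res / ss_tot
print(r2_marginal)
```

Note that nothing constrains `ss_res` to be below `ss_tot`, which is why a negative value is possible when the model predicts worse than the marginal mean.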
Option 2
Ignore the ordered nature of your outcome variable and use Zheng's $H_{marginal}$ as a measure of "proportional reduction in entropy due to the model" where your model becomes a multinomial logistic model. $H_{marginal}$ is defined as
$H_{marginal} = 1 - \frac{\sum_{c=1}^C \sum_{i=1}^N \sum_{k=1}^K \hat{\pi}_{cik} \log(\hat{\pi}_{cik})}{CN \sum_{k=1}^K \hat{\alpha}_k \log(\hat{\alpha}_k)}$
where $\pi_{cik} = P(Y_{ci} = k \mid X)$ is the "model-based probability that a categorical response [for observation $i$ in cluster $c$] equals $k$", $\alpha_k = P(Y = k)$ is "the marginal probability of response $k$", and hats (^) indicate estimates.
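The statistic compares the model's total predictive entropy to the entropy of the marginal response distribution. A minimal sketch, assuming `pi_hat[c, i, k]` stands in for the model-based probabilities from a multinomial model and `alpha_hat[k]` for the estimated marginal response proportions (all simulated here):

```python
import numpy as np

# Illustrative stand-ins: C clusters, N observations per cluster, K categories.
rng = np.random.default_rng(3)
C, N, K = 8, 15, 3
logits = rng.normal(size=(C, N, K))
pi_hat = np.exp(logits) / np.exp(logits).sum(axis=2, keepdims=True)

# Simulate responses from those probabilities, then estimate the
# marginal category proportions alpha_hat.
flat = pi_hat.reshape(-1, K)
y = np.array([rng.choice(K, p=row) for row in flat])
alpha_hat = np.bincount(y, minlength=K) / y.size

# Numerator: sum of pi * log(pi) over all clusters, observations, categories.
model_entropy = np.sum(pi_hat * np.log(pi_hat))
# Denominator: C*N times the entropy term of the marginal distribution.
null_entropy = C * N * np.sum(alpha_hat * np.log(alpha_hat))

h_marginal = 1 - model_entropy / null_entropy
print(h_marginal)
```

When the model's per-observation predictions are sharper (lower entropy) than the marginal distribution, the ratio is below 1 and $H_{marginal}$ is positive; a model with greater predictive uncertainty than the marginal distribution yields a negative value, as noted below.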
Note that for both $R^2_{marginal}$ and $H_{marginal}$, you can obtain a "negative value when there is greater uncertainty in prediction under the model than under the null model".
Zheng, B. (2000). Summarizing the goodness of fit of generalized linear models for longitudinal data. Statistics in Medicine, 19(10), 1265-1275. doi: 10.1002/(SICI)1097-0258(20000530)19:10<1265::AID-SIM486>3.0.CO;2-U