It depends on whether you are interested in $r^2$, the squared sample correlation coefficient, or $R^2$, the squared multiple correlation coefficient used to assess the performance of regressions.
Both $r^2$ and adjusted $r^2$ are negatively biased (that is, the sample values tend to be slightly smaller than the corresponding population values), but the adjusted formula is somewhat less biased. Besides the sample size, the amount of bias depends on the value itself: values of $r^2$ near zero and one show the least bias, and those between roughly 0.6 and 0.8 show the most.
Table 1 of a paper by Zimmerman, Zumbo, and Williams (2003) illustrates the bias as a function of sample size and correlation value. Elsewhere in the paper, they present simulation data indicating that the Fisher adjustment and the Olkin and Pratt adjustment reduce this bias considerably.
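If you want to see the bias directly, a quick simulation makes the point. Below is a minimal sketch (the sample size, $\rho$, and replication count are arbitrary illustrative choices) using one commonly quoted approximate version of the Olkin and Pratt correction, $r\left(1 + \frac{1-r^2}{2(n-3)}\right)$; treat it as a demonstration, not a definitive implementation.

```python
# Sketch: negative bias of the sample correlation, and one approximate
# version of the Olkin-Pratt correction. All constants are arbitrary.
import numpy as np

rng = np.random.default_rng(0)
n, rho, reps = 20, 0.7, 20_000
cov = np.array([[1.0, rho], [rho, 1.0]])

r_vals = np.empty(reps)
for i in range(reps):
    x, y = rng.multivariate_normal([0.0, 0.0], cov, size=n).T
    r_vals[i] = np.corrcoef(x, y)[0, 1]

# One commonly quoted approximation to the Olkin-Pratt unbiased estimator.
r_adj = r_vals * (1 + (1 - r_vals**2) / (2 * (n - 3)))

print(f"population rho:  {rho}")
print(f"mean sample r:   {r_vals.mean():.4f}")  # slightly below rho
print(f"mean adjusted r: {r_adj.mean():.4f}")   # closer to rho
```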
There is also a decent amount of work on "$R^2$ shrinkage", a related phenomenon that comes up frequently in regression contexts but has the opposite sign: sample $R^2$ is positively biased, and adjustments bring it back down. Yin and Fan (2001) give a fairly comprehensive comparison of methods for estimating it, and page 3/205 has some citations to descriptions of the problem. A small simulation of the shrinkage effect follows.
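To illustrate that opposite sign, here is a small sketch (all sizes are arbitrary choices) regressing pure noise on irrelevant predictors: the sample $R^2$ sits well above zero on average, while the usual adjusted $R^2$ pulls the estimate back down.

```python
# Sketch: R^2 "shrinkage". Regressing noise on p irrelevant predictors
# still yields E[R^2] near p/(n-1); adjusted R^2 corrects toward zero.
import numpy as np

rng = np.random.default_rng(1)
n, p, reps = 30, 5, 5_000

r2_vals = np.empty(reps)
r2_adj_vals = np.empty(reps)
for i in range(reps):
    X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
    y = rng.normal(size=n)                      # y is unrelated to X
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    r2 = 1 - resid @ resid / np.sum((y - y.mean()) ** 2)
    r2_vals[i] = r2
    r2_adj_vals[i] = 1 - (1 - r2) * (n - 1) / (n - p - 1)

print("true R^2:          0.0")
print(f"mean sample R^2:   {r2_vals.mean():.4f}")    # noticeably above zero
print(f"mean adjusted R^2: {r2_adj_vals.mean():.4f}") # near zero
```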
Finally, be aware that there are many methods for adjusting $r^2$/$R^2$ (there are even multiple ($\ge3$) versions of the Olkin and Pratt adjustment formula floating around, some of which correct for the number of parameters), so it might help to be more specific about which one you have in mind.
Suppose you run a regression of $Y$ on regressor matrix $X$ with error term $\varepsilon$, i.e.
\begin{align}
Y = X\beta + \varepsilon
\end{align}
where $Y$ and $\varepsilon$ are $n\times1$ vectors, $\beta$ is a $p\times1$ vector, and $X$ is an $n\times p$ matrix. Using Ordinary Least Squares (OLS), you estimate $\beta$ as $\hat{\beta}$ and obtain the fitted values $\hat{y} = X\hat{\beta}$. Denote by $\bar{y} = n^{-1}\sum_{i=1}^n y_i$ the average of the entries of $y$.
Define the Total Sum of Squares (TSS) as $TSS:=\sum_{i=1}^n (y_i - \bar{y})^2$. This is the total squared variation of $y$ before any of it is explained by $X$. Further define the Residual Sum of Squares (RSS) as $RSS:=\sum_{i=1}^n (y_i - \hat{y}_i)^2$ and the Explained Sum of Squares (ESS) as $ESS:= \sum_{i=1}^n (\hat{y}_i - \bar{y})^2$. RSS is named this way because it is the variation of $y$ that remains unexplained after using the fitted values $\hat{y}_i$, instead of the mean, as predictors; ESS is the part of the total variation that the fitted model does explain.
$R^2$ is defined as $R^2:=1-RSS/TSS$. Only in the special case of a linear regression does this definition coincide with the square of the (estimated) correlation coefficient $r$. Finally, to answer your question, simply consult the above formula for $R^2$: because $TSS = RSS + ESS$ (an identity that holds for OLS with an intercept), one can rewrite $R^2$ as $R^2 = ESS/TSS = \frac{ESS/n}{TSS/n}$. Crucially, note that $\frac{ESS}{n}$ is the *explained* variance and $\frac{TSS}{n}$ the total variance. This way, $R^2$ can be read as the proportion of the variance (or variation) of $Y$ that $X$ explains.
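Here is a minimal numerical check of these identities, assuming OLS with an intercept; the simulated data are arbitrary.

```python
# Sketch: verify R^2 = 1 - RSS/TSS = ESS/TSS = corr(y, y_hat)^2
# for OLS with an intercept column.
import numpy as np

rng = np.random.default_rng(2)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # intercept first
beta = np.array([1.0, 2.0, -0.5])
y = X @ beta + rng.normal(size=n)

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta_hat

TSS = np.sum((y - y.mean()) ** 2)
RSS = np.sum((y - y_hat) ** 2)
ESS = np.sum((y_hat - y.mean()) ** 2)

print(f"TSS = RSS + ESS?   {np.isclose(TSS, RSS + ESS)}")
print(f"1 - RSS/TSS      = {1 - RSS / TSS:.6f}")
print(f"ESS/TSS          = {ESS / TSS:.6f}")
print(f"corr(y, y_hat)^2 = {np.corrcoef(y, y_hat)[0, 1] ** 2:.6f}")
```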
Best Answer
The coefficient of determination can be calculated in various ways, which coincide for linear regression. (If that's not true, then something in the comparison is not the coefficient of determination.) Away from that case, it gets messier. Various analogues or alternatives are often (but not always) labelled pseudo-$R^2$. Watch out also for adjusted relatives that penalise for using several predictors.
I have found the paper of Zheng and Agresti (2000) to be helpful in this territory.
Zheng and Agresti (2000) discussed the correlation between the response and the fitted or predicted response as a general measure of predictive power for generalized linear models (GLMs). This measure has the advantages of referring to the original scale of measurement, of being applicable to all types of GLM and of being familiar to many users of statistics. Preferably, it should be used as a comparative measure for different models applied to the same data set, given that restrictions on values of the response may imply limitations on its value (see e.g. Cox and Wermuth, 1992).
For an arbitrary GLM, this correlation is invariant under a location-scale transformation and it is the positive square root of the average proportion of variance explained by the predictors. However, again for an arbitrary GLM, it need not equal the positive square root of other definitions of $R^2$ (e.g. Hardin and Hilbe, 2001); and it need not be monotone increasing in the complexity of the predictors, although in practice that is common. The correlation is necessarily sensitive to outliers.
Since the predicted response is a function of the observed response, the correlation calculated from a sample may be expected to be biased upwards. A jackknifed correlation is recommended as one alternative. Zheng and Agresti provide more discussion of this point, including other estimators and a bootstrap approach to providing confidence intervals for the correlation and to estimating the degree of overfitting.
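As a rough illustration (not the authors' own code), here is a sketch of the in-sample correlation between response and fitted values for a logistic GLM, together with one plausible leave-one-out reading of the jackknifed correlation. The simulated data and the exact jackknife recipe are assumptions made for the example.

```python
# Sketch: Zheng-Agresti style correlation between response and fitted
# values for a logistic GLM, plus a leave-one-out ("jackknifed") variant.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 200
X = sm.add_constant(rng.normal(size=(n, 2)))
eta = X @ np.array([-0.3, 1.0, -0.8])
y = rng.binomial(1, 1 / (1 + np.exp(-eta)))

# In-sample correlation: biased upwards, since the fitted values are a
# function of the observed responses.
fit = sm.GLM(y, X, family=sm.families.Binomial()).fit()
r_in = np.corrcoef(y, fit.fittedvalues)[0, 1]

# Jackknifed version: predict each observation from a fit that excludes it.
loo_pred = np.empty(n)
for i in range(n):
    keep = np.arange(n) != i
    fit_i = sm.GLM(y[keep], X[keep], family=sm.families.Binomial()).fit()
    loo_pred[i] = fit_i.predict(X[i:i + 1])[0]
r_jack = np.corrcoef(y, loo_pred)[0, 1]

print(f"in-sample correlation:  {r_in:.4f}")
print(f"jackknifed correlation: {r_jack:.4f}")  # typically a bit smaller
```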
Cox, D.R. and N. Wermuth. 1992. A comment on the coefficient of determination for binary responses. American Statistician 46: 1-4.
Hardin, J. and J. Hilbe. 2001 (and later editions). Generalized Linear Models and Extensions. College Station, TX: Stata Press.
Zheng, B. and A. Agresti. 2000. Summarizing the predictive power of a generalized linear model. Statistics in Medicine 19: 1771-1781.
Note: The use of binary predictors in regression need not limit $R^2$ in linear regression. A simple example is the use of one continuous predictor and one binary predictor: as in principle the data could all lie on two straight lines, a value of 1 is achievable.
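A tiny numerical illustration of this note, with arbitrary numbers; the two lines are chosen parallel so that the additive model (intercept, continuous predictor, binary predictor) can fit them exactly.

```python
# Sketch: one continuous predictor plus one binary predictor, data lying
# exactly on two (parallel) lines, giving R^2 = 1.
import numpy as np

x = np.tile(np.linspace(0, 1, 10), 2)   # continuous predictor
d = np.repeat([0.0, 1.0], 10)           # binary predictor
y = 1.0 + 2.0 * x + 3.0 * d             # exactly on two parallel lines

X = np.column_stack([np.ones_like(x), x, d])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
r2 = 1 - resid @ resid / np.sum((y - y.mean()) ** 2)
print(f"R^2 = {r2:.6f}")  # 1.000000
```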