According to the Wikipedia article, the point-biserial correlation is just the Pearson correlation where one variable is continuous and the other is dichotomous (e.g. Yes/No, Male/Female). However, the article later introduces the rank-biserial correlation, which is a correlation measure between a dichotomous variable and an ordinal/ranked variable:
$r_{rb}=2(M_1-M_0)/n$
where $M_1$ and $M_0$ are the mean ranks in the continuous/ordinal variable, in groups "1" and "0", respectively, and $n=n_1+n_0$ is the total sample size.
What is the difference? Is rank-biserial correlation related to Pearson correlation?
Best Answer
The Wikipedia formula for the "rank-biserial correlation" that you show was introduced by Glass (1966), and it is not equivalent to the usual Pearson $r$ computed on ranked data (that is, to the $r$ which is then Spearman's $\rho$).
Let $Y$ be the quantitative variable already converted to ranks, and let $X$ be the dichotomous variable with groups coded 1 and 0 (total sample size $n=n_1+n_0$).
Knowing the formula for Pearson $r$ and observing the following equivalences that hold for ranks (assuming no ties) paired with a 1-0 dichotomy,
$\sum XY= \sum Y_{x=1}=R_1$ (Sum of ranks in group coded 1),
$\sum X = \sum X^2 = n_1$,
$\sum Y = n(n+1)/2$,
$\sum Y^2 = n(n+1)(2n+1)/6$,
substitute them to get the Pearson $r$ (= Spearman $\rho$) formula in the form:
$r= \frac{2R_1-n_1(n+1)}{\sqrt{n_1n_0(n^2-1)/3}}$.
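The intermediate steps can be spelled out as follows. This is a sketch that starts from the usual computational form of Pearson $r$ and assumes there are no tied ranks:
$r=\frac{n\sum XY-\sum X\sum Y}{\sqrt{[n\sum X^2-(\sum X)^2][n\sum Y^2-(\sum Y)^2]}}$,
$n\sum XY-\sum X\sum Y = nR_1-n_1n(n+1)/2 = \frac{n}{2}[2R_1-n_1(n+1)]$,
$n\sum X^2-(\sum X)^2 = nn_1-n_1^2 = n_1n_0$,
$n\sum Y^2-(\sum Y)^2 = n^2(n+1)(2n+1)/6-n^2(n+1)^2/4 = n^2(n^2-1)/12$,
so that
$r=\frac{\frac{n}{2}[2R_1-n_1(n+1)]}{n\sqrt{n_1n_0(n^2-1)/12}}=\frac{2R_1-n_1(n+1)}{\sqrt{n_1n_0(n^2-1)/3}}$.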
Now substitute into Glass's "rank-biserial correlation" to obtain:
$r_{rb}= \frac{2R_1-n_1(n+1)}{n_1n_0}$.
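For completeness, here is a sketch of that substitution, using only the definitions above (mean ranks written as rank sums divided by group sizes, and $\sum Y = n(n+1)/2$):
$M_1 = R_1/n_1$, $M_0 = [n(n+1)/2-R_1]/n_0$,
$M_1-M_0 = \frac{n_0R_1+n_1R_1-n_1n(n+1)/2}{n_1n_0} = \frac{n[2R_1-n_1(n+1)]}{2n_1n_0}$,
$r_{rb} = \frac{2}{n}(M_1-M_0) = \frac{2R_1-n_1(n+1)}{n_1n_0}$.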
You can see that their denominators differ, so Glass's $r_{rb}$ is not a true Pearson/Spearman correlation. (The point-biserial correlation is a true Pearson correlation.)
I haven't read Glass's original paper or reviews of it, so I hesitate to say what the rationale behind this coefficient is or whether it has any advantage over the Pearson/Spearman correlation.
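A quick numerical check may make the difference concrete. The sketch below uses a small made-up data set without ties (the values of `x` and `y` and the helper names `to_ranks` and `glass_rank_biserial` are illustrative assumptions, not anything from Glass or Wikipedia) and computes Pearson $r$ on the ranks, Glass's $r_{rb}$ from the mean ranks, and both closed-form expressions.

```python
import numpy as np


def to_ranks(values):
    """Return ranks 1..n; valid only when the data contain no ties."""
    order = np.argsort(values)
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    return ranks


def glass_rank_biserial(x, y_ranks):
    """Glass's formula: 2 * (M1 - M0) / n, with M1, M0 the group mean ranks."""
    n = len(x)
    m1 = y_ranks[x == 1].mean()
    m0 = y_ranks[x == 0].mean()
    return float(2.0 * (m1 - m0) / n)


# Made-up illustrative data: x is the 1/0 dichotomy, y has no ties.
x = np.array([1, 1, 1, 0, 0, 0, 0, 1])
y = np.array([3.1, 4.7, 2.2, 1.5, 2.9, 0.8, 3.6, 5.0])

y_ranks = to_ranks(y)

# Pearson r between the dichotomy and the ranks (= Spearman's rho here).
r_pearson = float(np.corrcoef(x, y_ranks)[0, 1])
r_glass = glass_rank_biserial(x, y_ranks)

# Closed forms: same numerator, different denominators.
n = len(x)
n1 = int(x.sum())
n0 = n - n1
R1 = y_ranks[x == 1].sum()
numerator = 2 * R1 - n1 * (n + 1)
r_closed = float(numerator / np.sqrt(n1 * n0 * (n**2 - 1) / 3))
rrb_closed = float(numerator / (n1 * n0))

print(round(r_pearson, 4), round(r_closed, 4))  # 0.5455 0.5455
print(round(r_glass, 4), round(rrb_closed, 4))  # 0.625 0.625
```

With these numbers $R_1=23$ and $n_1=n_0=4$, so the shared numerator is 10, while the denominators are $\sqrt{336}\approx 18.33$ and 16, respectively.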