Typically, a rank $r$ (between $1$ and $n$ for $n$ data) is converted to a percent $p$ via the formula
$$p = 100\frac{r-\alpha}{n+1-2\alpha}$$
for some predetermined "plotting position" $\alpha$ between $0$ and $1$, inclusive. Solving for $r$ in terms of $p$ gives
$$r = (n+1-2\alpha) (p/100) + \alpha.$$
Excel has historically used $\alpha=1$ for its `PERCENTILE` and `QUARTILE` functions. The online documentation for `QUARTILE.INC` and `QUARTILE.EXC` is useless, so we have to reverse-engineer what these functions are doing.
For example, with data $(1,2,3,4,5,6,7,8,9,10)$, we have $n=10$ and $p \in \{25, 50, 75\}$ for the three quartiles. Using $\alpha=1$ in the preceding formula yields ranks of $9(0.25)+1 = 3.25$, $9(0.50)+1 = 5.5$, and $9(0.75)+1 = 7.75$, reproducing the results for `QUARTILE.INC`.
If instead we set $\alpha=0$, the corresponding ranks are $11(0.25) = 2.75$, $11(0.50) = 5.5$, and $11(0.75) = 8.25$, reproducing the results for `QUARTILE.EXC`.
Further testing on your part (I do not have a recent version of Excel) may establish the validity of my guess that these two versions of the quartile function are determined by these two (extreme) values of $\alpha$.
By the way, fractional ranks are converted into data values by means of linear interpolation. The process is explained and illustrated in my course notes at Percentiles and EDF Plots (look near the bottom of that page). There is also a link to an Excel spreadsheet illustrating the calculations.
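For readers without Excel at hand, the arithmetic above can be checked with a short Python sketch. It is only an illustration of the rank formula plus linear interpolation, not Excel's actual implementation: the function names are hypothetical, and clamping ranks that fall outside $1$ to $n$ is an assumption made here for simplicity.

```python
import math


def rank_from_percent(p, n, alpha):
    """Fractional rank for percent p (0 to 100), n data values, plotting position alpha."""
    return (n + 1 - 2 * alpha) * (p / 100) + alpha


def value_at_rank(sorted_data, r):
    """Linearly interpolate between the values at the two ranks surrounding r (1-based)."""
    n = len(sorted_data)
    r = min(max(r, 1), n)  # assumption: ranks outside 1..n map to the extremes
    lo = math.floor(r)
    if lo >= n:
        return sorted_data[-1]
    frac = r - lo
    return sorted_data[lo - 1] + frac * (sorted_data[lo] - sorted_data[lo - 1])


def percentile_alpha(data, p, alpha):
    """Percentile of data at percent p using plotting position alpha."""
    xs = sorted(data)
    return value_at_rank(xs, rank_from_percent(p, len(xs), alpha))


data = list(range(1, 11))
inc = [percentile_alpha(data, p, 1.0) for p in (25, 50, 75)]  # alpha = 1
exc = [percentile_alpha(data, p, 0.0) for p in (25, 50, 75)]  # alpha = 0
print(inc, exc)
```

Because the data here are $1$ through $10$, each interpolated value equals its fractional rank, so the two lists should match the ranks computed by hand for $\alpha=1$ and $\alpha=0$.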
If you would like to implement a general percentile function in Excel, here's a VBA macro to do it:
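```vba
'
' Converts a percent, computed using plotting position constant A,
' into a percent appropriate for the Excel Percentile() and
' Quartile() functions. (The default value of A for Excel is 1;
' most values in use are between 0 and 0.5.)
'
' P is a fraction between 0 and 1; N is the number of data values.
' Invalid arguments leave the return value at its default of 0.
'
Public Function PercentileA(P As Double, N As Integer, A As Double) As Double
    If N < 1 Or A < 0# Or A > 1# Or P < 0# Or P > 1# Then
        Exit Function
    End If
    If N < 2 Then
        ' With a single data value, any percent returns that value.
        PercentileA = 0.5
    Else
        ' Solve (N - 1) * q + 1 = (N + 1 - 2 * A) * P + A for q.
        PercentileA = ((N - 2 * A + 1) * P + A - 1) / (N - 1)
    End If
End Function
```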
It converts a nominal percent (such as 25/100) into the percent that would cause Excel's `PERCENTILE` function to return the desired value. It is intended for use in cell formulas, as in `=PERCENTILE(Data, PercentileA(0.25, Count(Data), 0.5))`.
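The conversion in the macro can be derived by equating two ranks, taking as given that Excel's `PERCENTILE` uses $\alpha=1$. Write $P$ for the nominal fraction, $q$ for the fraction passed to Excel, and $n$ for the number of data values (the symbol $q$ is introduced here only for this derivation). Excel turns $q$ into the rank $(n-1)q+1$, while the desired rank is $(n+1-2\alpha)P+\alpha$, so
$$(n-1)q + 1 = (n+1-2\alpha)P + \alpha \quad\Longrightarrow\quad q = \frac{(n+1-2\alpha)P + \alpha - 1}{n-1}.$$
As a check, with $n=10$, $\alpha=0$, and $P=0.25$ this gives $q = (11(0.25) - 1)/9 = 1.75/9$, and Excel's rank is then $9q + 1 = 2.75$, the value found earlier for $\alpha=0$.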
Best Answer
Excel supports matrix operations.
In this case, do the following:
1. Put the data points in an $n$ by $p$ array, where $p$ is the dimensionality of the space. Call this array `X`.
2. Put the cluster centers in an $m$ by $p$ array and call it `M`.
3. Put the weights into a $1$ by $p$ array and call it `W`.
4. Create a range for the $n$ by $m$ calculation. Bound it on the left with the sequence $1,2,\ldots, n$, going down the column. To be concrete, let's suppose this sequence is in cells `A2`, `A3`, etc. Bound it above with the sequence $1,2,\ldots, m$. To be concrete, let's suppose this is in cells `B1`, `C1`, etc. Thus the upper left corner of the results will be in cell `B2`.
.Select the top cell in the result array (
B2
). In the formula bar typeand press
Enter
. Drag this formula throughout the entire array, first to the right across all $m$ cells of the top row, and then after selecting the entire top row, down to include all $n$ rows. Judicious use of "\$" in the formula causes it to update appropriately when dragged. (This illustrates how to compute an outer product in Excel.)This formula does the following:
- `OFFSET(X, $A2-1, 0, 1)` uses the entries in the left column (column `A`) to index into the rows of array `X`.
- `OFFSET(M, B$1-1, 0, 1)` uses the entries in the top row (row `1`) to index into the rows of array `M`.
- `-` subtracts the designated row of `M` from the designated row of `X`, yielding a $1$ by $p$ array.
- `TRANSPOSE` converts that result to a $p$ by $1$ array.
- `MMULT` performs the matrix multiplication of the $1$ by $p$ array `W` by the $p$ by $1$ array computed in the preceding step, producing a $1$ by $1$ array: that is, a number (the distance).
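To see what the listed steps produce outside Excel, here is a minimal Python sketch of the same $n$ by $m$ calculation. It mirrors only the operations named above (row difference, then multiplication by the weights); the function name and the sample values are hypothetical. Note that, taken literally, these steps give a weighted sum of signed differences, so an entry can be negative; applying an absolute value or a square to the differences before weighting would be needed for a true distance, and that is an assumption beyond what the steps above describe.

```python
def weighted_differences(X, M, W):
    """Return an n-by-m table whose (i, j) entry is sum_k W[k] * (X[i][k] - M[j][k])."""
    return [
        [sum(w * (x - c) for w, x, c in zip(W, row, center)) for center in M]
        for row in X
    ]


# Hypothetical inputs: n = 3 points, m = 2 centers, p = 2 dimensions.
X = [[1.0, 2.0], [3.0, 5.0], [0.0, 0.0]]
M = [[0.0, 0.0], [1.0, 1.0]]
W = [2.0, 1.0]

D = weighted_differences(X, M, W)
for row in D:
    print(row)
```

Each row of `D` corresponds to one data point and each column to one cluster center, matching the layout of the result range that starts at `B2`.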