Typically, a rank $r$ (between $1$ and $n$ for $n$ data) is converted to a percent $p$ via the formula
$$p = 100\frac{r-\alpha}{n+1-2\alpha}$$
for some predetermined "plotting position" $\alpha$ between $0$ and $1$, inclusive. Solving for $r$ in terms of $p$ gives
$$r = (n+1-2\alpha) (p/100) + \alpha.$$
Excel has historically used $\alpha=1$ for its PERCENTILE and QUARTILE functions. The online documentation for QUARTILE.INC and QUARTILE.EXC is useless, so we have to reverse-engineer what these functions are doing.
For example, with data $(1,2,3,4,5,6,7,8,9,10)$, we have $n=10$ and $p \in \{25, 50, 75\}$ for the three quartiles. Using $\alpha=1$ in the preceding formula yields ranks of $9(0.25)+1 = 3.25$, $9(0.50)+1 = 5.5$, and $9(0.75)+1 = 7.75$, reproducing the results for QUARTILE.INC.
If instead we set $\alpha=0$, the corresponding ranks are $11(0.25) = 2.75$, $11(0.50) = 5.5$, and $11(0.75) = 8.25$, reproducing the results for QUARTILE.EXC.
Further testing on your part (I do not have a recent version of Excel) may establish the validity of my guess that these two versions of the quartile function are determined by these two (extreme) values of $\alpha$.
By the way, fractional ranks are converted into data values by means of linear interpolation between the two neighboring order statistics. The process is explained and illustrated in my course notes at Percentiles and EDF Plots (look near the bottom of that page). There is also a link to an Excel spreadsheet illustrating the calculations.
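The rank formula and the interpolation step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are made up for this example, and clamping ranks that fall outside $1..n$ is my own choice, not something Excel is known to do.

```python
def rank_from_percent(p, n, alpha):
    """Fractional rank r = (n + 1 - 2*alpha) * (p / 100) + alpha."""
    return (n + 1 - 2 * alpha) * (p / 100) + alpha


def value_at_rank(sorted_data, r):
    """Linearly interpolate between the order statistics around 1-based rank r."""
    n = len(sorted_data)
    r = min(max(r, 1), n)      # assumption: clamp ranks outside 1..n
    lo = int(r)                # floor, since r >= 1
    if lo >= n:
        return float(sorted_data[-1])
    frac = r - lo
    return sorted_data[lo - 1] + frac * (sorted_data[lo] - sorted_data[lo - 1])


def percentile(data, p, alpha):
    """Percentile p (0..100) of data for plotting position alpha."""
    xs = sorted(data)
    return value_at_rank(xs, rank_from_percent(p, len(xs), alpha))


data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
inc = [percentile(data, p, 1.0) for p in (25, 50, 75)]
exc = [percentile(data, p, 0.0) for p in (25, 50, 75)]
print(inc)  # [3.25, 5.5, 7.75]
print(exc)  # [2.75, 5.5, 8.25]
```

Because the data here equal their own ranks, the interpolated values coincide with the fractional ranks computed above.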
If you would like to implement a general percentile function in Excel, here's a VBA macro to do it:
    '
    ' Converts a percent, computed using plotting position constant A,
    ' into a percent appropriate for the Excel Percentile() and
    ' Quartile() functions.  (The default value of A for Excel is 1;
    ' most values in use are between 0 and 0.5.)
    '
    Public Function PercentileA(P As Double, N As Integer, A As Double) As Double
        If N < 1 Or A < 0# Or A > 1# Or P < 0# Or P > 1# Then
            Exit Function
        End If
        If N < 2 Then
            PercentileA = 0.5
        Else
            PercentileA = ((N - 2 * A + 1) * P + A - 1) / (N - 1)
        End If
    End Function
It converts a nominal percent (such as 25/100) into the percent that would cause Excel's PERCENTILE function to return the desired value. It is intended for use in cell formulas, as in =PERCENTILE(Data, PercentileA(0.25, Count(Data), 0.5)).
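To see why the conversion works, here is a hedged Python sketch of the same arithmetic. `percentile_a` mirrors the VBA function, except that it raises an error on invalid input instead of silently returning 0. `alpha_one_percentile` is a stand-in for the $\alpha=1$ rule described above, not Excel itself. Both names are made up for this illustration.

```python
def percentile_a(p, n, a):
    """Map fraction p (0..1) under plotting position a to the alpha = 1 fraction."""
    if n < 1 or not (0.0 <= a <= 1.0) or not (0.0 <= p <= 1.0):
        raise ValueError("need n >= 1, 0 <= a <= 1, 0 <= p <= 1")
    if n < 2:
        return 0.5
    return ((n - 2 * a + 1) * p + a - 1) / (n - 1)


def alpha_one_percentile(data, q):
    """alpha = 1 rule: rank = (n - 1) * q + 1, then linear interpolation."""
    xs = sorted(data)
    n = len(xs)
    r = min(max((n - 1) * q + 1, 1), n)   # assumption: clamp to 1..n
    lo = int(r)
    if lo >= n:
        return float(xs[-1])
    return xs[lo - 1] + (r - lo) * (xs[lo] - xs[lo - 1])


data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# First quartile with a = 0.5: the direct rank is 10 * 0.25 + 0.5 = 3.
q = percentile_a(0.25, len(data), 0.5)
print(round(alpha_one_percentile(data, q), 9))  # 3.0

# With a = 0 the converted call gives the alpha = 0 first quartile, 2.75.
q0 = percentile_a(0.25, len(data), 0.0)
print(round(alpha_one_percentile(data, q0), 9))  # 2.75
```

The results are rounded because dividing by $n-1$ and multiplying back can introduce floating-point error in the last digit.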
Note: In the following answer I assume that you know only the quantiles you mentioned and nothing else about the distribution; for instance, you do not know whether the distribution is symmetric or what its pdf or its central moments are.
It is not possible to calculate the 4th quartile if you have only the median and the IQR.
Let us look at the following definitions:
median = second quartile.
IQR = third quartile $-$ first quartile.
The 4th quartile is in neither of these two equations. Therefore, it is impossible to calculate it with the information given.
Here is one example:
    x <- c(1,2,3,4,5,6,7,8,9,10)
    y <- c(1,2,3,4,5,6,7,8,9,20)
    summary(x)
       Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
       1.00    3.25    5.50    5.50    7.75   10.00
    summary(y)
       Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
       1.00    3.25    5.50    6.50    7.75   20.00
The first quartile is 3.25 for both x and y. The median is also 5.5 for both. The third quartile is 7.75 for both, so the IQR is 7.75 $-$ 3.25 = 4.5 for both. However, the 4th quartile, which is also the maximum, differs: it is 10 for x and 20 for y.
You can also look at boxplots of x and y, and you will see that the first quartile, the second quartile (median), and the third quartile are equal for the two samples. These three values alone therefore tell you nothing about the rest of the distribution of the data points.
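    library(ggplot2)  # provides ggplot(), aes(), and geom_boxplot()
    df <- data.frame(x, y)
    # stack() reshapes df to long form with columns `values` and `ind`
    p <- ggplot(stack(df), aes(x = ind, y = values)) + geom_boxplot()
    p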
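The same counterexample can be checked without R. The following is a minimal Python sketch. It assumes Python 3.8 or later for `statistics.quantiles`, and `method="inclusive"` is chosen because it reproduces the quartiles shown in the `summary()` output for these data. The helper name `summarize` is made up for this illustration.

```python
from statistics import quantiles

x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [1, 2, 3, 4, 5, 6, 7, 8, 9, 20]


def summarize(data):
    """Return the median, the IQR, and the maximum of data."""
    q1, q2, q3 = quantiles(data, n=4, method="inclusive")
    return {"median": q2, "iqr": q3 - q1, "max": max(data)}


sx, sy = summarize(x), summarize(y)
print(sx)  # {'median': 5.5, 'iqr': 4.5, 'max': 10}
print(sy)  # {'median': 5.5, 'iqr': 4.5, 'max': 20}
```

The median and the IQR agree while the maxima differ, so no function of the median and IQR alone can return the maximum for both samples.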

Best Answer
The inverse of $F$ exists: you have to use the Napierian (natural) logarithm, i.e., $\ln(e^a) = \log_e(e^a) = a$.