After some consideration, I think "lower boundary" makes more sense than "lower limit".
For example, take this data:
Class Frequency
1 1
2 1
3 1
4 1
From the data we can see, without any calculation, that the median is 2.5. Using the formula mentioned above, $\frac{n}{2} = 2$, so the class that contains the median is class 2. Taking $L_m$ to be the lower boundary, 1.5, gives
$median = 1.5 + \left[ \frac{2 -1}{1}\right] \times 1 = 2.5$
Using the lower limit here does not make sense: with $L_m = 2$ the same formula gives 3, not 2.5. Now change the classes to
Class Frequency
1-2 1
3-4 1
5-6 1
7-8 1
Using the method above, we will get,
$median = 2.5 + \left[ \frac{2 -1}{1}\right] \times 2 = 4.5$
However, if we use the lower class limit (3) instead of the lower boundary, we get 5.
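The four results above can be checked with a short R sketch. The function name `grouped_median` and its arguments are my own (hypothetical), and I am assuming the formula in question has the usual form $L_m + \frac{n/2 - F}{f_m} \times c$, where $F$ is the cumulative frequency before the median class, $f_m$ is the frequency of the median class, and $c$ is the class width.

```r
# Hypothetical helper (not from the original post): grouped-data median,
# assuming the form  L_m + ((n/2 - F) / f_m) * width
grouped_median <- function(lower, freq, width) {
  n  <- sum(freq)
  cf <- cumsum(freq)
  m  <- which(cf >= n / 2)[1]               # index of the median class
  F_before <- if (m > 1) cf[m - 1] else 0   # cumulative frequency before it
  lower[m] + (n / 2 - F_before) / freq[m] * width
}

freq <- c(1, 1, 1, 1)

# Classes 1, 2, 3, 4 (width 1)
grouped_median(c(0.5, 1.5, 2.5, 3.5), freq, 1)  # lower boundaries: 2.5
grouped_median(c(1, 2, 3, 4), freq, 1)          # lower limits:     3

# Classes 1-2, 3-4, 5-6, 7-8 (width 2)
grouped_median(c(0.5, 2.5, 4.5, 6.5), freq, 2)  # lower boundaries: 4.5
grouped_median(c(1, 3, 5, 7), freq, 2)          # lower limits:     5
```

In both examples only the boundary version lands on the value one would expect from the data.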
Here is an outline of what I intend to do:
(1) 'Reconstruct' the original data by using R to spread the
observations in each interval at random within the interval.
Here is a density histogram (intervals of equal length) of one such
reconstruction.
# 32, 45, 15 and 8 observations spread uniformly over their intervals
x = c(runif(32,0,20), runif(45,20,50), runif(15,50,70), runif(8,70,100))
hist(x, prob=TRUE, col="wheat")   # density histogram
(2) Use a modern density estimator to 'smooth' this histogram,
and determine the location of the highest point of the density
estimator, which is a reasonable estimate of the mode of
the reconstructed data. For this reconstruction, the mode is 22.4.
hist(x, prob=TRUE, col="wheat")
lines(density(x), col="blue")             # overlay the density estimate
dxy = density(x); dx = dxy$x; dy = dxy$y  # (x,y) components of 'smooth'
dx[which.max(dy)]                         # x-value at which 'smooth' has its max
## 22.36885                               # estimated mode
(3) Of course, each random reconstruction of the data will be
somewhat different. Repeat steps (1) and (2) 2000 times and
keep track of the 2000 modes produced. The median of these
estimated modes was 23.6. Take this value to be a reasonable
estimator of the mode of the distribution from which the
original data were sampled.
However, these estimated modes
were quite variable (mainly because so much information
was lost in the original summary of the data into four
groups of unequal lengths). Below is a boxplot of the
2000 mode estimates. (Note: The histogram and density-estimator
curve in the figure above happen to be for the last of
the 2000 reconstructions of the data in my simulation.)
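For anyone who wants to repeat step (3), here is a minimal sketch of how the loop might look. The seed and the names `m` and `modes` are my own choices for illustration, and because the reconstructions are random, the median you get will not match the value quoted above exactly.

```r
set.seed(2024)   # arbitrary seed (an assumption), only for reproducibility
m <- 2000        # number of random reconstructions

# For each reconstruction, spread the counts uniformly within their
# intervals, smooth with density(), and record where the curve peaks.
modes <- replicate(m, {
  x <- c(runif(32, 0, 20), runif(45, 20, 50),
         runif(15, 50, 70), runif(8, 70, 100))
  d <- density(x)
  d$x[which.max(d$y)]
})

median(modes)                                     # estimate of the mode
boxplot(modes, horizontal = TRUE, col = "wheat")  # spread of the estimates
```

This uses the default bandwidth of `density`; a different bandwidth would change both the typical location and the variability of the estimated modes.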
I doubt that this is anything like the method you were
expected to use, but I believe this is a responsible
approach to solving the problem. (Certainly better than the
approaches I initially suggested in my Comment an hour ago.
Maybe I should delete the Comment now, but that seems like
cheating.)
Best Answer
The following is not a rigorous derivation (a derivation would require a lot of assumptions about what makes one estimator better than another), but is an attempt to "make sense" of the formula so that you can more easily remember and use it.
Consider a bar graph with a bar for each of the classes of data. Then $f_1$ is the height of the bar of the modal class, $f_0$ is the height of the bar on the left of it, and $f_2$ is the height of the bar on the right of it.
The quantity $f_1 - f_0$ measures how far the modal class's bar "sticks up" above the bar on its left. The quantity $f_1 - f_2$ measures how far the modal class's bar "sticks up" above the bar on its right.
Now, observe that $$ \frac{f_1 - f_0}{2f_1 - f_0 - f_2} + \frac{f_1 - f_2}{2f_1 - f_0 - f_2} = \frac{f_1 - f_0}{(f_1 - f_0) + (f_1 - f_2)} + \frac{f_1 - f_2}{(f_1 - f_0) + (f_1 - f_2)} = 1 $$ So if we want to divide an interval of width $h$ into two pieces, where the ratio of sizes of those two pieces is $(f_1 - f_0) : (f_1 - f_2)$, the first piece will have width $\frac{f_1 - f_0}{2f_1 - f_0 - f_2} h$.
This is what the formula for estimating the mode does. It splits the width of the modal bar into two pieces whose ratio of widths is $(f_1 - f_0) : (f_1 - f_2)$, and it says the mode is at the line separating those two pieces, that is, at a distance $\frac{f_1 - f_0}{2f_1 - f_0 - f_2} h$ from the left edge of that bar, $l$.
If $f_1 - f_0 = f_1 - f_2,$ that is, the modal bar is equally far above the bars on both its left and right, then this formula estimates the mode right in the middle of the modal class: $$ l + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} h = l + \frac12 h. $$ But if the height of the bar on the left is closer to the modal bar's height, then the estimated mode is to the left of the centerline of the modal class. In the extreme case where the bar on the left is exactly the height of the modal bar, and both are taller than the bar on the right, that is, when $f_1 - f_0 = 0$ but $f_1 - f_2 > 0$, the formula estimates the mode at $l$ exactly, that is, at the left edge of the modal bar. In the other extreme case, where the bar on the left is shorter but the bar on the right is the same height as the modal bar ($f_1 - f_0 > 0$ but $f_1 - f_2 = 0$), the formula estimates the mode at $l + h$, that is, at the right edge of the modal bar.
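These cases are easy to check numerically. Below is a small R sketch; the function name `grouped_mode` and the bar heights are made up for illustration, and it assumes classes of equal width $h$.

```r
# Hypothetical helper: mode estimate  l + (f1 - f0) / (2*f1 - f0 - f2) * h
# l = left edge of the modal bar, h = its width,
# f0, f1, f2 = heights of the left, modal, and right bars.
# The formula is undefined when f0 == f1 == f2 (zero denominator).
grouped_mode <- function(l, h, f0, f1, f2) {
  l + (f1 - f0) / (2 * f1 - f0 - f2) * h
}

grouped_mode(l = 20, h = 10, f0 = 5, f1 = 9, f2 = 5)  # equal gaps:  25   (l + h/2)
grouped_mode(l = 20, h = 10, f0 = 9, f1 = 9, f2 = 5)  # f1 - f0 = 0: 20   (left edge)
grouped_mode(l = 20, h = 10, f0 = 5, f1 = 9, f2 = 9)  # f1 - f2 = 0: 30   (right edge)
grouped_mode(l = 20, h = 10, f0 = 7, f1 = 9, f2 = 3)  # left closer: 22.5 (left of center)
```

In the last call the two gaps are 2 and 6, so the bar of width 10 is split in the ratio 2 : 6 and the estimate sits a quarter of the way in from the left edge.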