Solved – Parameter estimates for the triangular distribution

Tags: estimation, maximum likelihood, method of moments, optimization, triangular distribution

A question was posted here (now deleted) in relation to estimating the parameters of the triangular distribution, which has density

$$f(x;a,b,c)=\begin{cases} \quad 0 & \text{for } x < a, \\ \frac{2(x-a)}{(b-a)(c-a)} & \text{for } a \le x \le c, \\ \frac{2(b-x)}{(b-a)(b-c)} & \text{for } c < x \le b, \\\quad 0 & \text{for } b < x. \end{cases}$$

But the question is worth asking, so I am asking it myself.

What are good ways to estimate the parameters for this distribution?

Discussion of MLE is good but other estimators can make for fruitful answers.


Note 1: Many documents related to PERT seem to use $X_{(1)}$ and $X_{(n)}$ to estimate $a$ and $b$ and then (given that) use method of moments for $c$. If you advocate this approach in particular, some discussion of efficiency would be most helpful, but at the least some reason for the choice (or one similar to it) would be important.
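For concreteness, here is a minimal sketch of the approach described in Note 1, in Python with NumPy. It uses $E(X) = (a+b+c)/3$ to get $\hat c = 3\bar X - \hat a - \hat b$. The function name `fit_extremes_mom` is my own, and the final clipping step is an assumption I have added: the raw moment estimate of $c$ is not guaranteed to land between the sample extremes.

```python
import numpy as np

def fit_extremes_mom(x):
    """Estimate (a, b, c): sample extremes for a and b, method of moments for c.

    Hypothetical helper for illustration, not taken from any PERT reference.
    """
    x = np.asarray(x, dtype=float)
    a_hat = float(x.min())          # X_(1)
    b_hat = float(x.max())          # X_(n)
    # E(X) = (a + b + c) / 3  =>  c = 3 * mean - a - b
    c_hat = 3.0 * x.mean() - a_hat - b_hat
    # Assumption (my addition): force the mode estimate into [a_hat, b_hat],
    # since the raw value can fall outside that interval.
    c_hat = float(np.clip(c_hat, a_hat, b_hat))
    return a_hat, b_hat, c_hat

# Simulated example; the true parameters here are arbitrary choices.
rng = np.random.default_rng(0)
sample = rng.triangular(left=1.0, mode=2.0, right=5.0, size=2000)
print(fit_extremes_mom(sample))
```

Because $X_{(1)} > a$ and $X_{(n)} < b$ with probability one, the endpoint estimates from this sketch are always biased inward, which is part of why the efficiency question matters.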


Note 2:

[Perhaps this should be the start of an answer but I will place it here as guidance on answers relating to ML for the present.]

Note that for the MLE, setting derivatives of the log-likelihood to zero won't work.

For example, for known $a$ and $b$ (which wlog we can take as 0 and 1 by simple rescaling), see the discussion on the MLE for $c$ here: MLE for triangle distribution?
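To illustrate the known-endpoints case with $a=0$, $b=1$: the log-likelihood is continuous in $c$ but has kinks at the observations, and between two consecutive order statistics it has the form $\text{const} - k\log c - (n-k)\log(1-c)$, which is convex in $c$. So the maximum must occur at one of the observations rather than at a stationary point. The sketch below (Python with NumPy; the function names are my own) simply evaluates the log-likelihood at every observation and keeps the best one. It assumes all data lie strictly inside $(0,1)$.

```python
import numpy as np

def loglik_c(c, x):
    """Log-likelihood of the mode c for triangular data on [0, 1] (a=0, b=1)."""
    x = np.asarray(x, dtype=float)
    left = x <= c
    # Density is 2x/c on [0, c] and 2(1-x)/(1-c) on (c, 1].
    ll_left = np.log(2.0 * x[left] / c).sum()
    ll_right = np.log(2.0 * (1.0 - x[~left]) / (1.0 - c)).sum()
    return ll_left + ll_right

def mle_c_unit_interval(x):
    """Search the observations for the value of c that maximises the likelihood.

    Assumes known endpoints a=0, b=1 and data strictly inside (0, 1).
    """
    x = np.sort(np.asarray(x, dtype=float))
    ll = np.array([loglik_c(c, x) for c in x])
    return float(x[np.argmax(ll)])

# Simulated example; the true mode 0.3 is an arbitrary choice.
rng = np.random.default_rng(0)
sample = rng.triangular(left=0.0, mode=0.3, right=1.0, size=1000)
print(mle_c_unit_interval(sample))
```

This brute-force search costs $O(n^2)$ operations, which is fine for illustration but could be tightened by updating the two sums incrementally along the sorted sample.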

Additionally, in general the ML estimates for the endpoints $a$ and $b$ are not the extreme order statistics. See, for example, Kotz and van Dorp (1).

(1) Kotz, Samuel, and Johan René van Dorp (2004), "The Triangular Distribution" (Chapter 1), in Beyond Beta: Other Continuous Families of Distributions with Bounded Support and Applications, World Scientific, NJ.

Best Answer

Using the extreme-order statistics as estimators for the boundaries $a,b$ and then using

$$E(X) = \frac {a+b+c}{3}$$

to estimate $c$ by method of moments is so... maddeningly easy,

$$\hat a = X_{(1)},\;\; \hat b = X_{(n)},\;\;\hat c = 3\bar X - \hat a - \hat b$$

that it made me think about how I could start by estimating $c$ first, just for the twist of it. Here it is, though not yet with any properties of the estimator. I will make this community wiki in case anyone is interested in working on it further.

1) Obtain the empirical quartiles $\hat q_1, \hat q_3$ and form the Interquartile Range $\text{IQR} = \hat q_3 - \hat q_1$

2) Use the Freedman-Diaconis rule to bin the data.

$$\text{Bin size}=2\, { \text{IQR} \over{ {n^{1/3}} }}$$

3) Form the empirical histogram and estimate $\hat c$ as the mid-point of the bin with the highest empirical frequency.

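4) Then solve the following system of equations for $a,b$:

$$q_1 = a + \frac {\sqrt{(c-a)(b-a)}}{2}$$

$$q_3 = b - \frac {\sqrt{(b-c)(b-a)}}{2}$$

using the estimated $\hat q_1, \hat q_3, \hat c$ (I took the inverse CDF expressions from the book chapter the OP links to, page 8). These two expressions apply when $q_1 \le c \le q_3$; the upper quartile sits below $b$, hence the minus sign in the second equation.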
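The four steps can be sketched as follows, in Python with NumPy. This is only an illustration under stated assumptions, not a studied estimator. It takes the quartile equations as $q_1 = a + \tfrac12\sqrt{(c-a)(b-a)}$ and $q_3 = b - \tfrac12\sqrt{(b-c)(b-a)}$ (the upper quartile must lie below $b$), and it requires $\hat q_1 \le \hat c \le \hat q_3$. The function names and the bisection on the width $w = b - a$ are my own choices: writing $u = q_1 - a$ and $v = b - q_3$, each equation becomes a quadratic in $u$ or $v$ for a given $w$, and $w$ is then pinned down by $w = (q_3 - q_1) + u + v$.

```python
import numpy as np

def solve_endpoints(q1, q3, c, tol=1e-12, max_iter=200):
    """Solve the two quartile equations for (a, b), given q1 <= c <= q3."""
    d1, d3 = c - q1, q3 - c
    if d1 < 0 or d3 < 0 or d1 + d3 <= 0:
        raise ValueError("need q1 <= c <= q3 with q1 < q3")

    def gap(w):
        # Positive roots of 4u^2 = w(d1 + u) and 4v^2 = w(d3 + v).
        u = (w + np.sqrt(w * w + 16.0 * w * d1)) / 8.0
        v = (w + np.sqrt(w * w + 16.0 * w * d3)) / 8.0
        return d1 + d3 + u + v - w, u, v

    # gap > 0 at w = d1 + d3 and gap <= 0 at w = 4 * (d1 + d3).
    lo, hi = d1 + d3, 4.0 * (d1 + d3)
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        g, _, _ = gap(mid)
        if g > 0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol * hi:
            break
    _, u, v = gap(0.5 * (lo + hi))
    return q1 - u, q3 + v

def fit_mode_first(x):
    """Steps 1-4: quartiles, Freedman-Diaconis bins, modal bin, then (a, b)."""
    x = np.asarray(x, dtype=float)
    q1, q3 = np.percentile(x, [25, 75])
    if q3 <= q1:
        raise ValueError("interquartile range must be positive")
    width = 2.0 * (q3 - q1) / x.size ** (1.0 / 3.0)
    # Equal-width bins over the data range, so the realised bin width
    # is at most the Freedman-Diaconis width.
    n_bins = max(1, int(np.ceil((x.max() - x.min()) / width)))
    counts, edges = np.histogram(x, bins=n_bins)
    k = int(np.argmax(counts))
    c_hat = 0.5 * (edges[k] + edges[k + 1])  # midpoint of the modal bin
    a_hat, b_hat = solve_endpoints(q1, q3, c_hat)
    return a_hat, b_hat, c_hat

# Simulated example; the true parameters are arbitrary choices.
rng = np.random.default_rng(0)
sample = rng.triangular(left=0.0, mode=4.5, right=10.0, size=20000)
print(fit_mode_first(sample))
```

Nothing in this construction forces $\hat a \le X_{(1)}$ or $\hat b \ge X_{(n)}$, so the fitted density can assign zero likelihood to some observations. If the modal bin falls outside the interquartile range, the sketch raises an error rather than switching to the other branch of the inverse CDF.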