Hi Duarte,
It is true that the log-likelihood surfaces of GJR and EGARCH models are not globally smooth functions like y = x^2.
The situation is somewhat similar to y = abs(sin(x)) * (x<=2).
Is this function differentiable? Not globally. At x=0 it has a kink and at x=2 it has a jump, so the derivative is not well defined at those points. However, it is differentiable almost everywhere, including at the maximum at x=pi/2. As long as the starting value is sufficiently close to that point, a gradient-based optimizer should converge to it, because the function is concave in a neighborhood around the maximum.
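To make the toy example concrete, here is a minimal Python sketch. It uses plain gradient ascent with a finite-difference gradient as a stand-in for a gradient-based optimizer; it is not the 'SQP' algorithm or any toolbox routine, and the function names, step size, and tolerances are my own illustrative choices.

```python
import math

def f(x):
    # Toy function from the text: y = abs(sin(x)) * (x <= 2)
    return abs(math.sin(x)) if x <= 2 else 0.0

def numerical_gradient(x, h=1e-5):
    # Central difference; only meaningful away from the kink (x=0) and the jump (x=2)
    return (f(x + h) - f(x - h)) / (2 * h)

def gradient_ascent(x0, step=0.1, tol=1e-8, max_iter=5000):
    # Hypothetical helper: climb along the gradient until it is numerically zero
    x = x0
    for _ in range(max_iter):
        g = numerical_gradient(x)
        if abs(g) < tol:
            break
        x += step * g
    return x

x_near = gradient_ascent(1.2)  # start inside the concave region around pi/2
x_far = gradient_ascent(3.0)   # start on the flat part, where the gradient is zero

print(x_near, math.pi / 2)
print(x_far)
```

Starting from 1.2, the iteration settles at pi/2. Starting from 3.0, the gradient is already zero, so this simple optimizer never moves, which is why the starting value matters.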
Actually, in most econometric models estimated by MLE, the likelihood function is complicated and there is no guarantee of convergence. My suggestion is to try many starting values and fine-tune the optimization options, so as to increase the chance of getting a good estimate.
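The "many starting values" idea can be sketched as a simple multi-start loop. This is a Python illustration on the same toy function, not a GARCH likelihood; the local optimizer is a basic gradient ascent of my own, and every name and default (`multi_start`, `n_starts`, the sampling range, the seed) is a hypothetical choice for the example.

```python
import math
import random

def objective(x):
    # Toy stand-in for a log-likelihood: abs(sin(x)) * (x <= 2)
    return abs(math.sin(x)) if x <= 2 else 0.0

def local_ascent(x0, step=0.1, h=1e-5, tol=1e-8, max_iter=5000):
    # Basic gradient ascent with a central-difference gradient
    x = x0
    for _ in range(max_iter):
        g = (objective(x + h) - objective(x - h)) / (2 * h)
        if abs(g) < tol:
            break
        x += step * g
    return x

def multi_start(n_starts=20, low=-1.0, high=4.0, seed=0):
    # Run the local optimizer from random starting values and keep the best result
    rng = random.Random(seed)
    best_x, best_val = None, -math.inf
    for _ in range(n_starts):
        x = local_ascent(rng.uniform(low, high))
        val = objective(x)
        if val > best_val:
            best_x, best_val = x, val
    return best_x, best_val

best_x, best_val = multi_start()
print(best_x, best_val)
```

A single run started at x=3 stalls at an objective value of 0, while the multi-start loop reaches a maximum with value 1. With a real likelihood, you would compare the fitted log-likelihoods across runs in the same way.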
In most cases, our GJR-GARCH and EGARCH functionality works well and is useful for volatility forecasting.
As for the default choice of algorithm, 'SQP', it was chosen because it offers a nice blend of accuracy and runtime performance. There have been instances in which other algorithms, such as 'Interior-Point', give better results, but in the vast majority of cases the various algorithms give very similar answers, provided the chosen model is a good description of the data-generating process.