According to the documentation, this is how each method fits the model:

`CSS` minimises the sum of squared residuals.

`ML` maximises the log-likelihood function of the ARIMA model.

`CSS-ML` mixes both methods: first, `CSS` is run, with the starting parameters for the optimization algorithm set to zeros or to the values given in the optional argument `init`; then, `ML` is applied, passing the `CSS` parameter estimates as starting parameter values for the optimization algorithm.
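As a rough illustration of the three options, the sketch below fits the same model with each method. The built-in `lh` series and the AR(1) order are stand-ins chosen only for this example; they are not the data or the model from the question.

```r
# Illustrative only: `lh` (a built-in series) and order c(1, 0, 0) are assumptions.
x <- lh

fit_css   <- arima(x, order = c(1, 0, 0), method = "CSS")     # conditional sum of squares only
fit_ml    <- arima(x, order = c(1, 0, 0), method = "ML")      # maximum likelihood only
fit_cssml <- arima(x, order = c(1, 0, 0), method = "CSS-ML")  # CSS first, then ML (the default)

# Side-by-side coefficient estimates
print(rbind(CSS = coef(fit_css), ML = coef(fit_ml), `CSS-ML` = coef(fit_cssml)))
```

On a well-behaved series, `ML` and `CSS-ML` should end up at essentially the same estimates, since both maximise the same likelihood and differ only in where the optimiser starts.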

In a model with lags of the dependent variable, an initial set of observations must somehow be defined in order to evaluate the sum of squares or the likelihood function. `CSS` and `ML` deal with this issue differently. `CSS` sets the initial observations to zeros, while `ML` uses the initial state vector returned by the Kalman filter. `ML` is supposed to be more accurate since, instead of setting these initial observations to zeros, it uses the sample data to estimate these values.
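To make the contrast a little more concrete, here is a minimal sketch for a zero-mean AR(1), written by hand rather than taken from the internals of `arima`. It treats the first observation as fixed (a simplification assumed for this example), whereas the `ML` fit evaluates the full likelihood through the Kalman filter. The demeaned `lh` series is a stand-in.

```r
# Assumption: demeaned built-in series `lh`, modelled as a zero-mean AR(1).
x <- as.numeric(lh) - mean(lh)
n <- length(x)

# Conditional sum of squares: residuals are formed only for t = 2..n,
# so the first observation is treated as fixed.
css <- function(phi) sum((x[-1] - phi * x[-n])^2)
phi_css <- optimize(css, interval = c(-0.99, 0.99))$minimum

# Maximum likelihood: the start of the series enters through the
# initial state of the Kalman filter.
phi_ml <- coef(arima(x, order = c(1, 0, 0), include.mean = FALSE, method = "ML"))[["ar1"]]

print(c(css = phi_css, ml = phi_ml))
```

The two estimates are usually close on a series like this; the gap tends to matter more for short series or for parameters near the non-stationary region.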

For some details about the initialisation of the Kalman filter you may see
this post or
this post.

Note that the starting parameter values are one thing and the set of initial observations is another. The former can be specified through the argument `init`, while the latter are set internally by each method as described above. Keeping this distinction in mind avoids confusion when reading the documentation.
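A small sketch of the first of these, the starting parameter values: `init` takes one value per coefficient, in the order reported by `coef()`. The `lh` series and the AR(1) order are again assumptions made for illustration.

```r
x <- lh   # built-in series, used only as a stand-in

# Step 1: obtain CSS estimates to use as starting values.
fit_css <- arima(x, order = c(1, 0, 0), method = "CSS")
start   <- coef(fit_css)   # ar1, intercept

# Step 2: hand them to the ML fit through `init`.
fit_ml_from_css <- arima(x, order = c(1, 0, 0), method = "ML", init = start)

# For comparison: ML from the default starting values.
fit_ml_default <- arima(x, order = c(1, 0, 0), method = "ML")

print(rbind(from_css = coef(fit_ml_from_css), default = coef(fit_ml_default)))
```

Only the optimiser's starting point changes here; how each method handles the initial observations is untouched by `init`.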

In your code, the following error is obtained for the ARIMA(10,1,2) model:

```
Error in optim(init[mask], armafn, method = optim.method, hessian = TRUE, :
non-finite finite-difference value [5]
```

Roughly, this means that the optimization algorithm failed to reach a result.
The error arises during maximum-likelihood estimation (more precisely, in the `ML` step of the `CSS-ML` method). A closer look shows that the initial parameter values obtained from `CSS` are apparently not good enough. The following gives an error:

```
arima(tsTrain, order=c(10,1,2), method="CSS-ML")
```

while with zeros (the default) as starting parameters, the optimization algorithm converges to a solution:

```
arima(tsTrain, order=c(10,1,2), method="ML")
```
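If the `CSS-ML` fit fails in this way, one pragmatic pattern is to catch the error and retry with plain `ML`. The helper below, `fit_with_fallback`, is a hypothetical name introduced for this sketch, and the built-in `lh` series with an AR(1) order stands in for `tsTrain` and the ARIMA(10,1,2) model.

```r
# Hypothetical helper: try CSS-ML first, fall back to ML if the fit throws an error.
fit_with_fallback <- function(x, order) {
  tryCatch(
    arima(x, order = order, method = "CSS-ML"),
    error = function(e) {
      message("CSS-ML failed (", conditionMessage(e), "); retrying with method = \"ML\"")
      arima(x, order = order, method = "ML")
    }
  )
}

# Stand-in data and order; replace with tsTrain and c(10, 1, 2) to mirror the question.
fit <- fit_with_fallback(lh, order = c(1, 0, 0))
print(fit)
```

A fallback like this hides the symptom rather than the cause, so it is still worth checking whether the chosen order is sensible for the data.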

At first glance, this may seem strange, since the `CSS` step is supposed to provide starting parameters relatively close to those that maximise the likelihood function. A look at the autocorrelations of the data (`acf(tsTrain); pacf(tsTrain)`) suggests that the ARIMA(10,1,2) model is not plausible for the data; in fact, it is far from the model chosen by `forecast::auto.arima`, ARIMA(1,0,0)(2,0,0). This may explain why the estimates from `CSS` were not reliable as input for `ML`.
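One way to run this kind of check is to see which sample partial autocorrelations fall outside the approximate 95% band before settling on an order. The sketch uses the built-in `lh` series as a stand-in for `tsTrain`; the `forecast` call is skipped if that package is not installed.

```r
x <- lh   # stand-in for tsTrain

# Sample (partial) autocorrelations, computed without plotting
a <- acf(x, plot = FALSE)
p <- pacf(x, plot = FALSE)

# Approximate 95% band under white noise: +/- 1.96 / sqrt(n)
band <- 1.96 / sqrt(length(x))
sig_pacf_lags <- which(abs(as.numeric(p$acf)) > band)
print(sig_pacf_lags)   # lags whose partial autocorrelation lies outside the band

# Optional: automatic order selection, only if `forecast` is available
if (requireNamespace("forecast", quietly = TRUE)) {
  print(forecast::auto.arima(x))
}
```

If only one or two low lags stand out, a high-order model such as ARIMA(10,1,2) is likely to be heavily over-parameterised.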

## Best Answer

You can start by reviewing a very basic (and largely presumptive) tutorial here: https://onlinecourses.science.psu.edu/stat510/node/75/. See also "How to include control variables in an Intervention analysis with ARIMA?", where I contributed to the discussion. The issue is to develop the relationship in a robust manner, so that anomalies do not distort it.