Solved – Taylor’s expansion on log likelihood

likelihood, mathematical-statistics

As far as I know, Taylor's expansion works for fixed functions. I was wondering why it is justified to use it on the log-likelihood. Even if we treat it as a function of $\theta$ alone, doesn't it have components that change as $n$ increases (like $\sum X_i$, for example)? Is it really always OK to write something like
\begin{align*}
\ell\left(\theta\right) & = \ell\left(\widehat{\theta}\right)+\frac{\partial\ell\left(\theta\right)}{\partial\theta}\Bigr|_{\theta=\widehat{\theta}}\left(\theta-\widehat{\theta}\right)+ o(|\widehat{\theta} - \theta|)
\end{align*}
Please help me understand why and when we can do something like this. Thanks in advance!

Best Answer

If one includes the notational dependency on $n$: $$ \begin{align*} \ell_n\left(\theta\right) & = \ell_n\left(\widehat{\theta}_n\right)+\frac{\partial\ell_n\left(\theta\right)}{\partial\theta}\Bigr|_{\theta=\widehat{\theta}_n}\left(\theta-\widehat{\theta}_n\right)+ O_n(|\widehat{\theta}_n - \theta|^2) \end{align*} $$ we see that the puzzling point is the $n$-dependency of the remainder term $O_n$.
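To see exactly where the $n$-dependency enters, one can write the first-order expansion with an explicit Lagrange remainder (standard calculus, assuming $\ell_n$ is twice continuously differentiable): $$ \ell_n\left(\theta\right) = \ell_n\left(\widehat{\theta}_n\right) + \frac{\partial\ell_n\left(\theta\right)}{\partial\theta}\Bigr|_{\theta=\widehat{\theta}_n}\left(\theta-\widehat{\theta}_n\right) + \frac{1}{2}\,\ell_n''(\xi_n)\left(\theta-\widehat{\theta}_n\right)^2, \qquad \xi_n \text{ between } \theta \text{ and } \widehat{\theta}_n. $$ The remainder constant $\ell_n''(\xi_n)$ is a random quantity that depends on $n$; without further control of it, the expansion only yields an $n$-dependent bound.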

A rigorous way to get an expansion with an $n$-independent remainder: $$ \begin{align*} \ell_n\left(\theta\right) & = \ell_n\left(\widehat{\theta}_n\right)+\frac{\partial\ell_n\left(\theta\right)}{\partial\theta}\Bigr|_{\theta=\widehat{\theta}_n}\left(\theta-\widehat{\theta}_n\right)+ O(|\widehat{\theta}_n - \theta|^2) \end{align*} $$ is the Taylor–Lagrange inequality: if you can bound $|\ell_n''| \leq M$ uniformly in $n$ (on an appropriate interval), then the remainder is at most $\frac{M}{2}|\widehat{\theta}_n - \theta|^2$, uniformly in $n$.
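As a concrete numerical illustration of the Taylor–Lagrange bound (my own hypothetical example, not part of the answer): take a Bernoulli model with $n = 50$ trials and $s = 20$ successes. On the interval $[0.2, 0.8]$ the second derivative satisfies $|\ell_n''(\theta)| = s/\theta^2 + (n-s)/(1-\theta)^2 \leq n/0.2^2$, so the first-order remainder is at most $\frac{M}{2}|\theta - \widehat{\theta}_n|^2$ with $M = n/0.2^2$:

```python
import math

def loglik(theta, s, n):
    # Bernoulli log-likelihood: s successes out of n trials
    return s * math.log(theta) + (n - s) * math.log(1 - theta)

def score(theta, s, n):
    # First derivative of the log-likelihood in theta
    return s / theta - (n - s) / (1 - theta)

n, s = 50, 20        # hypothetical sample: 20 successes in 50 trials
theta_hat = s / n    # MLE of the Bernoulli parameter
# On [0.2, 0.8]: |l''(theta)| = s/theta^2 + (n-s)/(1-theta)^2 <= n / 0.2^2
M = n / 0.2**2

for theta in [0.25, 0.3, 0.5, 0.7]:
    # First-order Taylor remainder around the MLE
    remainder = abs(loglik(theta, s, n) - loglik(theta_hat, s, n)
                    - score(theta_hat, s, n) * (theta - theta_hat))
    # Taylor-Lagrange bound: remainder <= (M/2) * |theta - theta_hat|^2
    assert remainder <= 0.5 * M * (theta - theta_hat) ** 2
```

Note that $M$ here grows linearly in $n$, so this bound is not yet uniform in $n$; the uniform bound of the answer corresponds to controlling the averaged log-likelihood $\ell_n/n$, whose second derivative on $[0.2, 0.8]$ is bounded by $1/0.2^2$ independently of $n$.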