Solved – Log-likelihood of multivariate Poisson distribution

Tags: multivariate-analysis, poisson-distribution

I'm having difficulty getting the gradient of the log-likelihood of a multivariate Poisson distribution. Here's how I have it set up:

  1. A finite set of finite-dimensional vectors $T$ with elements $\mathbf{t}$
  2. $d$ functions $\left\{f_1,f_2,\dotsc,f_d\right\}$ with compact support.
  3. The parameter of the multivariate Poisson is given by $\lambda_{\mathbf{t}}\left(\boldsymbol{\theta}\right) = \sum_{k=1}^{d}\theta_k f_k\left(\mathbf{t}\right)$.
  4. A sample from this distribution looks like this: $y_\mathbf{t}\sim\textrm{ Poisson}\left(\lambda_{\mathbf{t}}\left(\boldsymbol{\theta}\right)\right)$.
  5. Multivariate Poisson likelihood function: $$L\left(\boldsymbol\theta\right)=\prod_{\mathbf{t}\in T}\frac{\exp\left(-\lambda_{\mathbf{t}}\left(\boldsymbol{\theta}\right)\right)\left(\lambda_{\mathbf{t}}\left(\boldsymbol{\theta}\right)\right)^{y_\mathbf{t}}}{y_\mathbf{t}!}$$

Here's where I am:
\begin{align*}
l\left(\boldsymbol\theta\right)&=\sum_{\mathbf{t}\in T}\log\frac{\exp\left(-\lambda_{\mathbf{t}}\left(\boldsymbol{\theta}\right)\right)\left(\lambda_{\mathbf{t}}\left(\boldsymbol\theta\right)\right)^{y_{\mathbf{t}}}}{y_{\mathbf{t}}!}\\
&\ldots\textrm{ a little bit of algebra later }\ldots\\
&=\sum_{\mathbf{t}\in T}\left(-\lambda_\mathbf{t}\left(\boldsymbol\theta\right) + y_\mathbf{t}\log\left(\lambda_\mathbf{t}\left(\boldsymbol\theta\right)\right)-\log\left(y_\mathbf{t}!\right)\right)
\end{align*}
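As a sanity check on the algebra (not part of the derivation itself), the log-likelihood above can be evaluated numerically. Here is a minimal sketch in Python/NumPy with made-up data, where the matrix `F` (with `F[t, k]` standing in for $f_k(\mathbf{t})$) and `theta` are illustrative assumptions, not part of the original problem:

```python
import numpy as np
from math import lgamma

rng = np.random.default_rng(0)

# Hypothetical setup for illustration: |T| = 50 points, d = 3 functions.
# F[t, k] plays the role of f_k(t).
F = rng.uniform(0.1, 1.0, size=(50, 3))
theta = np.array([1.0, 2.0, 0.5])

lam = F @ theta          # lambda_t(theta) for every t in T
y = rng.poisson(lam)     # y_t ~ Poisson(lambda_t(theta))

# l(theta) = sum_t ( -lambda_t + y_t log(lambda_t) - log(y_t!) )
def log_lik(th):
    lam = F @ th
    log_fact = np.array([lgamma(v + 1.0) for v in y])  # log(y_t!)
    return np.sum(-lam + y * np.log(lam) - log_fact)
```

Comparing `log_lik(theta)` against a library Poisson log-pmf summed over $T$ is a quick way to confirm the simplification is correct.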

What is the next step in terms of the derivatives? I'm not sure how to take derivatives with respect to $\boldsymbol\theta$ (i.e., what type of object is $\frac{\mathrm{d}}{\mathrm{d}\,\boldsymbol\theta}\left(-\lambda_\mathbf{t}\left(\boldsymbol\theta\right)\right)$: a matrix, a vector, etc.?). I would appreciate it if answers gave away as little of the solution as possible; I'd like to finish the derivation myself and just need a push in the right direction. Much appreciated!

Best Answer

Well, the $\log( y_{\bf t}!)$ terms don't involve ${\boldsymbol \theta}$, so forget about them. The gradient here is just the vector of univariate partial derivatives. By linearity, its elements are

$$ \frac{ \partial \ell( {\boldsymbol \theta} )}{ \partial \theta_{i}} = \sum_{ {\bf t} \in T } \left( -\frac{ \partial \lambda_{{\bf t}}({\boldsymbol \theta})}{ \partial \theta_{i}} + y_{{\bf t}} \cdot \frac{ \partial \log (\lambda_{{\bf t}}({\boldsymbol \theta})) }{ \partial \theta_{i}} \right) $$

Given your expression for $\lambda_{{\bf t}}({\boldsymbol \theta})$,

$$\frac{ \partial \lambda_{{\bf t}}({\boldsymbol \theta})}{ \partial \theta_{i}} = f_{i}( {\bf t}), $$

since you're just differentiating a linear function of $\theta_{i}$. From the chain rule in basic single-variable calculus we know that

$$ \frac{ \mathrm{d} \log(f(x)) }{\mathrm{d} x} = \frac{1}{f(x)} \cdot \frac{ \mathrm{d} f(x) }{\mathrm{d} x}$$

So,

$$ \frac{\partial \log (\lambda_{{\bf t}}({\boldsymbol \theta})) }{ \partial \theta_{i}} = \frac{ f_{i}( {\bf t}) }{\sum_{k=1}^{d}\theta_k f_k\left(\mathbf{t}\right)} $$

Plug these parts back into the first equation above to get the score function.
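For completeness, once the pieces are assembled, the resulting score can be cross-checked against a finite-difference approximation of the log-likelihood. A sketch with synthetic `F` and `y` (all names illustrative; the $\log(y_\mathbf{t}!)$ term is dropped since it is constant in $\boldsymbol\theta$):

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-ins: F[t, k] = f_k(t), y[t] = y_t.
F = rng.uniform(0.1, 1.0, size=(40, 3))
theta = np.array([0.8, 1.5, 0.3])
y = rng.poisson(F @ theta)

def log_lik(th):
    lam = F @ th                           # lambda_t(theta)
    return np.sum(-lam + y * np.log(lam))  # log(y_t!) omitted: constant in theta

def score(th):
    lam = F @ th
    # Component i: sum_t ( -f_i(t) + y_t * f_i(t) / lambda_t )
    return F.T @ (y / lam - 1.0)

# Central-difference check of every component of the gradient
eps = 1e-6
num = np.array([
    (log_lik(theta + eps * e) - log_lik(theta - eps * e)) / (2 * eps)
    for e in np.eye(3)
])
assert np.allclose(num, score(theta), atol=1e-4)
```

If the analytic score and the numerical gradient disagree, a sign or grouping error has crept into the derivation.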