Note that your second expression is just a special case of the first expression, with $n=1$. Hence it suffices to analyse your first assertion for general $n\geq 1$ and then see what happens in the case $n=1$.
If you just look at a single observation (i.e. $X_1$) instead of all observations (i.e. $X_1,\dots,X_n$), you are obviously discarding a lot of information which could be needed to estimate something unknown more precisely.
Suppose you are given an i.i.d. sample $X_1,\dots,X_n$, $n\geq 1$, all drawn from a $Poisson(\lambda)$ distribution with $\lambda$ unknown. Their joint density is:
\begin{align*}
P(X_1=k_1, \dots, X_n=k_n)& = P(X_1=k_1) \cdot P(X_2=k_2) \cdot \ldots \cdot P(X_n=k_n)\\
& =\prod_{i=1}^n\frac{\lambda^{k_i}}{{k_i}!} \exp(-\lambda)
\end{align*}
which depends on the unknown parameter $\lambda$.
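As a quick illustration that the product of the marginal densities really is the closed-form expression above, here is a tiny Python sketch (`scipy` and the particular values of $\lambda$ and the $k_i$ are assumptions made for the example):

```python
import numpy as np
from math import factorial
from scipy.stats import poisson

lam = 2.5                   # illustrative value of lambda
k = np.array([1, 3, 0, 2])  # illustrative observed counts k_1, ..., k_n

# Joint pmf as a product of the marginal pmfs (using independence) ...
joint = poisson.pmf(k, lam).prod()

# ... and via the closed-form product above.
closed_form = np.prod([lam**ki / factorial(ki) * np.exp(-lam) for ki in k])

print(joint, closed_form)  # identical values
```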
The idea of maximum likelihood is to view the joint density as a function of the unknown parameter $\lambda$ and to maximize it over all possible values of $\lambda$.
To better understand why we should use the joint density rather than the "marginal" density of a single observation, we have to take a look at the result.
It is well known that the maximum likelihood estimator in the current case is
$\widehat{\lambda}_n = \frac{\sum_{i=1}^nX_i}{n}.$
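As a quick numerical sanity check, here is a minimal sketch (Python with `numpy`/`scipy`; the sample values are made up for illustration) that maximizes the Poisson log-likelihood directly and recovers exactly the sample mean:

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import poisson

# Illustrative sample, playing the role of X_1, ..., X_n.
x = np.array([3, 1, 4, 2, 2, 5, 3, 0, 2, 4])

# Joint log-density of the sample, viewed as a function of lambda.
def log_likelihood(lam):
    return poisson.logpmf(x, lam).sum()

# Maximize the log-likelihood (i.e. minimize its negative) over lambda > 0.
result = minimize_scalar(lambda lam: -log_likelihood(lam),
                         bounds=(1e-9, 20), method="bounded")

print(result.x)   # numerical maximizer, approx. 2.6
print(x.mean())   # sample mean: 2.6
```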
But note that, since the $X_1,\dots,X_n$ are i.i.d., we have:
$$E(\widehat{\lambda}_n) = \lambda$$ as well as $$Var(\widehat{\lambda}_n) = \frac{\lambda}{n}.$$
From this it is clear that $\widehat{\lambda}_n$ is an unbiased estimator of $\lambda$ for every $n$ (since $E(\widehat{\lambda}_n)=\lambda$ does not depend on $n$), but the variance of the estimator decreases with the sample size.
Hence using all $n$ observations from the sample, and not only a single one (i.e. $n=1$), leads to a "better/more precise" estimator! (This tells you: don't maximize your second assertion, since you can do better by maximizing your first assertion!)
It turns out that throwing away information (looking at $n=1$ instead of $n>1$) is not a good idea. This is very often the case in statistics.
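If you want to see this effect concretely, here is a small Monte Carlo sketch (Python with `numpy`; the true value $\lambda = 3$ and the sample sizes are arbitrary choices for illustration) that checks both $E(\widehat{\lambda}_n) = \lambda$ and $Var(\widehat{\lambda}_n) = \lambda/n$:

```python
import numpy as np

rng = np.random.default_rng(0)
lam = 3.0  # true parameter, chosen for illustration

# Monte Carlo check: for each n, draw many samples of size n,
# compute the MLE (sample mean) of each, and look at its mean/variance.
for n in (1, 10, 100):
    estimates = rng.poisson(lam, size=(100_000, n)).mean(axis=1)
    print(n, estimates.mean(), estimates.var(), lam / n)
```

The empirical mean of $\widehat{\lambda}_n$ stays near $3$ for every $n$ (unbiasedness), while its empirical variance shrinks like $3/n$.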
Use derivatives. As a function of $\lambda$, you want to find a maximum of the function
$$l(\lambda) = \sum_{j=1}^N\bigg[-\lambda-\log_e(x_j!)+x_j\log_e\lambda\bigg].$$ Consider the $x_j$'s to be constants.
If you use calculus, a maximum (if it exists) occurs at a point of zero derivative. The equation
$$\frac{\text{d}l}{\text{d}\lambda} = \sum_{j=1}^N \left(-1 + \frac{x_j}{\lambda}\right) = 0$$ has the unique solution
$$
\lambda = \frac{1}{N}\sum_{j=1}^N x_j.
$$
Not surprisingly, this is the mean of the numbers $x_j$.
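Strictly speaking, one should also verify that this stationary point is a maximum rather than a minimum; a quick second-derivative check settles it (provided not all $x_j$ are zero):
$$\frac{\text{d}^2l}{\text{d}\lambda^2} = -\sum_{j=1}^N \frac{x_j}{\lambda^2} < 0,$$
so $l$ is concave and the stationary point is the global maximum.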
The pmf of a Poisson $Po(\phi)$ random variable is
$$P(X=x)=\frac{e^{-\phi}\phi^x}{x!}$$
but as the likelihood depends on the parameter $\phi$, we can also say that
$$L(\phi)\propto e^{-\phi}\phi^x$$
Thus your likelihood becomes
$$L(\phi)\propto e^{-\phi}\phi^2e^{-2\phi}(2\phi)^4$$
Taking its log, after some easy algebraic manipulation you get
$$l(\phi)=-3\phi+6\log\phi$$
This expression is equivalent to the one you are asked to show, since log-likelihoods are equivalent up to an additive constant (the expression in your [ ] brackets).
Of course there is an evident typo in your statement: your $-3\log(\phi)$ should read $-3\phi$; the expression above is correct.
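If you want to double-check the maximization itself, here is a small symbolic sketch (Python with `sympy` is an assumed tool, not part of the original question; additive constants are dropped since they do not change the maximizer):

```python
import sympy as sp

phi = sp.symbols("phi", positive=True)
l = -3*phi + 6*sp.log(phi)  # the log-likelihood derived above, constants dropped

# Solve dl/dphi = 0 for the stationary point ...
print(sp.solve(sp.diff(l, phi), phi))    # [2]
# ... and confirm it is a maximum via the second derivative.
print(sp.diff(l, phi, 2).subs(phi, 2))   # -3/2 < 0
```

The stationary point at $\phi=2$ with negative second derivative confirms that the log-likelihood above is maximized at $\widehat\phi = 2$.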