Quick answer
My question is: how does the presence of a nonzero $J(x)$ result in a non-trivial, spacetime-dependent value of $\langle 0|\phi(x)|0\rangle$?
The equation $\phi(x)=\mathrm e^{-iPx}\phi(0)\mathrm e^{iPx}$ works both for $J=0$ and $J\neq 0$. Therefore,
$$
\langle \phi(x)\rangle_J= {}_J\langle 0|\mathrm e^{-iPx}\phi(0)\mathrm e^{iPx}|0\rangle_J
$$
When $J\neq 0$, however, the vacuum is no longer annihilated by the momentum operator, $P|0\rangle_J\neq 0$ (because the source breaks translation invariance), and therefore we cannot conclude that
$$
\langle \phi(x)\rangle_J\stackrel{\text{no}}{=} {}_J\langle 0|\phi(0)|0\rangle_J
$$
Therefore, if $J\neq 0$ the vev depends on position $x$.
To find the explicit dependence of $\langle \phi(x)\rangle_J$ with $x$, instead of using operators, it is easier to work with path integrals:
$$
\langle \phi(x)\rangle_J=\frac{\delta}{\delta J(x)}\exp\left[-i\int \mathrm dy\,\mathrm dz\ J^*(y)\Delta(y-z)J(z)\right]
$$
which I believe you can calculate yourself (note that the result is linear in $J$ and so the vev goes to zero as $J\to 0$, as expected).
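As a sketch of that calculation, assume a real field (so $J^*=J$), a symmetric propagator $\Delta(y-z)=\Delta(z-y)$, and the conventional factor of $\tfrac12$ in the exponent. These normalisations are assumptions, and overall factors of $i$ depend on conventions. Dividing by $Z[J]$ so that the vev refers to a unit-norm state,
$$
\langle\phi(x)\rangle_J\propto\frac{1}{Z[J]}\frac{\delta Z[J]}{\delta J(x)},\qquad Z[J]=\exp\left[-\frac i2\int\mathrm dy\,\mathrm dz\ J(y)\Delta(y-z)J(z)\right]
$$
$$
\frac{1}{Z[J]}\frac{\delta Z[J]}{\delta J(x)}=-i\int\mathrm dy\ \Delta(x-y)J(y)
$$
The vev is therefore the source smeared with the propagator: it inherits its $x$-dependence from $J$, and it vanishes when $J=0$.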
The (somewhat) bigger picture
The first thing we have to do is to distinguish between internal sources and external ones:
An internal source is a term in the lagrangian that only includes dynamical fields, that is, fields that are part of the equations of motion. For example, you can have a KG theory,
$$
\mathcal L\sim (\partial\phi)^2-m^2\phi^2+g\phi^3
$$
where the last term can be said to be an internal source (though the usual terminology is just interaction). This term is internal because it only depends on $\phi$, which is itself a dynamical field (determined from the EoM's). Another (more illustrative) example is the lagrangian for QED,
$$
\mathcal L\sim \bar\psi(i\not\partial-m)\psi-F^2+eA_\mu \bar\psi\gamma^\mu\psi
$$
Again, the last term is an internal source, because it only depends on dynamical fields, $\psi$ and $A$, which are determined from the EoM. I would like to stress that in general people don't say "internal source" but "interaction" instead.
An external source is a function in the lagrangian that is externally determined (fixed), that is, a function that is not dynamical (there is no equation of motion for that function). Typical examples are the $J$'s that are used in path integrals,
$$
\mathcal L\sim (\partial\phi)^2-m^2\phi^2+g\phi^3+\phi(x)J(x)
$$
and fixed (background) functions in effective theories, such as, for example, the electromagnetic field in a low energy treatment of the Hydrogen atom:
$$
\mathcal L\sim \bar\psi(i\not\partial-m)\psi+eA_\mu \bar\psi\gamma^\mu\psi
$$
(here, $A_\mu$ is an external source, because there is no kinetic term $F^2$ for it, and so the value of $A$ has to be written by hand, say, a Coulomb potential $A_0\sim e/r$).
Note that external sources break the translational invariance of the theory (for an obvious reason: an external source has a fixed dependence on position, and so the "physics don't look the same everywhere"). Therefore, if there are external sources, $P_\mu|0\rangle\neq 0$ and vev's depend on position, as discussed in the first part of this answer.
On the other hand, internal sources don't break the translational invariance of the theory, because the sources themselves transform together with the fields. This might be easier to understand with an example. Consider first a theory with only internal sources:
$$
S=\int\mathrm dx\ (\partial\phi(x))^2-m^2\phi(x)^2-g\phi(x)^3
$$
which, upon a translation $x\to x-a$ transforms into
$$
S_a=\int\mathrm dx\ (\partial\phi(x-a))^2-m^2\phi(x-a)^2-g\phi(x-a)^3
$$
which is the same as before, $S_a=S$, because we integrate over all space and $\mathrm d(x-a)=\mathrm dx$.
On the other hand, consider a theory with an external source:
$$
S=\int\mathrm dx\ (\partial\phi(x))^2-m^2\phi(x)^2-\phi(x)J(x)
$$
which, upon a translation $x\to x-a$ transforms into
$$
S_a=\int\mathrm dx\ (\partial\phi(x-a))^2-m^2\phi(x-a)^2-\phi(x-a)J(x)
$$
which is not the same as before, because of the $J(x)$ term. The action is not the same as before, and so the translation changed the theory. At this point, you might want to read this answer of mine. In the notation of that post, the $(2)$ derivative of a lagrangian with external sources is non-zero.
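To make explicit why the $J(x)$ term spoils the invariance, one can change the integration variable to $x'=x-a$ in $S_a$ (a sketch; $J_a$ is a label introduced here for the shifted source):
$$
S_a=\int\mathrm dx'\ (\partial\phi(x'))^2-m^2\phi(x')^2-\phi(x')J(x'+a)
$$
so that $S_a[\phi;J]=S[\phi;J_a]$ with $J_a(x)=J(x+a)$. The translated theory is the original one with a shifted source, and $S_a=S$ for every $a$ and every $\phi$ only if $J$ is constant.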
To recapitulate,
If there are only internal sources, then the theory is translationally invariant, and so all the vev's are position independent (as can be easily shown using $P_\mu|0\rangle=0$ and $Q_\alpha(x)=\mathrm e^{-iPx}Q_\alpha(0)\mathrm e^{iPx}$, where $Q_\alpha(x)$ is any field). Most of the time we redefine every field $Q_\alpha(x)\to Q_\alpha(x)-\langle Q\rangle$ so that all the vev's are zero (this is relevant for renormalisation). In some cases (e.g., in the case of the Higgs field) a non-zero vev is physically relevant (but it only makes sense because of the form of the lagrangian for the Higgs field, and wouldn't make sense for, say, a standard KG field). In any case, if the sources are internal then vev's are constant.
If there are only external sources, then the theory is free. The vev's depend on position because of the source, but in the limit $J\to 0$ we must have $\langle\phi\rangle\to 0$, as it must be for a free theory.
If there are internal and external sources, the vev's are position-dependent and don't go to zero as the external sources go to zero (and therefore we must renormalise the fields).
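The second case of the recap (a free field with only an external source) can be illustrated numerically. The following is a minimal sketch under loud assumptions: it swaps the real-time problem for a static, one-dimensional lattice analogue, solving $(-\mathrm d^2/\mathrm dx^2+m^2)\phi=J$ with $\phi=0$ outside the lattice, and the names `solve_static_field` and `point_source` are hypothetical helpers introduced only for this example.

```python
def solve_static_field(source, mass=1.0, spacing=1.0):
    """Solve (-d^2/dx^2 + m^2) phi = J on a 1D lattice (phi = 0 outside it).

    Assumption: a static lattice analogue of the sourced free field,
    not the real-time propagator. Uses the Thomas algorithm for the
    resulting tridiagonal system.
    """
    n = len(source)
    diag = 2.0 / spacing**2 + mass**2   # diagonal entry of the operator
    off = -1.0 / spacing**2             # nearest-neighbour entry

    # Forward sweep.
    c_prime = [0.0] * n
    d_prime = [0.0] * n
    c_prime[0] = off / diag
    d_prime[0] = source[0] / diag
    for i in range(1, n):
        denom = diag - off * c_prime[i - 1]
        c_prime[i] = off / denom
        d_prime[i] = (source[i] - off * d_prime[i - 1]) / denom

    # Back substitution.
    phi = [0.0] * n
    phi[n - 1] = d_prime[n - 1]
    for i in range(n - 2, -1, -1):
        phi[i] = d_prime[i] - c_prime[i] * phi[i + 1]
    return phi


def point_source(n_sites, amplitude, site):
    """External source J that is non-zero at a single lattice site."""
    source = [0.0] * n_sites
    source[site] = amplitude
    return source


n_sites, centre = 41, 20
phi_strong = solve_static_field(point_source(n_sites, 1.0, centre))
phi_weak = solve_static_field(point_source(n_sites, 0.01, centre))

# The field is largest at the source and falls off away from it.
print(phi_strong[centre], phi_strong[centre + 5])
# Scaling J by 0.01 scales the field by the same factor (linearity in J).
print(phi_weak[centre] / phi_strong[centre])
```

In this toy setting the field profile depends on position because the source does, and it shrinks to zero linearly as the source amplitude is reduced, which mirrors the statement about $\langle\phi\rangle\to 0$ as $J\to 0$.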
In your example there are two contractions, giving two terms. The amplitude is
$$ \left< \psi^{\dagger}_a(x) \psi^b(y) \phi(z) \right> = -i \lambda \int d^4 s \, \Delta_M(z - s) \Delta_m(y-s)^b_{\;c} \Delta_m(x-s)^c_{\;a} $$
$$ - i \lambda \int d^4 s \, \Delta_m(x - y) ^b_{\;a} \Delta_m(0)^c_{\;c} \Delta_M(z-s) + \mathcal{O}\left(\lambda^2\right). $$
How can I see that this is true? The algorithm is pretty simple, actually:
You have a term corresponding to each possible contraction. A contraction is a diagram where pairs of $(\psi^{\dagger}, \psi)$ and $(\phi, \phi)$ are replaced by the corresponding propagators.
You have to exclude the diagrams containing bubble subgraphs. A bubble graph is a graph with no external legs. An external leg is a contraction which has one of the fields appearing in the correlation bracket ($\psi^{\dagger}(x), \psi(y), \phi(z)$). This is because we want to account for the normalization $\mathcal{N}/\mathcal{N}_0$ as I mentioned in the answer to your previous question. Proofs of this can be found in any QFT textbook, e.g. Peskin-Schroeder.
For each internal (interaction) vertex we have a factor of $$-i \lambda \int d^4 s. $$
Each term is a product of integrals over spacetime positions of internal (interaction) vertices and propagators.
In your example there are two $\phi$ fields, two $\psi$ fields and two $\psi^{\dagger}$ fields (the external ones plus those at the interaction vertex). Therefore we must contract $\phi$ with $\phi$, but we have a choice of which $\psi^{\dagger}$ gets contracted with which $\psi$.
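These correspond to two diagrams: one in which $\psi^{\dagger}(x)$, $\psi(y)$ and $\phi(z)$ all connect to the vertex at $s$, and one in which $\psi^{\dagger}(x)$ connects directly to $\psi(y)$ while $\phi(z)$ attaches to a closed $\psi$ loop at $s$.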
Neither contains bubble subgraphs; however, the second contains the tadpole divergence $\Delta_m(0)$. Such divergences arise often in QFT. They have to be renormalized by requiring that $\left< \phi \right> = 0$, which is equivalent to throwing away the tadpole contribution. (Actually, we don't just throw away mathematical expressions; we absorb them into a redefinition of the fields.)
The relevant part of your vertex amplitude is thus
$$ \left< \psi^{\dagger}_a(x) \psi^b(y) \phi(z) \right> = -i \lambda \int d^4 s \, \Delta_M(z - s) \Delta_m(y-s)^b_{\;c} \Delta_m(x-s)^c_{\;a}. $$
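The counting of contractions used in this answer can be sketched in a few lines of code. This is only an illustration under stated assumptions: fields are represented by the labels of their spacetime points, every $\psi^{\dagger}$ is paired with exactly one $\psi$, scalars are paired among themselves, and the function names (`pairings`, `full_contractions`, `has_tadpole`) are hypothetical.

```python
from itertools import permutations


def pairings(points):
    """Yield every way to split `points` into unordered pairs."""
    if not points:
        yield ()
        return
    first, rest = points[0], points[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield ((first, partner),) + tail


def full_contractions(psi_dagger_points, psi_points, phi_points):
    """List complete contractions as (fermion pairs, scalar pairs)."""
    scalar_pairings = list(pairings(phi_points))
    terms = []
    for perm in permutations(psi_points):
        fermion_pairs = tuple(zip(psi_dagger_points, perm))
        for scalar_pairs in scalar_pairings:
            terms.append((fermion_pairs, scalar_pairs))
    return terms


def has_tadpole(term):
    """True if some propagator starts and ends at the same point."""
    fermion_pairs, scalar_pairs = term
    return any(a == b for a, b in fermion_pairs + scalar_pairs)


# External fields psi^dagger(x), psi(y), phi(z) plus one vertex at s
# that carries one psi^dagger, one psi and one phi.
terms = full_contractions(("x", "s"), ("y", "s"), ("z", "s"))
for term in terms:
    print(term, "tadpole" if has_tadpole(term) else "connected vertex")
```

With one interaction vertex this produces two terms, one of which contains a propagator evaluated at coincident points, matching the two contributions discussed above. An odd number of scalar points yields no pairings at all.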
Let me know if you have further questions.
P.S. oh and I supposed that $\psi$ and $\psi^{\dagger}$ are field multiplets, thus the internal indices $a, b, c$ labeling the components of the multiplet. If it's just a complex number then just throw these indices away :)
Best Answer
Wick's theorem tells us that $$ \mathcal{T}(\phi_1\dots\phi_N) =\ :\phi_1\dots\phi_N: + :\text{pairwise contractions}:$$ where $:\ :$ is normal ordering. By the definition of normal ordering (all annihilators to the right, all creators to the left), the vacuum expectation value of any normal-ordered expression that still contains at least one operator vanishes: the annihilators on the right give zero acting on $|0\rangle$, and the creators on the left give zero acting on $\langle 0|$.
If $N$ is odd, you'll always have an operator left in the normal-ordered expressions on the right of Wick's theorem, whose expectation value is thus zero.
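As a small worked instance (a sketch for free fields, writing $\Delta_{ij}$ for the contraction of $\phi_i$ with $\phi_j$, a shorthand introduced here), take $N=3$:
$$ \mathcal{T}(\phi_1\phi_2\phi_3) =\ :\phi_1\phi_2\phi_3: + \Delta_{12}:\phi_3: + \Delta_{13}:\phi_2: + \Delta_{23}:\phi_1: $$
Every term keeps at least one normal-ordered field, so $\langle 0|\mathcal{T}(\phi_1\phi_2\phi_3)|0\rangle=0$. For $N=4$, by contrast, the fully contracted terms survive:
$$ \langle 0|\mathcal{T}(\phi_1\phi_2\phi_3\phi_4)|0\rangle = \Delta_{12}\Delta_{34}+\Delta_{13}\Delta_{24}+\Delta_{14}\Delta_{23} $$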