For any (local) observable $\mathcal{O}$, its expectation value is defined as
$$ \langle \mathcal{O} \rangle = \frac{1}{Z}\int \mathcal{D}\Phi\mathcal{O}\mathrm{e}^{-S_E[\Phi]}$$
where
$$ Z = \int \mathcal{D}\Phi\mathrm{e}^{-S_E[\Phi]}$$
is indeed called the partition function. My writing $S_E$ is intended to show that this is for a Euclidean theory (i.e. a theory on a Riemannian manifold), for a Minkowskian theory (i.e. a theory on a Lorentzian manifold), one would have to add some $\mathrm{i}$ by Wick rotation. I have used $\mathcal{D}\Phi$ instead of $[\mathrm{d}\Phi]$ to denote the measure on the space of field configurations, both notations are widely used. It is important to note that, in most cases, this measure cannot be rigorously constructed and is only defined in some regularization procedure, but naively you can think of it as any kind of integration over the space of all possible field configurations $\Phi : \Sigma \rightarrow \mathbb{R}$, if $\Sigma$ is your spacetime manifold.
Now, your intuition about this thing are basically correct - $\mathrm{e}^{-S_E[\Phi]}$ is indeed a Boltzmann-like weighting factor, that, in the QFT case, weighs different field configurations $\Phi$. It is obvious that this factor will be maximal at minimal $S_E[\Phi]$, so the classically realized field configuration $\Phi_0$ for which $S_E[\Phi_0]$ is a minimum of the action contributes the most to this integral. The expectation value of the observable is indeed defined as an exact analogon to classical statististical mechanics.
You write something about a transformation, and from the context of Ward identities, I am assuming that you are talking about a gauge transformation, which should be a symmetry of the theory. Explicitly, the transformation is infintesimally given by
$$ \Phi \mapsto (1 + \mathrm{i}\omega_a(x)T^a)\Phi \equiv \Phi' \tag{1}$$
where the $T^a$ are generators of the symmetry group.
Being a symmetry of the theory means that no observable may change its value under the transformation. We denote quantities after transformation by primes. Looking at some observable $\mathcal{O}$ (your $X$), we must have that
$$ \langle \mathcal{O} \rangle = \frac{1}{Z}\int \mathcal{D}\Phi\mathcal{O}\mathrm{e}^{-S_E[\Phi]} \overset{!}{=} \frac{1}{Z'}\int \mathcal{D}\Phi' (\mathcal{O} + \delta\mathcal{O})\mathrm{e}^{-S_E[\Phi]-\int\mathrm{d}^d x \partial_\mu j^\mu_a \omega_a(x)}$$
We now assume that our symmetry is non-anomalous, i.e. $\mathcal{D}\Phi' = \mathcal{D}\Phi$. Then, for the equality to hold, we have:
$$\int \mathcal{D}\Phi\mathcal{O}\mathrm{e}^{-S_E[\Phi]} \overset{!}{=} \int \mathcal{D}\Phi (\mathcal{O} + \delta\mathcal{O})\mathrm{e}^{-S_E[\Phi]-\int\mathrm{d}^d x \partial_\mu j^\mu_a \omega_a(x)}$$
If you now expand $\mathrm{e}^{-\int\mathrm{d}^d x \partial_\mu j^\mu_a \omega_a(x)}$ to first order in $\omega_a$, you will get $\langle \mathcal{O} \rangle$ on both sides of the equation, which cancels, and the remaining terms are precisely the Ward identities, i.e. the first equation you asked about.
The second equation follows from doing the explicit transformation on $\mathcal{O}$: (Assuming, as you said, that the $G_a$ are the generators $T^a$ of the transformation): Look again at $(1)$. Since you said that $\mathcal{O}$ is a collection of fields, it is
$$ \mathcal{O} = \prod_{i=1}^n \Phi(x_i)$$
for some $n$. Now, carry out $(1)$ on every field $\Phi$ and keep only the terms at most first order in $\omega_a$. This is exactly your $\delta \mathcal{O}$.
(This is "exact" since we consider an infinitesimal gauge transformation, anyways. There are rigorous foundations in Lie theory for this "throwing away" of higher order terms. Nevertheless, it is quite important to carry these tricks the physicist likes so much oneself, since the answer coming out is correct.)
Although there is already an accepted answer, I'll provide a different point of view here.
It is important to keep in mind that the imaginary time is periodic, which means fields satisfy periodic(or anti-periodic) boundary conditions (the reason is because we use imaginary time path integral to represent the partition function, which is $\mathrm{Tr}\, e^{-\beta H}$). Therefore, the correlation functions also obey the same kind of periodic conditions under the shift $\tau\rightarrow \tau+\beta$, which is why the Matsubara frequencies are $\frac{2\pi n}{\beta}$ for $n\in\mathbb{Z}$ for bosonic fields, and $\frac{(2n+1)\pi}{\beta}$ for fermionic fields. So if $|\tau_A-\tau_B|$ is larger than $\beta$, all you need to do is to fold it back by adding or subtracting several $\beta$'s, just as one would do to any periodic functions.
EDIT: The periodicity of Green's function can be explicitly derived, without using the path integral representation. After all, the path integral representation is designed to be consistent with the operator formalism.
Let's assume $A$ and $B$ are bosonic, so I do not need to keep track of the fermionic sign. For concreteness, consider $-\beta<\tau<0$ and
$$
G(\tau)=\langle \mathcal{T} A(\tau)B(0)\rangle=\mathrm{Tr}[e^{-\beta H}B(0)e^{\tau H}A(0)e^{-\tau H}]=\mathrm{Tr}[ e^{\tau H}A(0)e^{-\tau H}e^{-\beta H}B(0)]\\
=\mathrm{Tr}[e^{-\beta H}e^{(\tau+\beta) H}A(0)e^{-(\tau+\beta) H}B(0)]=\langle \mathcal{T} A(\tau+\beta) B(0)\rangle
=G(\tau+\beta)
$$
Here we use the cyclic property of the trace.
Best Answer
Yes, in scalar field theory, $\langle 0 | T\{\phi(y) \phi(x)\} | 0 \rangle$ is the amplitude for a particle to propagate from $x$ to $y$. There are caveats to this, because not all QFTs admit particle interpretations, but for massive scalar fields with at most moderately strong interactions, it's correct. Applying the operator $\phi({\bf x},t)$ to the vacuum $|0\rangle$ puts the QFT into the state $|\delta_{\bf x},t \rangle$, where there's a single particle whose wave function at time $t$ is the delta-function supported at ${\bf x}$. If $x$ comes later than $y$, the number $\langle 0 | \phi({\bf x},t)\phi({\bf y},t') | 0 \rangle$ is just the inner product of $| \delta_{\bf x},t \rangle$ with $| \delta_{\bf y},t' \rangle$.
However, the function $f(x,y) = \langle 0 | T\{\phi(y) \phi(x)\} | 0 \rangle$ is not actually a correlation function in the standard statistical sense. It can't be; it's not even real-valued. However, it is a close cousin of an honest-to-goodness correlation function.
If make the substitution $t=-i\tau$, you'll turn the action $$iS = i\int dtd{\bf x} \{\phi(x)\Box\phi(x) - V(\phi(x))\}$$ of scalar field theory on $\mathbb{R}^{d,1}$ into an energy function $$-E(\phi) = -\int d\tau d{\bf x} \{\phi(x)\Delta\phi(x) + V(\phi(x))\}$$ which is defined on scalar fields living on $\mathbb{R}^{d+1}$. Likewise, the oscillating Feynman integral $\int \mathcal{D}\phi e^{iS(\phi)}$ becomes a Gibbs measure $\int \mathcal{D}\phi e^{-E(\phi)}$.
The Gibbs measure is a probability measure on the set of classical scalar fields on $\mathbb{R}^{d+1}$. It has correlation functions $g(({\bf x}, \tau),({\bf y},\tau')) = E[\phi({\bf x}, \tau)\phi({\bf y},\tau')]$. These correlation functions have the property that they may be analytically continued to complex values of $\tau$ having the form $\tau = e^{i\theta}t$ with $\theta \in [0,\pi/2]$. If we take $\tau$ as far as we can, setting it equal to $i t$, we obtain the Minkowski-signature "correlation functions" $f(x,y) = g(({\bf x},it),({\bf y},it'))$.
So $f$ isn't really a correlation function, but it's the boundary value of the analytic continuation of a correlation function. But that takes a long time to say, so the terminology gets abused.