Quantum mechanics and quantum field theory are different in how they treat their wave equations. The usage of the common term “propagator” can be traced back to the “relativistic wave equation” approach, i.e. people really used to think of the Schrödinger and the KG operators as belonging to the same class of “quantum operators”, but the modern point of view regards these things as being of different nature, so I suggest you do too at first. (Later, you might want to understand the Schrödinger field in non-relativistic QFT by reading chapter III.5 of Zee, and if you’re feeling brave, the origins of modern QFT, described in Weinberg’s first volume, section 1.2.) Accordingly, I will divide my answer into sections on QM and QFT.
Quantum mechanics. Assume you know the transition amplitude
$\def\xi{x_{\mathrm i}} \def\xf{x_{\mathrm f}} \def\ti{t_{\mathrm i}} \def\tf{t_{\mathrm f}}$
$$K(\xf,\tf;\xi,\ti) \equiv \langle \xf,\tf|\xi,\ti\rangle$$
and the wave function $\psi(x,\ti) = \psi_0(x)$ for all $x$ at a certain time $t =\ti$. Then you also know it at any other time $t=\tf$:
\begin{multline}
\psi(\xf,\tf) \equiv \langle \xf,\tf|\psi(\ti)\rangle = \langle \xf,\tf|\left(\int d^n\xi\,|\xi,\ti\rangle\langle \xi,\ti|\right)|\psi(\ti)\rangle \\\equiv \int d^n\xi\,K(\xf,\tf;\xi,\ti)\psi_0(\xi).\tag{1}
\end{multline}
The first few sections of Feynman and Hibbs or chapter 6 of Srednicki should convince you that
$$K(\xf,\tf;\xi,\ti) = \int\limits_{x\rlap{(\ti)=\xi}}^{x\rlap{(\tf)=\xf}} \mathcal Dx(t)\,e^{i\int dt\,L(x(t),\dot x(t),t)}.$$
Note well the boundary conditions in the path integral: they will prove to be important in the QFT section.
Let us rearrange the arguments, $K(\xf,\tf;\xi,\ti) = K(\xf,\xi;\tf,\ti)$. Then you will be able to recognize in (1) an integral representation of the evolution operator $U(\tf,\ti)$,
$$ \psi(\xf,\tf)\equiv(U(\tf,\ti)\psi_0)(\xf) = \int d^n\xi\,K(\xf,\xi;\tf,\ti)\psi_0(\xi).$$
If you think of $\xi$ and $\xf$ as indices with a continuous range of values, this formula looks very much like matrix multiplication and $K(\cdot,\cdot\,;\tf,\ti)$ plays the role of the matrix. This makes sense, because the linear operator $U(\tf,\ti)$ should be represented by (something like) a matrix! Mathematicians call that something an (integral) kernel, hence the $K$. But it really is a very big matrix, barring pathologies; speaking of which, convince yourself that the Dirac delta “function” $\delta^n(\xf-\xi)$ is the kernel of the identity transform and that $K(\xf,\xi;\ti,\ti) = \delta^n(\xf-\xi)$.
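To make the "kernel as a very big matrix" picture concrete, here is a minimal numerical sketch. It assumes a free particle in one dimension with units $\hbar = m = 1$, a Gaussian initial state, and a hand-picked grid; the function names `free_kernel` and `psi_exact` are made up for this illustration. The integral over $\xi$ becomes a plain matrix-vector product.

```python
import numpy as np

# Assumptions for this sketch: free particle, H = p^2/2, units hbar = m = 1.
def free_kernel(xf, xi, dt):
    """Free-particle kernel K(xf, xi; dt) in one dimension."""
    return np.sqrt(1.0 / (2j * np.pi * dt)) * np.exp(1j * (xf - xi) ** 2 / (2.0 * dt))

def psi_exact(x, t):
    """Analytic evolution of psi_0(x) = pi^(-1/4) exp(-x^2/2) under H = p^2/2."""
    return np.pi ** -0.25 / np.sqrt(1 + 1j * t) * np.exp(-x ** 2 / (2 * (1 + 1j * t)))

dx = 0.01
x_i = np.arange(-10.0, 10.0 + dx / 2, dx)   # "column index": initial position
x_f = np.linspace(-3.0, 3.0, 7)             # "row index": final position
t = 1.0

K = free_kernel(x_f[:, None], x_i[None, :], t)   # the "matrix" K[f, i]
psi0 = psi_exact(x_i, 0.0)                       # initial wave function on the grid
psi_t = K @ psi0 * dx                            # integral over x_i as a matrix product

max_err = np.max(np.abs(psi_t - psi_exact(x_f, t)))
print(max_err)
```

The Gaussian envelope of $\psi_0$ is what makes the oscillatory sum converge on a finite grid; for a non-decaying initial state the same discretization would need a regulator.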
Armed with the knowledge that Dirac delta is in fact the identity operator, you now see that the definition of the Green’s function (more properly the fundamental solution) of a linear differential operator $L$, limited to zero space dimensions and no explicit $t$ dependence for simplicity,
$$ LG = \delta(t), $$
is in fact just the definition of an inverse! Given $G$, it’s also obvious how to solve any other inhomogeneous equation:
$$ Lu = f(t)\quad\Leftarrow\quad u(t) = \int ds\,G(t-s)f(s). $$
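As a quick sanity check of this convolution formula, consider the simplest case $L = \partial_t + a$, whose retarded fundamental solution is $G(t) = \theta(t)e^{-at}$. The sketch below (the value of `a`, the source $f(t) = \sin t$ switched on at $t = 0$, and the grid are all choices made for illustration) compares the convolution with the closed-form solution that vanishes at $t = 0$.

```python
import numpy as np

a = 2.0                                  # hypothetical decay rate; L = d/dt + a
dt = 1e-3
t = np.arange(0.0, 5.0 + dt / 2, dt)

def G(tau):
    """Retarded fundamental solution of L = d/dt + a: theta(tau) * exp(-a * tau)."""
    return np.where(tau >= 0, np.exp(-a * tau), 0.0)

f = np.sin(t)                            # source term, switched on at t = 0

def solve(k):
    """u(t_k) = integral_0^{t_k} G(t_k - s) f(s) ds, via the trapezoid rule."""
    integrand = G(t[k] - t[: k + 1]) * f[: k + 1]
    return dt * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))

t_s = t[::500]                           # sample times 0, 0.5, ..., 5
u_num = np.array([solve(k) for k in range(0, len(t), 500)])

# Closed-form solution of u' + a u = sin(t) with u(0) = 0.
u_exact = (a * np.sin(t_s) - np.cos(t_s) + np.exp(-a * t_s)) / (1 + a ** 2)
max_err = np.max(np.abs(u_num - u_exact))
print(max_err)
```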
But what does it all have to do with the solution to the boundary value problem that is the propagator $K$? Everything, it turns out, according to Duhamel’s principle. The Green’s function $G$ (for the inhomogeneous problem) and the propagator $K$ (for the initial value problem) are in fact the same! A discussion at Math.SE provides some motivation, and Wikipedia has the details on handling equations that are of more than first order in time (e.g. KG, not Schrödinger). In any case, the end result is that $K$ above is the inverse of the Schrödinger operator,
$$[\partial_t + iH(x,-i\partial_x)]K = \delta(t)\delta^n(x).$$
Interlude. You might enjoy reading section 2 of Feynman’s classic paper Theory of positrons, Phys. Rev. 76, 749 (1949), and the beginning of section 2 of the follow-up Space-time approach to quantum electrodynamics, Phys. Rev. 76, 769 (1949), which provides the link between the QM and QFT approaches by showing how to write a perturbation expansion in $g$ for a Hamiltonian $H = T + gV$ when you can determine the exact evolution under the “kinetic” part $T$ but not the “interaction” part $gV$, $g \ll 1$. The first-order contribution, for example, ends up looking like
$$ K_1(\xf,\tf;\xi,\ti) = -ig\int_{\ti}^{\tf} dt\int d^3x\, K_0(\xf,\tf;x,t)V(x,t)K_0(x,t;\xi,\ti), $$
which can reasonably be described as “propagating to an arbitrary point $x$, scattering off the potential and propagating to the final point from there”. The second paper has the extension to multi-particle systems.
Feynman used this to motivate, for the very first time, his diagrams. The part pertaining to QED itself should be taken with a grain of salt, however, for the reasons stated in the first paragraph. You’d have a lot of fun, for example, explaining why the restriction $\ti\le t\le \tf$ is not enforced in QFT—Feynman called this the reason for antiparticles.
Quantum field theory. Vernacular (as opposed to axiomatic) quantum field theory starts with a classical field equation. That is what your KG or Dirac or wave equation is: a classical equation derived from a classical action for the field. You can split the equation and the action into a “free” and an “interaction” part; the free (or “kinetic”) part is usually defined as the part you’re able to solve exactly—the linear part of the equation, the quadratic part of the action. The free propagator is then the inverse of that part. It is usually called $D$ for fermionic and $\Delta$ for bosonic fields, although conventions (and coefficients!) vary.
Promote the fields to operators $\hat\phi(x)$, using canonical quantization; after some pain and suffering you’ll find the totally mysterious fact that, in the free theory,
$$ \langle0|\mathcal T\hat\phi(x)\hat\phi(y)|0\rangle = \theta(x^0-y^0)\langle0|\hat\phi(x)\hat\phi(y)|0\rangle +\theta(y^0-x^0)\langle0|\hat\phi(y)\hat\phi(x)|0\rangle =\frac 1i \Delta(x-y), $$
where $|0\rangle$ is the ground state, and the first equality serves to define the $\mathcal T$ symbol, the time ordering. However, in the land of functional integrals, this whole thing is as easy as solving the quadratic equation $ax^2 + bx + c = 0$ by completing the square; you can find the details in Zee chapter I.2, starting with equation (19). The result is
$$ \frac 1i \Delta(x-y) = \langle0|\mathcal T\hat\phi(x)\hat\phi(y)|0\rangle \equiv \int\mathcal D\phi(x)\,\phi(x)\phi(y)\,e^{iS[\phi]} = \left.\frac\delta{i\,\delta J(x)}\frac\delta{i\,\delta J(y)}\int\mathcal D\phi(x)\,e^{i\left(S[\phi] + \int d^4x\,J(x)\phi(x)\right)}\right|_{J = 0}, $$
with the equivalence in the middle being nearly the definition of the integral, and the whole thing should look reasonable and not coincidentally reminiscent of statistical physics. Note how the integration is over the four-dimensional field configurations $\phi(x)$ instead of particle paths $x(t)$: QM is just QFT in one dimension!
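A finite-dimensional analogue may make the “completing the square” step less abstract. This is only a sketch under assumptions that are not in Zee’s text: a real symmetric positive-definite matrix $A$ stands in for the operator in the quadratic action, a vector $x\in\mathbb R^n$ for the field, and a Euclidean weight $e^{-\frac12 x^TAx}$ for $e^{iS}$. Shifting $x \to x + A^{-1}J$ gives
$$ Z(J) \equiv \int d^nx\,e^{-\frac12 x^TAx + J^Tx} = e^{\frac12 J^TA^{-1}J}\int d^nx\,e^{-\frac12 x^TAx} = Z(0)\,e^{\frac12 J^TA^{-1}J}, $$
$$ \langle x_ix_j\rangle \equiv \frac{1}{Z(0)}\left.\frac{\partial}{\partial J_i}\frac{\partial}{\partial J_j}Z(J)\right|_{J=0} = (A^{-1})_{ij}. $$
The two-point correlator is the inverse of the quadratic form, which is the discrete counterpart of the statement that the free propagator is the inverse of the kinetic operator.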
You have to derive the path integral to understand where the $\mathcal T$ comes from—however, it makes sense that if the path integral defines correlators, they should come with an ordering prescription: under the integral sign, there are no operators, only numbers, and no ordering. The derivation will also convince you that (remember how I told you to mind the boundary conditions?)
$$ \int\mathcal Dx(t) \equiv \int d^n\xf\,d^n\xi \langle0|\xf\rangle \langle\xi|0\rangle \int\limits_{x\rlap{(\ti) =\xi}}^{x\rlap{(\tf) =\xf}}\mathcal Dx(t) $$
for arbitrary $\ti$ and $\tf$ that encompass all the time values you are interested in, in one dimension for simplicity. I recently had to write down the details so you can consult my notes to self if necessary.
The final leap is to introduce interactions; I’ll leave that for the AMS notes or Zee chapter I.7, but the idea is again (functionally) differentiating under the (functional) integral:
$$
\int\mathcal D\phi(x) e^{i\left(S[\phi] + I[\phi] + \int d^4x\,J(x)\phi(x)\right)}
= e^{iI\left[\frac{\delta}{i\,\delta J}\right]} \int\mathcal D\phi(x) e^{i\left(S[\phi] + \int d^4x\,J(x)\phi(x)\right)}
$$
and the result is vertices in Feynman diagrams.
Best Answer
I) OP is right, ideologically speaking. Ideologically, OP's first eq.
$$ \tag{1} \left| \int_{\mathbb{R}}\! \mathrm{d}x_f~K(x_f,t_f;x_i,t_i) \right| ~\stackrel{?}{=}~1 \qquad(\leftarrow\text{Turns out to be ultimately wrong!}) $$
is the statement that a particle that is initially localized at a spacetime event $(x_i,t_i)$ must with probability 100% be within $x$-space $\mathbb{R}$ at a final time $t_f$, as our QM model does not allow creation or annihilation of particles.
However, such a notion of absolute probabilities for the Feynman kernel $K(x_f,t_f;x_i,t_i)$ cannot be maintained when ideology has to be converted into mathematical formulas. E.g. for the harmonic oscillator, one has
$$\tag{A} \left| \int_{\mathbb{R}}\!\mathrm{d}x_f ~ K(x_f,t_f;x_i,t_i)\right| ~=~\frac{1}{\sqrt{\cos\omega \Delta t}}, \qquad \Delta t ~:=~t_f-t_i,$$
which only becomes unity for $\omega \Delta t \to 0$. The problem can ultimately be traced to the fact that there is no normalizable uniform probability distribution on the real axis $\mathbb{R}$, i.e. the $x$-position space. In general, OP's first eq. (1) only holds for short times $\Delta t\ll \tau$, where $\tau$ is some characteristic time scale of the system.
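For completeness, here is a sketch of where eq. (A) comes from. It assumes the standard closed-form (Mehler) kernel of the harmonic oscillator $\hat H = \frac{\hat p^2}{2m}+\frac{m\omega^2}{2}\hat x^2$ and restricts to $0<\omega\Delta t<\pi/2$, so that no branch subtleties arise:
$$ K(x_f,t_f;x_i,t_i) ~=~ \sqrt{\frac{m\omega}{2\pi i\hbar\sin\omega\Delta t}}\exp\left[\frac{im\omega}{2\hbar\sin\omega\Delta t}\left((x_f^2+x_i^2)\cos\omega\Delta t-2x_fx_i\right)\right]. $$
The $x_f$-integral is an oscillatory Gaussian integral, which yields
$$ \int_{\mathbb{R}}\!\mathrm{d}x_f~K(x_f,t_f;x_i,t_i) ~=~ \frac{1}{\sqrt{\cos\omega\Delta t}}\exp\left[-\frac{im\omega}{2\hbar}x_i^2\tan\omega\Delta t\right], $$
whose modulus is the right-hand side of eq. (A), independently of $x_i$.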
II) Let us review how normalization appears in the Feynman path integral from first principles. The main tool to determine the Feynman propagator/kernel/amplitude $K(x_b,t_b;x_a,t_a)$ is the (semi)group property
$$\tag{B} K(x_f,t_f;x_i,t_i) ~=~ \int_{\mathbb{R}}\!\mathrm{d}x_m ~ K(x_f,t_f;x_m,t_m) K(x_m,t_m;x_i,t_i). $$
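The composition law (B) can be checked numerically for the free particle. The sketch below is only an illustration under stated assumptions: units $\hbar=m=1$, and a finite regulator `eps` in $\Delta t\to\Delta t-i\epsilon$ so that the otherwise purely oscillatory $x_m$-integral converges on a finite grid. Because the complex times add, the composed kernel carries the regulator $2\epsilon$. The function name `kernel` and all numerical values are hypothetical choices.

```python
import numpy as np

# Assumptions: free particle, units hbar = m = 1, regulated time tau = dt - i*eps.
def kernel(xf, xi, dt, eps):
    """Free kernel with complex time tau = dt - i*eps (eps > 0 damps large distances)."""
    tau = dt - 1j * eps
    return np.sqrt(1.0 / (2j * np.pi * tau)) * np.exp(1j * (xf - xi) ** 2 / (2.0 * tau))

x_f, x_i = 0.8, -0.5          # fixed end points
dt1, dt2, eps = 0.7, 1.3, 0.2

dx = 0.005
x_m = np.arange(-25.0, 25.0 + dx / 2, dx)   # intermediate position grid

# Right-hand side of (B): integrate over the intermediate point x_m.
composed = np.sum(kernel(x_f, x_m, dt2, eps) * kernel(x_m, x_i, dt1, eps)) * dx

# Left-hand side of (B): single kernel over the total complex time.
direct = kernel(x_f, x_i, dt1 + dt2, 2 * eps)

err = abs(composed - direct)
print(err)
```

Sending `eps` to zero recovers the unregulated statement (B), at the price of a slowly converging oscillatory integral.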
III) Equivalently, if we identify
$$\tag{C} K(x_f,t_f;x_i,t_i)~=~\langle x_f,t_f \mid x_i,t_i \rangle$$
with an overlap of instantaneous$^1$ position eigenstates in the Heisenberg picture, then eq. (B) follows from the (first of) the completeness relations
$$\tag{D} \int \!\mathrm{d}x ~|x,t \rangle \langle x,t |~=~{\bf 1}, \qquad \text{and} \qquad \int \!\mathrm{d}p~ |p,t \rangle \langle p,t |~=~{\bf 1}.$$
These instantaneous position and momentum eigenstates have overlap$^2$
$$\tag{E} \langle p,t \mid x,t \rangle~=~\frac{1}{\sqrt{2\pi\hbar}}\exp\left[\frac{px}{i\hbar}\right].$$
IV) OP's first eq. (1) is equivalent to the statement that
$$\tag{F} \left| \langle p_f=0,t_f \mid x_i,t_i \rangle \right| ~\stackrel{?}{=}~\frac{1}{\sqrt{2\pi\hbar}},\qquad(\leftarrow\text{ Ultimately wrong!}) $$
due to the identification (C) and
$$\tag{G} \langle p_f,t_f \mid x_i,t_i \rangle ~\stackrel{(D)+(E)}{=}~\int_{\mathbb{R}}\!\frac{\mathrm{d}x_f}{\sqrt{2\pi\hbar}}\exp\left[\frac{p_fx_f}{i\hbar}\right] \langle x_f,t_f \mid x_i,t_i \rangle. $$
Eq. (F) is violated for e.g. the harmonic oscillator, where one has
$$\tag{H} \left| \langle p_f,t_f \mid x_i,t_i \rangle \right| ~=~\frac{1}{\sqrt{2\pi\hbar\cos\omega \Delta t}}. $$
V) For sufficiently short times $\Delta t\ll \tau$, one derives from the Hamiltonian formulation (without introducing arbitrary normalization/fudge factors!) that
$$\begin{align} \langle x_f,t_f \mid x_i,t_i\rangle ~\stackrel{(D)}{=}~&\int_{\mathbb{R}} \!\mathrm{d}p~ \langle x_f,t_f \mid p,\bar{t} \rangle \langle p,\bar{t} \mid x_i,t_i\rangle \cr ~=~&\int_{\mathbb{R}} \!\mathrm{d}p~\langle x_f,\bar{t} \mid \exp\left[-\frac{i\Delta t}{2\hbar}\hat{H}\right]\mid p,\bar{t} \rangle \langle p,\bar{t} \mid \exp\left[-\frac{i\Delta t}{2\hbar}\hat{H}\right]\mid x_i,\bar{t}\rangle\cr ~\approx~&\int_{\mathbb{R}} \!\mathrm{d}p~ \langle x_f,\bar{t} \mid p,\bar{t} \rangle \langle p,\bar{t} \mid x_i,\bar{t}\rangle \exp\left[-\frac{i\Delta t}{\hbar} H(\bar{x},p) \right]\cr ~\stackrel{(E)}{=}~& \int_{\mathbb{R}} \!\frac{\mathrm{d}p}{2\pi\hbar} \exp\left[\frac{i}{\hbar}\left(p\Delta x -\left(\frac{p^2}{2m} + V(\bar{x})\right)\Delta t\right) \right]\cr ~=~& \sqrt{\frac{A}{\pi}} \exp\left[-A(\Delta x)^2-\frac{i}{\hbar}V(\bar{x})\Delta t\right], \qquad A~:=~\frac{m}{2 i\hbar} \frac{1}{\Delta t},\cr ~=~&\sqrt{\frac{m}{2\pi i\hbar} \frac{1}{\Delta t}} \exp\left[ \frac{i}{\hbar}\left(\frac{m}{2}\frac{(\Delta x)^2}{\Delta t}-V(\bar{x})\Delta t\right)\right],\end{align}\tag{I}$$
where
$$\tag{J} \Delta t~ :=~t_f-t_i, \quad \bar{t}~ :=~ \frac{t_f+t_i}{2}, \quad \Delta x~ :=~x_f-x_i, \quad \bar{x}~ :=~ \frac{x_f+x_i}{2} .$$
The oscillatory Gaussian integral (I) over momentum $p$ was performed by introducing the pertinent $\Delta t\to\Delta t-i\epsilon$ prescription. Eq. (I) implies that
$$\tag{K} K(x_f,t_f;x_i,t_i) ~\longrightarrow~\delta(\Delta x) \quad \text{for} \quad \Delta t \to 0^{+}, $$
which in turn implies OP's first eq. (1) in the short time limit $\Delta t \to 0^{+}$. More generally, Eq. (I) implies OP's first eq. (1) for $\Delta t\ll \tau$.
VI) Note that the short time probability
$$\tag{L} P(x_f,t_f;x_i,t_i)~=~|K(x_f,t_f;x_i,t_i)|^2~\stackrel{(I)}{\approx}~\frac{m}{2\pi \hbar} \frac{1}{\Delta t} , \qquad \Delta t\ll \tau, $$
is independent of initial and final positions, $x_i$ and $x_f$, respectively. For fixed initial position $x_i$, the formula (L) can be interpreted as a uniform and unnormalizable probability distribution in the final position $x_f\in\mathbb{R}$. This reflects the fact that the instantaneous eigenstate $|x_i,t_i \rangle$ is not normalizable in the first place, and ultimately dooms the notion of absolute probabilities.
VII) For finite times $\Delta t$ that are not small, the interaction term $V$ becomes important. In the general case, the functional determinant typically needs to be regularized by introducing a cut-off and counterterms. But regularization is not the (only) source of violation of OP's first eq. (1), or equivalently, eq. (F). Rather it is a generic feature that the $px$ matrix elements of a unitary evolution operator
$$\tag{M} \frac{\langle p,t \mid \exp\left[-\frac{i\Delta t}{\hbar}\hat{H}\right] \mid x,t\rangle}{\langle p,t \mid x,t\rangle} $$
are not just phase factors once one goes beyond the short time approximation $\Delta t\ll \tau$.
VIII) Example: Consider the Hermitian Hamiltonian
$$\tag{N} \hat{H}~:= \frac{\omega}{2}(\hat{p}\hat{x}+\hat{x}\hat{p}) ~=~ \omega(\hat{p}\hat{x}+\frac{i\hbar}{2}). $$
Then
$$ \begin{align}\frac{\langle p,t \mid \exp\left[-\frac{i\Delta t}{\hbar}\hat{H}\right] \mid x,t\rangle}{\langle p,t \mid x,t\rangle} ~=~&1 + \omega\Delta t\left(\frac{1}{2}-i\frac{px}{\hbar} \right)\cr &+\frac{(\omega\Delta t)^2}{2}\left(\frac{1}{4}-2i\frac{px}{\hbar} - \left(\frac{px}{\hbar} \right)^2\right) +{\cal O}\left((\omega\Delta t)^3\right),\end{align}\tag{O} $$
which is not a phase factor if $\omega\Delta t\neq 0$. To see this more clearly, take for simplicity $px=0$: the ratio then reduces to the real number $1+\frac{\omega\Delta t}{2}+\frac{(\omega\Delta t)^2}{8}+{\cal O}\left((\omega\Delta t)^3\right)\neq 1$.
References:
1. R.P. Feynman and A.R. Hibbs, Quantum Mechanics and Path Integrals, 1965.
2. J.J. Sakurai, Modern Quantum Mechanics, 1994, Section 2.5.
--
$^1$ Instantaneous eigenstates are often introduced in textbooks of quantum mechanics to derive the path-integral formalism from the operator formalism in the simplest cases, see e.g. Ref. 2. Note that the instantaneous eigenstates $\mid x,t \rangle $ and $\mid p,t \rangle $ are time-independent states (as they should be in the Heisenberg picture).
$^2$ Here we assume that possible additional phase factors in the $px$ overlap (E) have been removed via appropriate redefinitions, cf. this Phys.SE answer.