I will start by apologising for a physicist's level of rigor in the following derivations, but I think they give good insight into the connection between what I would call the Trinity of stochastic physics: the Langevin equation, the Fokker-Planck equation and the path integral.
Fokker-Planck from Langevin equation
Given a Langevin equation $dZ_t=b(Z_t)dt+\sigma(Z_t)dW_t$ (the most common form found in physics) we can derive the Fokker-Planck equation from the Chapman-Kolmogorov equation in a quick and dirty way:
$$
p(y,t|x)=\int dz\, p(y,t|z)p(z,t'|x)
$$
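We suppose that the time step is infinitesimal, $t - t' = \delta t$, and thus we can write $$p(y,t|z) = \left\langle \delta (y-z-h)\right\rangle_h$$ where $h=\delta Z$ is the increment over $\delta t$ defined by the associated Langevin equation, and the average is over the noise. Now we have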
$$
p(y,t|x)=\int dz\, \left\langle \delta (y-z-h)\right\rangle_hp(z,t'|x)\,.\tag{$\ast$}
$$
We can then Taylor expand the delta function in $h$ (which depends on the starting point $z$ through $b(z)$ and $\sigma(z)$) as
$$
\left(1+\left\langle h\right\rangle \frac{\partial}{\partial z} +\frac{1}{2}\left\langle h^{2}\right\rangle \frac{\partial^2}{\partial z^2} +\ldots\right)\delta(y-z) = (1+\mathscr{L})\delta(y-z)\,,
$$
then, integrating by parts in $z$ and using the delta function to do the $z$ integral, we are left with
$$
p(y,t|x)=(1+\mathscr{L}^\dagger)p(y,t'|x)\,.
$$
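To make $\mathscr{L}^\dagger$ concrete, here is a sketch that assumes the Itô interpretation of the Langevin equation (my assumption, the convention is not fixed above). The increment is $h = b(z)\,\delta t + \sigma(z)\,\delta W$ with $\langle \delta W\rangle = 0$ and $\langle \delta W^2\rangle = \delta t$, so to leading order in $\delta t$
$$
\left\langle h\right\rangle = b(z)\,\delta t\,,\qquad \left\langle h^{2}\right\rangle = \sigma^{2}(z)\,\delta t\,,
$$
and the adjoint operator acts on the density as
$$
\mathscr{L}^\dagger p(y,t'|x) = \delta t\left[-\frac{\partial}{\partial y}\Big(b(y)\,p(y,t'|x)\Big)+\frac{1}{2}\frac{\partial^{2}}{\partial y^{2}}\Big(\sigma^{2}(y)\,p(y,t'|x)\Big)\right].
$$
The higher moments of $h$ are of higher order in $\delta t$, which is why the expansion can be truncated at second order.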
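Expanding $p(y,t|x)= p(y,t'|x) + \delta t\,\frac{\partial}{\partial t'}p(y,t'|x) + \ldots$, noting that $\mathscr{L}^\dagger$ is itself of order $\delta t$ (through the moments $\langle h\rangle$ and $\langle h^2\rangle$), dividing by $\delta t$ and taking the limit $\delta t \rightarrow 0$, we arrive at the Fokker-Planck equation
$$
\frac{\partial p(y,t|x)}{\partial t}=\mathscr{L}^\dagger p(y,t|x)
$$
relabelling $t'$ as $t$ and absorbing the factor $1/\delta t$ into $\mathscr{L}^\dagger$.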
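As a numerical sanity check of the Langevin to Fokker-Planck connection, here is a minimal sketch that integrates the Langevin equation with the Euler-Maruyama scheme and compares the long-time sample variance with the stationary solution of the Fokker-Planck equation. The example process (Ornstein-Uhlenbeck, $b(z)=-\theta z$, constant $\sigma$), the function name `euler_maruyama` and all parameter values are my own illustrative choices, not part of the derivation above. For this process the stationary Fokker-Planck solution is a Gaussian with variance $\sigma^2/(2\theta)$.

```python
import numpy as np


def euler_maruyama(b, sigma, x0, dt, n_steps, n_paths, seed=0):
    """Integrate dZ = b(Z) dt + sigma(Z) dW for many independent paths.

    Each step adds the increment h = b(z) dt + sigma(z) dW, with dW drawn
    from a Gaussian of variance dt (Ito / pre-point discretisation).
    """
    rng = np.random.default_rng(seed)
    z = np.full(n_paths, float(x0))
    for _ in range(n_steps):
        dw = rng.normal(0.0, np.sqrt(dt), size=n_paths)
        z = z + b(z) * dt + sigma(z) * dw
    return z


if __name__ == "__main__":
    theta, sigma0 = 1.0, 1.0  # illustrative Ornstein-Uhlenbeck parameters
    z_final = euler_maruyama(
        b=lambda z: -theta * z,
        sigma=lambda z: sigma0 * np.ones_like(z),
        x0=1.0,
        dt=0.01,
        n_steps=500,
        n_paths=20000,
    )
    print("sample variance:        ", z_final.var())
    print("Fokker-Planck stationary:", sigma0**2 / (2.0 * theta))
```

The two numbers should agree up to sampling noise and the $O(\delta t)$ bias of the discretisation.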
Path integral from Langevin
We apply the Chapman-Kolmogorov equation many times:
$$
p(y,t|x)=\int\prod_{i=1}^{N}dz_{i}\, p(y,t_{N}|z_{N})\ldots p(z_{2},t_{2}|z_1)p(z_{1},t_{1}|x)
$$
We then use $(\ast)$ to replace each factor, along with the identity $\delta(x) = \int \frac{dk}{2\pi}\, e^{ikx}$, to get a sequence of averages over complex exponentials (the factors of $2\pi$ are absorbed into the measure $Dk$):
$$
p(y,t|x)=\int Dz Dk\, \left\langle e^{ik_N(z_N-z_{N-1}-h_N)}\right\rangle\ldots \left\langle e^{ik_2(z_2-z_{1}-h_2)}\right\rangle \left\langle e^{ik_1(z_1-x-h_1)}\right\rangle
$$
however we can still do better. Knowing the probability distribution of $h$ (usually Gaussian in physics) we can write
$$
\left\langle O \right\rangle = \int dh P(h) O
$$
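thus, after replacing every averaged exponential, we get
$$
p(y,t|x)=\int Dz\, Dk\, Dh\; e^{i\sum_j k_j(z_j-z_{j-1}-h_j)}\, P[h]
$$
(with $z_0 = x$; in the continuum limit the $k_j(z_j-z_{j-1})$ terms become $i\int dt \,k\dot{z}$). Note that the $h_j$ must stay in the exponent, otherwise the integral over $h$ would be trivial. When you perform the integrations over the paths of $k$ and $h$ you will get your path integral.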
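As a worked example of those last integrations, here is a sketch for the Gaussian case, assuming an Itô (pre-point) increment $h_j$ with mean $b(z_{j-1})\,\delta t$ and variance $\sigma^2(z_{j-1})\,\delta t$ (my assumption, the derivation above leaves $P(h)$ general). The average over $h_j$ is the Gaussian characteristic function,
$$
\left\langle e^{-ik_j h_j}\right\rangle = \exp\left(-ik_j\, b\,\delta t-\frac{1}{2}\sigma^{2}k_j^{2}\,\delta t\right),
$$
and the remaining $k_j$ integral is again Gaussian:
$$
\int \frac{dk_j}{2\pi}\, e^{ik_j(z_j-z_{j-1})}\left\langle e^{-ik_j h_j}\right\rangle = \frac{1}{\sqrt{2\pi\sigma^{2}\delta t}}\exp\left(-\frac{\left(z_j-z_{j-1}-b\,\delta t\right)^{2}}{2\sigma^{2}\delta t}\right).
$$
Multiplying the slices together gives, formally,
$$
p(y,t|x)=\int Dz\, \exp\left(-\int dt\,\frac{\left(\dot{z}-b(z)\right)^{2}}{2\sigma^{2}(z)}\right),
$$
with the normalisation factors absorbed into $Dz$. This is usually called the Onsager-Machlup form; for $z$-dependent $\sigma$ the measure and discretisation need more care than I give them here.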
Path integral from Fokker-Planck
I am less sure about how to go about directly showing the equivalence without the presence of a Langevin equation. However if we create a Lagrangian by multiplying the F-P equation by an auxiliary field and then integrate over all paths of our probability density and our auxiliary field, I have a feeling this will work but I am not certain so I will leave my answer here.
The main point of this is to show that the path integral and the FP equation are essentially two different representations of the Chapman-Kolmogorov equation. As far as I can tell they are basically interchangeable and I hope someone more knowledgeable than myself can come in and explain why you would use one over the other in certain situations.
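To see the "repeated Chapman-Kolmogorov" picture numerically, here is a minimal sketch that composes many short-time Gaussian kernels on a grid (a time-sliced path integral) and compares the result with the exact transition density. The test case (Ornstein-Uhlenbeck with $b(z)=-\theta z$ and constant $\sigma$), the function names and the grid parameters are illustrative assumptions of mine; the short-time kernel assumes Gaussian increments with mean $b(z)\,\delta t$ and variance $\sigma^2\delta t$.

```python
import numpy as np


def ou_exact_density(y, x, t, theta, sigma):
    """Exact Ornstein-Uhlenbeck transition density p(y, t | x)."""
    mean = x * np.exp(-theta * t)
    var = sigma**2 * (1.0 - np.exp(-2.0 * theta * t)) / (2.0 * theta)
    return np.exp(-(y - mean) ** 2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)


def time_sliced_density(x, t, n_steps, theta, sigma, grid):
    """Compose n_steps short-time kernels via Chapman-Kolmogorov on a grid."""
    dt = t / n_steps
    dz = grid[1] - grid[0]
    var = sigma**2 * dt
    norm = np.sqrt(2.0 * np.pi * var)
    # kernel[i, j] = p(z_i, dt | z_j): Gaussian centred on z_j + b(z_j) dt
    step_mean = grid - theta * grid * dt
    kernel = np.exp(-(grid[:, None] - step_mean[None, :]) ** 2 / (2.0 * var)) / norm
    # First slice starts exactly at x (a delta function initial condition).
    p = np.exp(-(grid - (x - theta * x * dt)) ** 2 / (2.0 * var)) / norm
    # Each further slice is one integral over an intermediate point z.
    for _ in range(n_steps - 1):
        p = kernel @ p * dz
    return p


if __name__ == "__main__":
    grid = np.linspace(-5.0, 5.0, 401)
    p_num = time_sliced_density(x=1.0, t=1.0, n_steps=100, theta=1.0, sigma=1.0, grid=grid)
    p_ref = ou_exact_density(grid, x=1.0, t=1.0, theta=1.0, sigma=1.0)
    print("max abs deviation from exact density:", np.max(np.abs(p_num - p_ref)))
```

The residual deviation comes from the finite $\delta t$ of each slice and should shrink as `n_steps` grows.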
I hope this has been useful!
Best Answer
"Sum over paths" is 150% correct. It, in fact, means product over integrals over each point of a path.
A picture should be worth a thousand words. Discretize the integral by two screens, with 3 and 4 slits, respectively. Ultimately, you'll consider the limit of an infinity of screens, each with an infinity of slits.
Think of your assignment as going from A to B in straight segments, through one of the 3 slits at $t_1$ and then one of the 4 at $t_2$. You then have 3×4=12 options, 12 paths, 12 histories to achieve that. Your first (topmost) sum will then have 12 terms, exponentials of $iS_1, ..., iS_{12}$: the sum of 12 complex amplitudes, each corresponding to a different path. But each path/amplitude is a product of three segments, characterized by their endpoint: that's why we are skipping the last factor of the segment abutting at B.
Now, each path/history is characterized by the location of its slit on each screen; e.g., calling each slit location 1,2,3, and 1,2,3,4, respectively, $$ \begin{aligned} S_i &= \int dt ~ L_i(x(t)) \leftarrow L(x_1(i)) + L(x_2(i)) \leadsto \\ S_1 &= L(x_1=1) +L(x_2=1) , \\ S_2 &= L(x_1=2) +L(x_2=1) , \\ S_3 &= L(x_1=3) +L(x_2=1) , \\ S_4 &= L(x_1=1) +L(x_2=2) , \\ &\;\;\vdots \\ S_{12} &= L(x_1=3) +L(x_2=4) . \end{aligned} $$
The sum of the 12 exponentials then amounts to $$ \sum_{x_1} \sum_{x_2} e^{i L(x_1) +iL(x_2)}= \sum_{x_1} \sum_{x_2} e^{i L(x_1)} e^{iL(x_2)}, $$ a sum over 12 paths.
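The counting above can be checked with a few lines of code. This is only a sketch: the per-screen function `screen_lagrangian` is a made-up stand-in for $L(x)$, and the helper names are mine. It enumerates all 3×4 slit combinations, sums $e^{iS}$ over them, and confirms that this equals the product of the two single-screen sums.

```python
import cmath
from itertools import product


def screen_lagrangian(x):
    """Hypothetical per-screen contribution L(x) to the action (illustrative only)."""
    return 0.3 * x**2


def sum_over_paths(slits_1, slits_2, lagrangian):
    """Sum e^{iS} over every (x1, x2) pair, i.e. over every path from A to B."""
    paths = list(product(slits_1, slits_2))
    total = sum(
        cmath.exp(1j * (lagrangian(x1) + lagrangian(x2))) for x1, x2 in paths
    )
    return paths, total


def product_of_sums(slits_1, slits_2, lagrangian):
    """Same quantity written as (sum over x1) times (sum over x2)."""
    sum_1 = sum(cmath.exp(1j * lagrangian(x1)) for x1 in slits_1)
    sum_2 = sum(cmath.exp(1j * lagrangian(x2)) for x2 in slits_2)
    return sum_1 * sum_2


if __name__ == "__main__":
    slits_1, slits_2 = [1, 2, 3], [1, 2, 3, 4]
    paths, total = sum_over_paths(slits_1, slits_2, screen_lagrangian)
    factored = product_of_sums(slits_1, slits_2, screen_lagrangian)
    print("number of paths:", len(paths))
    print("difference between the two forms:", abs(total - factored))
```

Note that `itertools.product` runs through the second screen fastest, so the paths come out in a different order from $S_1,\ldots,S_{12}$ above; the sum does not care.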
It should then be evident how to generalize to the continuum path integral.
First, still keeping just two screens, but perforating them with an infinity of slits to the point of evanescence, you get $$ \sum_{x_1} \to \int dx_1 , \qquad \sum_{x_2} \to \int dx_2 . $$
You then take the number of screens to N and then infinity, so $$ L(x_1) +L(x_2) \to L(x_1) +L(x_2)+...+ L(x_N) \to \int dt ~L(x(t)), $$ which, times i, is exponentiated to provide the integrand of the functional integral, now over an infinity of screens, $\int dx_1 \int dx_2 ...\int dx_N \to \iint \mathcal{D}x $.
In sum, $$ \iint \mathcal{D} x ~~ e^{i\int dt ~L(x(t))}. $$ You might review the free particle to fix your notation.
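For reference, here is how the free particle looks in this language, as a sketch with $\hbar=1$ and $L=\frac{1}{2}m\dot{x}^2$. Slice the time $T$ into $N$ steps of length $\epsilon = T/N$, with $x_0 = x_A$ and $x_N = x_B$ held fixed, so there are $N-1$ screens to integrate over:
$$
K(x_B,T;x_A,0)=\lim_{N\to\infty}\left(\frac{m}{2\pi i\epsilon}\right)^{N/2}\int \prod_{j=1}^{N-1}dx_j\, \exp\left(i\sum_{j=0}^{N-1}\frac{m\,(x_{j+1}-x_j)^{2}}{2\epsilon}\right)=\sqrt{\frac{m}{2\pi iT}}\,\exp\left(\frac{i\,m\,(x_B-x_A)^{2}}{2T}\right).
$$
Note that here each term of the action depends on two neighbouring slit positions (through the velocity), unlike the schematic $L(x_j)$ used above, so the integrals no longer factorize and have to be done one after the other. The prefactor is what fixes the normalisation hidden in $\mathcal{D}x$.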