This is a very good question, but it is really two completely different questions in one.
Feynman's propagator
The probability amplitude for a photon to go from x to y can be written in many ways, depending on the choice of gauge for the electromagnetic field. They all give the same answer for scattering questions, or for invariant questions involving events transmitted to a macroscopic measuring device, but they have different forms for the detailed microscopic particle propagation.
Feynman's gauge gives a photon propagator of:
$P(k) = {g_{\mu\nu} \over w^2 - k^2 + i\epsilon}$
And its Fourier transform is
$2\pi^2 P(x,t) = {g_{\mu\nu} \over {t^2 - x^2 + i\epsilon}}$
This is the propagation function he is talking about. It is singular on the light cone, because the denominator vanishes there, and it is only this singularity that you see as propagating photons over long distances. At short distances, you see a $1/s^2$ propagation, where $s$ is the interval, or proper time, between source and sink.
To show that you recover only physical light modes propagating, the easiest way is to pass to Dirac gauge. In this gauge, electrostatic forces are instantaneous, but photons travel exactly at the speed of light. It is not a covariant gauge, meaning it picks a particular frame in which to define "instantaneous".
The issue with the Feynman gauge is that the propagator is not 100% physical, because of the sign of the pole on the time-time component of the photon propagator. You have to use the fact that charge is conserved to see that the non-physical negative-coefficient-pole states are not real propagating particles. This takes some thinking in the Feynman picture, but is not a problem in the Dirac picture. The equivalence between the two is a path integral exercise in most modern quantum field theory books.
Fermat's principle and Lagrangians
Fermat's principle, as you noted, is not a usual action principle because it doesn't operate at fixed times. The analog of the Fermat principle in mechanics is called the principle of Maupertuis. This says that the classical trajectory is the one which minimizes
$$J = \int p dx = \int \sqrt{2m(E-V(x))} dx$$
between the endpoints. This principle is also timeless, and it can be used to construct an approximate form for the time Fourier transform of the propagator, and this is called the Gutzwiller trace formula.
The Gutzwiller trace formula is the closest thing we have to a proper quantum analog of the Maupertuis principle at this time.
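As a rough illustration of the abbreviated action $J$ above, here is a minimal Python sketch that evaluates $\int \sqrt{2m(E-V(x))}\,dx$ numerically between the turning points of a harmonic oscillator. The function names, the choice $m = \omega = 1$, and the midpoint rule are all assumptions made for the example, not anything from the question. For this potential the integral over half a period is known in closed form, $\pi E/\omega$, so the two printed numbers should agree closely.

```python
import math

def abbreviated_action(potential, energy, x_start, x_end, mass=1.0, steps=200_000):
    """Midpoint-rule estimate of J = integral of sqrt(2 m (E - V(x))) dx."""
    width = (x_end - x_start) / steps
    total = 0.0
    for i in range(steps):
        x = x_start + (i + 0.5) * width
        # Clamp at zero so round-off near a turning point never feeds sqrt a negative number.
        total += math.sqrt(max(0.0, 2.0 * mass * (energy - potential(x))))
    return total * width

def harmonic(x, mass=1.0, omega=1.0):
    """Harmonic oscillator potential V(x) = m omega^2 x^2 / 2 (illustrative choice)."""
    return 0.5 * mass * omega**2 * x**2

energy = 0.5
amplitude = math.sqrt(2.0 * energy)  # turning points at +/- amplitude when m = omega = 1
J = abbreviated_action(harmonic, energy, -amplitude, amplitude)
print(J, math.pi * energy)  # numerical estimate vs. closed form pi * E / omega
```

Note that nothing in the computation refers to time: only the energy and the endpoints enter, which is the sense in which the principle is timeless.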
Lagrangian for light
The analog of the Lagrangian principle for light is just the principle that light travels along paths that minimize proper time, with the additional constraint that these proper times are zero.
The action for a massive particle is
$ m\int ds = m\int \sqrt{1-v^2} dt$
but this is useless for massless particles. The proper transformation which gives a massless particle propagator is worked out in the early parts of Polyakov's "Gauge Fields and Strings" as a warm-up to the analogous problem for string theory. The answer is:
$ S= \int \left({\dot{x}^2\over 2} + m^2\right) ds$
The equivalence between this form and the previous one is actually sort of obvious in Euclidean space: by the central limit theorem, you must get falling Gaussians with a steady decay rate. Polyakov works it out carefully because the analogous manipulations in string theory are not obvious at all.
The second form is not singular as m goes to zero, and gives the proper massless propagator. Transitioning between the two introduces an "einbein" along the path, a metric tensor in one dimension.
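A sketch of how the einbein connects the two forms, in one common normalization (this normalization, the parameter $\tau$, and the symbol $e(\tau)$ for the einbein are my choices, and may differ from Polyakov's by a rescaling of $e$ and by signature-dependent signs). Start from

$$S[x,e] = {1\over 2}\int \left({\dot{x}^2 \over e} + e\, m^2\right) d\tau$$

Varying with respect to $e$ gives an algebraic equation:

$$-{\dot{x}^2 \over e^2} + m^2 = 0 \quad\Rightarrow\quad e = {\sqrt{\dot{x}^2} \over m}$$

Substituting this back, each of the two terms contributes $m\sqrt{\dot{x}^2}$, so

$$S = m\int \sqrt{\dot{x}^2}\, d\tau = m\int ds$$

which is the first form. Fixing the einbein to a constant instead gives a quadratic action of the second type, up to the normalization of the $m^2$ term. At $m = 0$ the action $S[x,e]$ stays finite, and the $e$ equation reduces to $\dot{x}^2 = 0$, which in Lorentzian signature is the statement that the path is null.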
There is no experiment in which genuine information could be sent faster than light and there is no contradiction between this fact and quantum mechanics – as built by the Copenhagen school. Quite the contrary, the proper, Copenhagen-like interpretation of quantum mechanics is needed for a description of known experiments that is compatible with special relativity and its most general consequences, locality and causality.
You would have to describe your experiment in detail if you wanted the interference and its disappearance to be discussed seriously.
However, quite generally, if there are entangled pairs produced, a single particle from this pair won't contribute to an interference pattern by itself. (A typical example is an entangled electron-photon pair where the electron participates in a double-slit experiment and the photon is used to "look" at the electron. The photon gets entangled with the electron but the electron's own interference pattern disappears.) The interference pattern may only be glimpsed if one compares some appropriate measured properties of both particles in the entangled pair. But that's only possible much later, when these results of measurements are communicated to a single place, and because the comparison occurs much later, it can't be used to transmit any information faster than light.
Best Answer
The paths of the Feynman path integral are not actually taken. The phrase "takes every possible path" is a mangled statement of the mathematical instruction to take the integral of $\exp(-\mathrm{i}S)$ over all possible paths for the action $S$ to get the probability amplitude of something happening. It is a fact of quantum mechanics that this integral computes the correct quantum mechanical amplitude, but the formalism of quantum mechanics never says anything about the particle "taking" these paths, which is in particular absurd because quantum objects are not point particles that have a well-defined path in the first place. So, well, you can say that it "takes" every possible path as long as you don't literally imagine a point particle zipping along each path. Which is what "taking" a path usually means. Which is why this figure of speech does not actually convey any physical insight.
The physical insight lies in understanding how the path integral reproduces the correct quantum mechanical amplitude, which cannot be done on the level of such crude heuristic statements based on classical notions of "path" and "particle". There is no path a quantum particle takes unless you continually track it, and then you'll get a perfectly ordinary classical path (see, for instance, the perfectly normal paths in bubble chambers, where the continual interaction with the bubble chamber effectively tracks the particle).
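To make "integrate over all paths" slightly more concrete, here is a minimal Python sketch of the simplest possible case: a free particle in imaginary (Euclidean) time with $m = \hbar = 1$, where the oscillating phase becomes a Gaussian weight. All names and parameters are assumptions chosen for the illustration. Integrating over a single intermediate position is a one-slice version of the sum over paths, and it reproduces the kernel for the total time; no individual intermediate position is singled out as the one the particle "went through".

```python
import math

def kernel(x, t):
    """Free-particle Euclidean kernel for displacement x in imaginary time t (m = hbar = 1)."""
    return math.exp(-x * x / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)

def compose(x_final, t1, t2, half_width=12.0, steps=4000):
    """Integrate over the intermediate position y: start at 0, reach y after t1, x_final after t2."""
    width = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        y = -half_width + (i + 0.5) * width  # midpoint of the i-th integration cell
        total += kernel(y, t1) * kernel(x_final - y, t2)
    return total * width

two_step = compose(0.7, 0.3, 0.5)
direct = kernel(0.7, 0.3 + 0.5)
print(two_step, direct)  # the two values should agree to high accuracy
```

Repeating this composition over many thin time slices is what the path integral instructs you to do; the real-time version replaces the Gaussian weights with phases and is numerically far less forgiving.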