We could spend forever playing whac-a-mole with all of the confusing/confused statements that continue popping up on this subject, on PhysicsForums and elsewhere. Instead of doing that, I'll offer a general perspective that, for me at least, has been refreshingly clarifying.
I'll start by reviewing a general no-go result, which applies to all relativistic QFTs, not just to photons. Then I'll explain how the analogous question for electrons would be answered, and finally I'll extend the answer to photons. The reason for doing this in that order will probably be clear in hindsight.
A general no-go result
First, here's a review of the fundamental no-go result for relativistic QFT in flat spacetime:
In QFT, observables are associated with regions of spacetime (or just space, in the Schrödinger picture). This association is part of the definition of any given QFT.
In relativistic QFT, the Reeh-Schlieder theorem implies that an observable localized in a bounded region of spacetime cannot annihilate the vacuum state. Intuitively, this is because the vacuum state is entangled with respect to location.
Particles are defined relative to the vacuum state. By definition, the vacuum state has zero particles, so the Reeh-Schlieder theorem implies that an observable representing the number of particles in a given bounded region of spacetime cannot exist: if an observable is localized in a bounded region of spacetime, then it can't always register zero particles in the vacuum state.
That's the no-go result, and it's very general. It's not restricted to massless particles or to particles of helicity $\geq 1$. For example, it also applies to electrons. The no-go result says that we can't satisfy both requirements: in relativistic QFT, we can't have a detector that is both perfectly reliable (never registering a particle in the vacuum state) and localized in a strictly bounded region.
But here's the important question: how close can we get to satisfying both of these requirements?
Warm-up: electrons
First consider the QFT of non-interacting electrons, with Lagrangian $L\sim \overline\psi(i\gamma\partial+m)\psi$. The question is about photons, and I'll get to that, but let's start with electrons because then we can use the electron mass $m$ to define a length scale $\hbar/mc$ to which other quantities can be compared.
To construct observables that count electrons, we can use the creation/annihilation operators. We know from QFT 101 how to construct creation/annihilation operators from the Dirac field operators $\psi(x)$, and we know that this relationship is non-local (and non-localizable) because of the function $\omega(\vec p) = (\vec p^2+m^2)^{1/2}$ in the integrand, as promised by Reeh-Schlieder.
However, for electrons with sufficiently low momentum, this function might as well be $\omega\approx m$. If we replace $\omega\to m$ in the integrand, then the relationship between the creation/annihilation operators and the field operators becomes local. Making this replacement changes the model from relativistic to non-relativistic, so the Reeh-Schlieder theorem no longer applies. That's why we can have electron-counting observables that satisfy both of the above requirements in the non-relativistic approximation.
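To make the role of $\omega(\vec p)$ concrete, here is a sketch of the same replacement in the simplest analogous case. It uses a free real scalar field $\phi$ with conjugate momentum $\pi$, in units with $\hbar=c=1$ and one common normalization convention, rather than the Dirac field discussed above. That substitution is my assumption for illustration: the spinor factors in the Dirac case add clutter without changing the part played by $\omega(\vec p)$. The kernels $K_\pm$ are names introduced here, and the integrals are meant in the formal (distributional) sense.

```latex
% Annihilation operator in terms of the fields (free real scalar, \hbar = c = 1):
a(\vec p) = \int d^3x\, e^{-i\vec p\cdot\vec x}\,
  \frac{\omega(\vec p)\,\phi(\vec x) + i\,\pi(\vec x)}{\sqrt{2\,\omega(\vec p)}},
\qquad \omega(\vec p) = (\vec p^2+m^2)^{1/2}.

% Position-space version: the p-dependence of \omega smears the fields over space.
a(\vec x) \equiv \int \frac{d^3p}{(2\pi)^3}\, e^{i\vec p\cdot\vec x}\, a(\vec p)
  = \int d^3y\, \Big[ K_{+}(\vec x-\vec y)\,\phi(\vec y)
                    + i\,K_{-}(\vec x-\vec y)\,\pi(\vec y) \Big],
\qquad
K_{\pm}(\vec r) = \int \frac{d^3p}{(2\pi)^3}\, e^{i\vec p\cdot\vec r}\,
  \frac{\omega(\vec p)^{\pm 1/2}}{\sqrt{2}}.

% Replacing \omega \to m turns both kernels into delta functions:
K_{\pm}(\vec r) \to \frac{m^{\pm 1/2}}{\sqrt 2}\,\delta^3(\vec r)
\quad\Longrightarrow\quad
a(\vec x) \approx \sqrt{\frac{m}{2}}\,\phi(\vec x) + \frac{i}{\sqrt{2m}}\,\pi(\vec x).
```

With the exact $\omega(\vec p)$, the kernels $K_\pm$ have tails that extend over all of space, so $a(\vec x)$ is not built from the fields at $\vec x$ alone. After $\omega\to m$, the relation is pointwise.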
Said another way: Observables associated with mutually spacelike regions are required to commute with each other (the microcausality requirement). The length scale $\hbar/mc$ is the scale over which commutators of our quasi-local detector-observables fall off with increasing spacelike separation. Since the non-zero tails of those commutators fall off exponentially with characteristic length $\hbar/mc$, we won't notice them in experiments that have low energy/low resolution compared to $\hbar/mc$.
Instead of compromising strict localization, we can compromise strict reliability: we can construct observables that are localized in a strictly bounded region and that almost annihilate the vacuum state. Such an observable represents a detector that is slightly noisy. The noise is again negligible for low-resolution detectors, that is, for detector-observables whose localization region is much larger than the scale $\hbar/mc$.
This is why non-relativistic few-particle quantum mechanics works — for electrons.
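To get a feel for how small $\hbar/mc$ is for the electron, here is a minimal numerical sketch. It uses CODATA 2018 constants in SI units. The choice of the Bohr radius as the comparison scale is my own assumption, picked only because it is a typical length in atomic physics, and `tail_suppression` is a hypothetical helper that evaluates a bare exponential $e^{-L/(\hbar/mc)}$ with no prefactor.

```python
import math

# CODATA 2018 values in SI units.
HBAR = 1.054571817e-34           # reduced Planck constant, J*s
M_E = 9.1093837015e-31           # electron mass, kg
C = 299792458.0                  # speed of light, m/s
BOHR_RADIUS = 5.29177210903e-11  # m; comparison scale chosen for illustration


def reduced_compton_wavelength(mass_kg: float) -> float:
    """Return hbar / (m c) in metres."""
    return HBAR / (mass_kg * C)


def tail_suppression(region_size_m: float, mass_kg: float) -> float:
    """Bare exponential exp(-L / (hbar/mc)) for a region of size L (no prefactor)."""
    return math.exp(-region_size_m / reduced_compton_wavelength(mass_kg))


electron_scale = reduced_compton_wavelength(M_E)
ratio = BOHR_RADIUS / electron_scale
suppression = tail_suppression(BOHR_RADIUS, M_E)

if __name__ == "__main__":
    print(f"hbar/mc for the electron: {electron_scale:.4e} m")
    print(f"Bohr radius / (hbar/mc):  {ratio:.1f}")
    print(f"exp(-ratio):              {suppression:.1e}")
```

The ratio comes out as the inverse fine-structure constant, about 137, so an exponential tail with characteristic length $\hbar/mc$ is utterly negligible on atomic scales.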
Photons
Now consider the QFT of the electromagnetic field by itself, which I'll call QEM. All of the observables in this model can be expressed in terms of the electric and magnetic field operators, and again we know from QFT 101 how to construct creation/annihilation operators that define what "photon" means in this model: they are the positive/negative frequency parts of the field operators. This relationship is manifestly non-local. We can see this from the explicit expression, but we can also anticipate it more generally: the definition of positive/negative frequency involves the infinite past/future, and thanks to the time-slice principle, this implies access to arbitrarily large spacelike regions.
In QEM, there is no characteristic scale analogous to $\hbar/mc$, because $m=0$. The ideas used above for electrons still work, except that the deviations from localization and/or reliability don't fall off exponentially with any characteristic scale. They fall off like a power of the distance instead.
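The contrast between exponential and power-law tails can be seen in a toy calculation. The sketch below is an assumption-laden illustration, not the actual QEM commutators: it puts $\omega = (p^2+m^2)^{1/2}$ on a periodic one-dimensional lattice (spacing 1, with $p^2$ replaced by the lattice value $4\sin^2(k/2)$) and computes its position-space kernel $K(x)$ for $m=1$ and for $m=0$. The function names and the lattice size are hypothetical choices made for this example.

```python
import math


def omega_kernel(x: int, mass: float, n_sites: int = 1024) -> float:
    """Position-space kernel of omega = sqrt(p^2 + m^2) on a periodic 1D lattice.

    Lattice momenta are k = 2*pi*j/n_sites, and p^2 is replaced by the
    lattice Laplacian eigenvalue 4*sin(k/2)^2 (lattice spacing = 1).
    The sine part of the Fourier sum cancels by symmetry, so only the
    cosine part is accumulated.
    """
    total = 0.0
    for j in range(n_sites):
        k = 2.0 * math.pi * j / n_sites
        omega = math.sqrt(4.0 * math.sin(k / 2.0) ** 2 + mass ** 2)
        total += omega * math.cos(k * x)
    return total / n_sites


def falloff_ratio(mass: float, x: int = 8) -> float:
    """Return |K(2x) / K(x)|: how much the tail shrinks when x doubles."""
    return abs(omega_kernel(2 * x, mass) / omega_kernel(x, mass))


massive_ratio = falloff_ratio(mass=1.0)   # m = 1 in lattice units
massless_ratio = falloff_ratio(mass=0.0)  # photon-like case, no length scale

if __name__ == "__main__":
    print(f"massive : |K(16)/K(8)| = {massive_ratio:.2e}")
    print(f"massless: |K(16)/K(8)| = {massless_ratio:.3f}")
```

In the massless case, doubling the distance shrinks the tail by roughly a factor of four (a $1/x^2$ law in this one-dimensional toy; the power is different in three dimensions and for the actual field commutators). In the massive case the same doubling shrinks it by several orders of magnitude, which is the signature of exponential decay.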
As far as this question is concerned, that's really the only difference between the electron case and the photon case. That's enough of a difference to prevent us from constructing a model for photons that is analogous to non-relativistic quantum mechanics for electrons, but it's not enough of a difference to prevent photon-detection observables from being both localized and reliable for most practical purposes. The larger we allow its localization region to be, the more reliable (less noisy) a photon detector can be. Our definition of how-good-is-good-enough needs to be based on something else besides QEM itself, because QEM doesn't have any characteristic length-scale of its own. That's not an obstacle to having relatively well-localized photon-observables in practice, because there's more to the real world than QEM.
Position operators
What is a position operator? Nothing that I said above refers to such a thing. Instead, everything I said above was expressed in terms of observables that represent particle detectors (or counters). I did that because the starting point was relativistic QFT, and QFT is expressed in terms of observables that are localized in bounded regions.
Actually, non-relativistic QM can also be expressed that way. Start with the traditional formulation in terms of the position operator $X$. (I'll consider only one dimension for simplicity.) This single operator $X$ is really just a convenient way of packaging-and-labeling a bunch of mutually-commuting projection operators, namely the operators $P(R)$ that project a wavefunction $\Psi(x)$ onto the part with $x\in R$, cutting off the parts with $x\notin R$. In fancy language, the commutative von Neumann algebra generated by $X$ is the same as the commutative von Neumann algebra generated by all of the $P(R)$s, so aside from how things are labeled with "eigenvalues," they both represent the same observable as far as Born's rule is concerned. If we look at how non-relativistic QM is derived from its relativistic roots, we see that the $P(R)$s are localized within the region $R$ by QFT's definition of "localized" — at least insofar as the non-relativistic approximation is valid. In this sense, non-relativistic single-particle QM is, like QFT, expressed in terms of observables associated with bounded regions of space. The traditional formulation of single-particle QM obscures this.
Here's the point: when we talk about a position operator for an electron in a non-relativistic model, we're implicitly talking about the projection operators $P(R)$, which are associated with bounded regions of space. The position operator $X$ is a neat way of packaging all of those projection operators and labeling them with a convenient spatial coordinate, so that we can use concise statistics like means and standard deviations, but you can't have $X$ without also having the projection operators $P(R)$, because the existence of the former implies the existence of the latter (through the spectral theorem, or through the von-Neumann-algebra fanciness that I mentioned above).
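The packaging of the $P(R)$s into $X$ can be sketched in a few lines. This is a toy model under my own assumptions: space is replaced by a five-point grid, a wavefunction is a list of complex amplitudes on that grid, and all function names are hypothetical. It shows that the projections are idempotent and commute with each other, and that the mean of $X$ can be recovered from the single-point projections alone.

```python
from typing import Callable, List

# Grid of positions for a 1D toy model (hypothetical discretization).
GRID: List[float] = [-2.0, -1.0, 0.0, 1.0, 2.0]
Wavefunction = List[complex]
Region = Callable[[float], bool]


def project(psi: Wavefunction, in_region: Region) -> Wavefunction:
    """Apply P(R): keep amplitudes with x in R, zero out the rest."""
    return [amp if in_region(x) else 0j for x, amp in zip(GRID, psi)]


def probability(psi: Wavefunction, in_region: Region) -> float:
    """Born-rule probability <psi|P(R)|psi> for a normalized psi."""
    return sum(abs(amp) ** 2 for amp in project(psi, in_region))


def mean_from_projections(psi: Wavefunction) -> float:
    """<X> built only from single-point projections: sum of x * Prob(x)."""
    return sum(x * probability(psi, lambda y, x=x: y == x) for x in GRID)


def mean_from_x_operator(psi: Wavefunction) -> float:
    """<X> computed directly as <psi| X |psi>, with X acting as multiplication by x."""
    return sum((amp.conjugate() * x * amp).real for x, amp in zip(GRID, psi))


def left(x: float) -> bool:
    return x < 0


def middle(x: float) -> bool:
    return -1.5 < x < 1.5


# A normalized example state: four amplitudes of modulus 1/2, one zero.
psi: Wavefunction = [0.5 + 0j, 0.5j, 0.5 + 0j, -0.5 + 0j, 0j]

if __name__ == "__main__":
    print("P(left) then P(middle):", project(project(psi, left), middle))
    print("P(middle) then P(left):", project(project(psi, middle), left))
    print("Prob(left):", probability(psi, left))
    print("<X> from projections:", mean_from_projections(psi))
    print("<X> from X directly: ", mean_from_x_operator(psi))
```

Nothing in `mean_from_projections` refers to $X$ itself, which is the sense in which $X$ is only a labeling of the projections.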
So... can a photon have a position operator? If by position operator we mean something like the projection operators $P(R)$, which are both (1) localized in a strictly bounded region and (2) strictly reliable as "detectors" of things in that region, then the answer is no. A photon can't have a position operator for the same reason that a photon can't have a non-relativistic approximation: for a photon, there is no characteristic length scale analogous to $\hbar/mc$ to which the size of a localization region can be compared, without referring to something other than the electromagnetic field itself. What we can do is use the usual photon creation/annihilation operators to construct photon-detecting/counting observables that are not strictly localized in any bounded region but whose "tails" are negligible compared to anything else that we care about (outside of QEM), if the quasi-localization region is large enough.
What is a physical consequence?
What is a physical consequence of the non-existence of a strict position operator? Real localized detectors are necessarily noisy. The more localized they are, the noisier they must be. Reeh-Schlieder guarantees this, both for electrons and for photons, the main difference being that for electrons, the effect decreases exponentially as the size of the localization region is increased. For photons, it decreases only like a power of the size.
Best Answer
The relation $p={h\over \lambda}$ applies to photons, and it has nothing to do with the uncertainty principle. The issue is localizing the photons: finding out where they are at any given time.
The position operator for a photon is not well defined in any usual sense, because the photon position does not evolve causally: the photon can go back in time. The same issue occurs with any relativistic particle when you try to localize it in a region smaller than its Compton wavelength. The Schrödinger position representation is only valid for nonrelativistic massive particles.
There are two resolutions to this, which are complementary. The standard way out is to talk about quantum fields, and deal with photons as excitations of the quantum field. Then you never talk about localizing photons in space.
The second method is to redefine the position of a photon in space-time rather than in space at one time, and to define the photon trajectory as a sum over forward and backward in time paths. This definition is fine in perturbation theory, where it is an interpretation of Feynman's diagrams, but it is not clear that it is completely correct outside of perturbation theory. I tend to think it is fine outside of perturbation theory too, but others disagree, and the precise nonperturbative particle formalism is not completely worked out anywhere, and it is not certain that it is fully consistent (but I believe it is).
In the perturbative formalism, to create a space-time localized photon with polarization $\epsilon$, you apply the free photon field operator $\epsilon\cdot A$ at a given space-time point. The propagator is then the sum over all space-time paths of a particle action. This coincidence between two-point functions and particle paths is the Schwinger representation of Feynman's propagator, and it is also implicit in Feynman's original work. This point of view is downplayed in quantum field theory books, which tend to emphasize the field point of view.