I would boldly claim that this thought experiment (also known as the Heisenberg microscope) is simply the wrong picture for understanding the origin of the uncertainty principle. The reason is that it conflates uncertainty due to measurement with uncertainty due to the quantum state itself; nonetheless it has made its way into numerous textbooks and confused numerous undergraduates (including me) by involving quantum mechanical objects such as electrons and photons and giving results that have the factor $\hbar$ in them.
I will try to answer your questions about this confusing business to the best of my abilities, in three parts: firstly, what is the Heisenberg uncertainty principle; secondly, why is it unique to quantum mechanics; and finally, why the Heisenberg microscope is the wrong way of understanding the uncertainty principle. I am sorry that I may have to include a bit of maths from time to time, but I hope you will follow (and I hope I am right about this - do comment if I have made mistakes).
Firstly, what is the uncertainty principle? The best way I know of to understand it physically is the following scenario: imagine that you have prepared a huge number of identical quantum states, and you measure the position of half of them and the momentum of the rest with perfect precision (see below). At the end of the day you will obtain a list of positions and a list of momenta, and you will notice that these results have a spread, i.e. uncertainties, due to the probabilistic nature of quantum states.
Here is where the uncertainty principle kicks in: regardless of what quantum state you prepared in the first place, if you calculate the uncertainties of position and momentum (the standard deviations $\Delta x$ and $\Delta p$) from the data in that long list, they will always obey the uncertainty principle $\Delta x \Delta p \geq \hbar/2$. A more interesting way of phrasing it: you can never prepare a quantum state for which the product $\Delta x \Delta p$ calculated from the list is smaller than $\hbar/2$.
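To make this concrete, here is a minimal numerical sketch of the "long list" in the limit of infinitely many measurements: instead of sampling, it computes $\Delta x$ and $\Delta p$ directly from a wavefunction on a grid. Everything in it is my own illustrative choice rather than anything from the question: the units ($\hbar = 1$), the two example states, the grid, and the function name `uncertainty_product`. It only handles real wavefunctions, for which $\langle p \rangle = 0$.

```python
import math

HBAR = 1.0  # natural units; an assumption made purely for illustration


def uncertainty_product(psi, x_min, x_max, n=4001):
    """Return (delta_x, delta_p) for a REAL wavefunction psi(x) sampled on a grid.

    psi does not need to be normalised, but it should be negligible at the
    grid edges, because the integrals are plain Riemann sums.
    """
    h = (x_max - x_min) / (n - 1)
    xs = [x_min + i * h for i in range(n)]
    vals = [psi(x) for x in xs]
    norm = sum(v * v for v in vals) * h
    mean_x = sum(x * v * v for x, v in zip(xs, vals)) * h / norm
    mean_x2 = sum(x * x * v * v for x, v in zip(xs, vals)) * h / norm
    # For a real psi, <p> = 0 and <p^2> = hbar^2 * integral of (dpsi/dx)^2.
    dpsi = [(vals[i + 1] - vals[i - 1]) / (2 * h) for i in range(1, n - 1)]
    mean_p2 = HBAR ** 2 * sum(d * d for d in dpsi) * h / norm
    return math.sqrt(mean_x2 - mean_x ** 2), math.sqrt(mean_p2)


# A Gaussian state, and a state with a node at the origin.
gauss = uncertainty_product(lambda x: math.exp(-x * x / 4.0), -12.0, 12.0)
excited = uncertainty_product(lambda x: x * math.exp(-x * x / 4.0), -12.0, 12.0)

for name, (dx, dp) in (("gaussian", gauss), ("one node", excited)):
    print(f"{name}: dx*dp = {dx * dp / HBAR:.4f} hbar")
```

Up to small discretisation error, the Gaussian sits right at the bound $\hbar/2$ and the state with a node comes out at about $3\hbar/2$. No measuring apparatus appears anywhere in the calculation.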
Before moving on, it is worth discussing a few things about this imaginary scenario. The first is obviously what I mean by the phrase 'with perfect precision'. I certainly do not mean that there is some 'position' and 'momentum' that the quantum state has prior to measurement; what I mean is that the measurement results are completely due to the quantum states themselves, and are subject to no external disturbance by other physical systems. You may argue that this is physically impossible, since any experimental apparatus would introduce some perturbation of the system, but since we are living in an imaginary thought experiment, we get to decide what we can and can't do.
And here is the point that is very important in the context of the problem: even in this ideal world, where we can obtain positions and momenta directly from the quantum states, the uncertainty principle still holds. Throw away apparatus like the microscope or any other fancy equipment, and you still have uncertainty - this property is fundamentally due to the nature of quantum states themselves.
Still need convincing that this is justified? Here we enter the second part, on uncertainty due to measurement. Look back at any experiment with classical systems - you will almost certainly find none with 0% uncertainty, as there are bound to be errors introduced by the environment; nonetheless that doesn't stop us from imagining a perfect experiment where the results are completely due to the physical system we are studying. Say you are measuring the acceleration due to gravity in a lab - you can be certain that almost nobody will ever get exactly $9.80665$ metres per second squared (unless they are cheating) because of errors due to gravitational attraction from surrounding objects, the markings on your ruler not being fine enough, etc. But you have no problem convincing yourself that under perfect, ideal conditions you would still be able to get $9.80665$.
And the crux of the matter is that uncertainty due to the environment (or errors) happens to all systems, be they classical or quantum. The uncertainty principle, however, only applies to quantum systems. In Newtonian mechanics, you can characterise the motion of a particle in one dimension by a pair of quantities $(x,p)$, position and momentum, and, following the experimental procedure described in part 1 (preparing a huge number of identical states and measuring their positions and momenta), you can arrange that $\Delta x \Delta p < \hbar/2$. In fact, it is very easy: by preparing a bunch of particles with the same position and momentum, $\Delta x = \Delta p = 0$. But in quantum mechanics it simply cannot be done, because we are talking about an entirely different beast here: instead of $(x,p)$, you need to describe a quantum state with a wavefunction $|\psi\rangle$, and every such state obeys the uncertainty principle.
So here we are at the third part - why the Heisenberg microscope is the wrong picture for understanding the origin of the uncertainty principle. I suspect that you can already answer that: the thought experiment basically attributes the origin of the uncertainty principle to error introduced in the experiment, not to the quantum state itself. According to the Heisenberg microscope picture, there would be no uncertainty in a perfect experiment; we could even imagine measuring the position and momentum of the electron using other methods - say shooting electrons from a gun and bouncing them off a wall (maybe?) - that would give uncertainties below the uncertainty-principle bound. But this is simply not the case and you simply can't do that, because the state is described by a wavefunction $|\psi \rangle$, not a pair $(x,p)$, so it is simply wrong to use '$x$' or '$p$' to describe the electron.
This also leads to the complication about interactions that you mentioned in your question. The interaction between photon and electron cannot simply be described by 'momentum transfer', for this implicitly assumes that the physical states of photons and electrons are characterised by definite momenta. As stated before, the interaction can only be described in terms of $|\psi\rangle$; and to be absolutely strict, the best way of understanding such an interaction is through QED rather than this semi-classical picture. Nonetheless, let me reiterate: remove the interaction (whether it is the photon-electron interaction or whatever physical process you use to probe the electron) and you still have the uncertainty principle, because it is a fundamental property of a quantum state.
Regardless, I suspect the reason the Heisenberg microscope is so successful is that it mixes quantum mechanical interactions between electrons and photons with classical interpretations to give results involving the infamous $\hbar$, simply by manipulating the errors introduced in an experiment. This gives us the illusion that we can intuitively understand the uncertainty principle, which is simply not the case. I feel it's fitting to use this (mis)quote - certainly quantum mechanics has never allowed herself to be won; and at present every kind of intuition stands with sad and discouraged mien -- IF, indeed, it stands at all! - but this is, I guess, why we love quantum mechanics so much :)
The best intuitive analogy I've heard is with classical sound waves. Consider a musical instrument playing a pure sine wave of frequency $\nu$ and amplitude $A$, and no other harmonic frequencies at all. Graphing this in frequency-amplitude space ($x$-axis=frequency, $y$=amplitude) gives you a $\delta$-function-like point function with value $y=A$ at $x=\nu$, and zero everywhere else. That represents your exact knowledge of the note's frequency.
But at what time was the note played? A pure sine wave extends over all time, $-\infty<t<\infty$. Any attempt to play a shorter note necessarily introduces additional components/harmonics in its Fourier decomposition. And the shorter the interval $t_0<t<t_1$ you want, the broader your frequency spectrum has to become. Indeed, imagine an instantaneous sound. Neither your ear, nor any apparatus, can say anything about its frequency at all -- you'd have to sense some finite portion of the waveform to analyse its shape/components, but "instantaneous" precludes that.
So, you can't simultaneously know both a note's frequency and the time it's played, due to the Fourier conjugate nature of frequency/time. The better you know one, the worse you know the other. And, as @annav mentioned, that's analogous to the nature of conjugate quantum observables.
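The trade-off can be checked numerically. The sketch below is only an illustration under my own assumptions: a 100 Hz tone, three arbitrary durations, and "spectral width" defined as the full width at half maximum of the spectral peak (other definitions give a different constant). The function names are hypothetical. It evaluates the Fourier integral of a truncated sine directly and reports width times duration.

```python
import cmath
import math


def spectrum_magnitude(f, f0, duration, n=1500):
    """|Fourier transform| at frequency f of sin(2*pi*f0*t) played for 0 <= t < duration."""
    dt = duration / n
    total = 0j
    for k in range(n):
        t = (k + 0.5) * dt  # midpoint rule
        total += math.sin(2 * math.pi * f0 * t) * cmath.exp(-2j * math.pi * f * t)
    return abs(total) * dt


def half_max_width(f0, duration, steps=200):
    """Full width at half maximum of the spectral peak near f0, found by a coarse scan."""
    span = 2.0 / duration  # scan out to roughly the second null on either side of the peak
    freqs = [f0 - span + 2 * span * i / steps for i in range(steps + 1)]
    mags = [spectrum_magnitude(f, f0, duration) for f in freqs]
    peak = max(mags)
    above = [f for f, m in zip(freqs, mags) if m >= peak / 2]
    return max(above) - min(above)


TONE = 100.0  # Hz; an arbitrary illustrative choice
widths = {T: half_max_width(TONE, T) for T in (0.1, 0.2, 0.4)}
for T, w in widths.items():
    print(f"duration {T:.1f} s: width {w:6.2f} Hz, width * duration = {w * T:.2f}")
```

Each time the note is made twice as long the peak becomes about half as wide, so the product stays roughly constant (about 1.2 with this particular definition of width). You can trade one for the other, but you cannot make both small.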
Edit:
to address @sanchises' remark about some "crude MSPaint drawings"...
For simplicity (i.e., my own simplicity generating the following "crude drawings"), I'm illustrating an almost-square wave below, rather than a sine wave. Suppose you wanted to produce a sound wave with a one-cycle duration, looking something like,
So the "tails" are zero in both directions, indicating the sound's finite duration. But if we try generating that with just two Fourier components, we can't get those zero-tails. Instead, it looks like,
As you see, we can't "localize" the sound's duration with just two frequencies. To get a better approximation, four components looks like,
And that still fails to accomplish much by way of "localization".
Next, eight components looks like,
And that's beginning to exhibit the behavior we're looking for.
Sixteen looks like,
And I could go on. The initial illustration above was generated with 99 components, and looks pretty much like the intended square wave.
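For anyone who wants numbers rather than pictures, here is a small stand-alone sketch of the same idea. It is not the program that produced the drawings: the pulse shape, the interval $[0,1]$, the sine-series basis and all the names are assumptions I chose for illustration. It approximates a one-cycle square pulse with a given number of Fourier sine components and reports how far the result is from the target, and how large the supposedly silent "tails" still are.

```python
import math

M = 1000  # number of grid points on [0, 1]
XS = [(k + 0.5) / M for k in range(M)]


def pulse(x):
    """One-cycle square pulse: +1, then -1, with zero tails on both sides (illustrative shape)."""
    if 0.3 <= x < 0.5:
        return 1.0
    if 0.5 <= x < 0.7:
        return -1.0
    return 0.0


TARGET = [pulse(x) for x in XS]


def sine_coefficient(n):
    """b_n = 2 * integral over [0, 1] of pulse(x) * sin(n*pi*x), by the midpoint rule."""
    return 2.0 * sum(f * math.sin(n * math.pi * x) for f, x in zip(TARGET, XS)) / M


def partial_sum(ncoefs):
    """Evaluate the first `ncoefs` sine components on the grid."""
    coefs = [sine_coefficient(n) for n in range(1, ncoefs + 1)]
    return [
        sum(b * math.sin(n * math.pi * x) for n, b in enumerate(coefs, start=1))
        for x in XS
    ]


def summarize(ncoefs):
    """Return (rms error over [0, 1], largest amplitude left in the tails)."""
    approx = partial_sum(ncoefs)
    rms_error = math.sqrt(sum((a - f) ** 2 for a, f in zip(approx, TARGET)) / M)
    tail = max(abs(a) for a, x in zip(approx, XS) if x < 0.2 or x > 0.8)
    return rms_error, tail


RESULTS = {n: summarize(n) for n in (2, 4, 8, 16, 99)}
for n, (err, tail) in RESULTS.items():
    print(f"{n:3d} components: rms error {err:.3f}, largest tail amplitude {tail:.3f}")
```

The error shrinks as components are added, and with 99 of them the tails are close to zero. Confining the sound to a short interval really does cost a wide range of frequencies.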
Comment:
you guys coincidentally stepped into one of my little programs when mentioning drawings. See http://www.forkosh.com/onedwaveeq.html for a discussion, although not about uncertainty. To get the above illustrations, I used the following parameters in that "Solver Box" at top,
nrows=100&ncols=256&ncoefs=99&fgblue=135&f=0,0,0,0,0,0,1,1,1,1,1,-1,-1,-1,-1,-1,0,0,0,0,0,0,0&gtimestep=1&bigf=1
Just change the ncoefs=99 to generate the corresponding drawings above.
Best Answer
Heisenberg's uncertainty principle is
$$\Delta x \Delta p \geq \hbar/2.$$
Since the well is of width $L$, you have a measure for the uncertainty in the position, $\Delta x \approx L$. Then assume the lowest possible value for $\Delta p$, i.e. the one for which the above inequality becomes an equality. Lastly, use $E = \dfrac{p^2}{2m}$, with $p$ taken to be of order $\Delta p$, to find an expression for $E$.
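As a rough sketch of where these steps lead, assuming $\Delta x \approx L$ and treating the particle's momentum as being of order $\Delta p$ (both are order-of-magnitude assumptions, not exact statements):

$$\Delta p \approx \frac{\hbar}{2\Delta x} = \frac{\hbar}{2L}, \qquad E \approx \frac{(\Delta p)^2}{2m} = \frac{\hbar^2}{8mL^2}.$$

For comparison, if the well is the infinite square well, the exact ground-state energy is $E_1 = \pi^2\hbar^2/(2mL^2)$, so the estimate gets the scaling with $m$ and $L$ right but not the numerical prefactor.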
A useful question to look at as well might be this one.