Yes, you have the right idea. You will want to learn about Fourier analysis, which lets you take a complicated-looking waveform like your third figure and analyze it to say "this is two sine waves, frequencies 1 and 5, equal amplitudes, zero relative phase."
I like to think of a piano as an inverse Fourier transform machine: you push the keys to tell the piano "please generate frequencies C, E, and G, with the C having larger amplitude than the others" and the piano makes the air vibrate for you. Your auditory system then does the ordinary Fourier transformation: with ear training, you can take those vibrations and say "Oh, a major triad, with a strong root."
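To make the "analyze it into two sine waves" step concrete, here is a minimal sketch in plain Python. It assumes a one-second signal sampled 64 times (the sample count and the helper name `dft_components` are my own choices for illustration, not anything from your setup) and computes a direct discrete Fourier transform at a handful of integer frequencies.

```python
import cmath
import math

def dft_components(samples, max_freq):
    """Amplitude and phase at integer frequencies 0..max_freq.

    Assumes `samples` covers exactly one second, so bin k corresponds to k Hz.
    """
    n = len(samples)
    result = {}
    for k in range(max_freq + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        scale = 1 if k == 0 else 2  # fold negative frequencies into positive ones
        result[k] = (scale * abs(coeff) / n, cmath.phase(coeff))
    return result

SAMPLES_PER_SECOND = 64  # assumed; comfortably above twice the highest frequency
signal = [math.sin(2 * math.pi * 1 * i / SAMPLES_PER_SECOND)
          + math.sin(2 * math.pi * 5 * i / SAMPLES_PER_SECOND)
          for i in range(SAMPLES_PER_SECOND)]

spectrum = dft_components(signal, max_freq=8)

# Keep only the frequencies that actually carry energy.
components = {k: round(amp, 6) for k, (amp, _) in spectrum.items() if amp > 1e-9}
relative_phase = spectrum[5][1] - spectrum[1][1]

print(components)
print(relative_phase)
```

Under these assumptions the only surviving entries should be frequencies 1 and 5 with amplitudes close to 1, and the relative phase should be close to zero, which is the "two sine waves, equal amplitudes, zero relative phase" reading described above.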
The metric reported by a sound level meter is known as the 'equivalent sound pressure level', or $L_{eq}$. It is the time average of the squared sound pressure over the measurement period, taken relative to the squared reference pressure and expressed in decibels:
$$L_{eq} = 10 \log_{10} \left( \frac{1}{T} \int_0^T \frac{p^2(t)}{p_0^2} \, dt \right)$$
where $T$ is the measurement duration, $p(t)$ is the sound pressure value at time $t$, and $p_0$ is the reference pressure in air, $2 \times 10^{-5}\ \text{Pa}$.
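As a rough illustration of the formula, the sketch below approximates the integral with a mean over uniformly spaced pressure samples. The function name `leq_db`, the 48 kHz sample rate, and the 1 kHz test tone are my own assumptions; a real meter also applies frequency and time weightings that are ignored here.

```python
import math

P_REF = 2e-5  # reference pressure in air, Pa

def leq_db(pressures_pa):
    """Equivalent sound pressure level of uniformly sampled pressures (in Pa).

    The mean of the squared samples stands in for (1/T) * integral of p^2(t) dt.
    """
    mean_square = sum(p * p for p in pressures_pa) / len(pressures_pa)
    return 10 * math.log10(mean_square / P_REF ** 2)

# Hypothetical test signal: one second of a 1 kHz sine with 1 Pa RMS
# (peak amplitude sqrt(2) Pa), sampled at an assumed 48 kHz.
SAMPLE_RATE = 48000
tone = [math.sqrt(2) * math.sin(2 * math.pi * 1000 * i / SAMPLE_RATE)
        for i in range(SAMPLE_RATE)]

print(round(leq_db(tone), 1))
```

A 1 Pa RMS tone should come out near 94 dB, which is the familiar calibrator level.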
Note that this measure is based on RMS acoustic pressure, not acoustic intensity, which is "the power carried by sound waves per unit area in a direction perpendicular to that area". To measure intensity you need an intensity probe (see pg. 13 of linked pdf).
Let's clarify one point before we continue. When you say:
> the amplitude was set at a constant

I understand that to mean that the peak amplitude of the waveform was constant.
Given constant peak amplitude, and given that we know the shape of $p(t)$ for each waveform, we can immediately intuit which one will have the highest value of $L_{eq}$.
![5 Hz waveforms](https://i.stack.imgur.com/2Qm6U.png)
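Square > Sine > Sawtooth. It's easy to see that the square wave spends almost all of its time at the highest pressure magnitude. The sine wave flattens out near its peaks and so lingers at high magnitudes, whereas the sawtooth sweeps through its extremes at a constant rate, so the sine ends up with the larger mean squared pressure.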
Using Adobe Audition, we can generate all three signals with identical peak amplitudes. I chose -12.0 dBFS. Then we can use the 'Amplitude Statistics' tool to report Total RMS Amplitude, and unsurprisingly we find:
Square (-12.0 dBFS), Sine (-15.0 dBFS), Sawtooth (-16.8 dBFS)
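You can reproduce those RMS figures without Audition. The sketch below is my own check, not the tool's algorithm: it samples one period of each ideal waveform at a peak of -12.0 dBFS and assumes that 0 dBFS corresponds to an RMS of 1.0 (a full-scale square wave), which is the convention the numbers above appear to follow.

```python
import math

PEAK_DBFS = -12.0
PEAK = 10 ** (PEAK_DBFS / 20)  # linear peak amplitude
N = 10000                      # samples per period (assumed, for the numeric mean)

# Ideal unit-peak waveforms as functions of phase in [0, 1).
def square(phase):
    return 1.0 if phase < 0.5 else -1.0

def sine(phase):
    return math.sin(2 * math.pi * phase)

def sawtooth(phase):
    return 2 * phase - 1

def rms_dbfs(shape):
    """RMS level in dBFS of one period of `shape`, scaled to PEAK."""
    mean_square = sum((PEAK * shape((i + 0.5) / N)) ** 2 for i in range(N)) / N
    return 10 * math.log10(mean_square)

levels = {fn.__name__: round(rms_dbfs(fn), 1) for fn in (square, sine, sawtooth)}
print(levels)
```

The gaps match the closed-form results: the RMS of a sine is its peak divided by $\sqrt{2}$ (about 3.0 dB below peak) and the RMS of a sawtooth is its peak divided by $\sqrt{3}$ (about 4.8 dB below peak).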
Your sound level meter application gave the same order:
Square (95 dB), Sine (94 dB), Sawtooth (89 dB)
The beats are audible at lower frequencies because your ears do in fact pick up phase information, but only at these lower frequencies.
When a sound enters our ear, we amplify it via mechanical oscillations of bones and hydraulic effects, ultimately causing vibration in a thin film in our inner ear called the basilar membrane. Different sections of the basilar membrane vibrate in response to different tones. The basilar membrane is connected to thousands of small hairs, themselves connected to mechanically sensitive ion gates. Oscillations of these hairs trigger the ion gates, which send electrical impulses down neurons to our brains.
Empirically, it is observed that these nerve impulses almost always begin at the peak amplitude of a vibration of the basilar membrane. Thus, if our two ears receive sound with different phase, they will fire nerve impulses at different times, and our brains will have access to phase information.
An interesting demonstration of this was given by Lord Rayleigh in 1907. He theorized that phase difference detection between the ears was a key component of our ability to localize sound. When Rayleigh played two tuning forks that were slightly out of tune, so that their relative phase drifted continuously, he found that human perception of the location of the sound oscillated from the left to the right of the listener's head.
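To see why slightly detuned forks produce a slowly cycling phase difference, here is a small sketch. The fork frequencies (256.0 Hz and 256.5 Hz) are hypothetical values I picked for illustration, not figures from the experiment, and the sketch assumes each ear mainly hears one fork. It only tracks the relative phase; how the brain maps that phase onto a perceived direction is not modeled.

```python
import math

def phase_difference(f_left, f_right, t):
    """Phase of the right tone relative to the left at time t (seconds).

    Returned in radians, wrapped to [-pi, pi). Both tones are assumed to
    start in phase at t = 0.
    """
    raw = 2 * math.pi * (f_right - f_left) * t
    return (raw + math.pi) % (2 * math.pi) - math.pi

F_LEFT = 256.0   # Hz, hypothetical fork
F_RIGHT = 256.5  # Hz, hypothetical slightly sharp fork

cycle_seconds = 1 / abs(F_RIGHT - F_LEFT)  # time for the phase to drift through 2*pi
print(cycle_seconds)

for t in (0.0, 0.5, 1.0, 1.5, 2.0):
    print(t, round(phase_difference(F_LEFT, F_RIGHT, t), 3))
```

With a 0.5 Hz mismatch the relative phase sweeps through a full cycle every 2 seconds, changing sign halfway through, which is consistent with the apparent source drifting from one side to the other and back.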
At high frequencies, we lose phase information. This is because of uncertainties in the exact time of arrival of a nerve impulse. A typical nerve impulse lasts several milliseconds, so above 1000 Hz, where one period of the sound is 1 ms or less, the uncertainty in arrival time becomes comparable to the period of the wave itself, meaning we lose phase information. It turns out that we mostly lose the ability to localize sound in the range 1000 - 3000 Hz. Above 3000 Hz, different physiological mechanisms related to the "shadow" of your head allow us to localize sound again.
Reference:
http://en.wikipedia.org/wiki/Action_potential
The information about Rayleigh's experiment and firing at the peak of oscillations is from chapter 5 of "The Science of Sound" by Rossing, Wheeler, and Moore.