It isn't possible to create an audio source in mid-air using the method you've described. This is because the two ultrasonic waves would create an audible source if the listener were standing at that spot, but those waves would continue to propagate in the same direction afterwards. You would need, as I point out below, some sort of medium which scattered the waves in all directions to make it seem as if the sound were coming from the point at which you interfered the two waves.
It is possible, however, to make the user perceive the sound as coming from a specific location, but it isn't as easy as the author makes it seem. I can think of two different ways. First of all, as described by @reirab, you can get audio frequencies by interfering two sound waves of high frequency. When they interfere, they generate a beat note whose frequency is the difference between the two frequencies. For example, if you send a sound beam with frequency $f_1=200\ \text{kHz}$ and another beam with $f_2=210\ \text{kHz}$, the frequency heard in the region where they combine will be $\Delta f=f_2-f_1=10\ \text{kHz}$, which is in the audio band of humans.
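As a rough sketch of the beat-note arithmetic, the snippet below computes the difference frequency and checks the trigonometric identity behind it: the sum of two equal-amplitude tones equals a carrier at the mean frequency multiplied by a slow envelope at half the difference frequency. The function names are made up for illustration, and the model assumes two ideal sine waves of equal amplitude.

```python
import math


def beat_frequency(f1_hz, f2_hz):
    """Difference (beat) frequency of two tones, in Hz."""
    return abs(f2_hz - f1_hz)


def two_tone_sum(f1_hz, f2_hz, t):
    """Sum of two equal-amplitude sine waves at time t (seconds)."""
    return math.sin(2 * math.pi * f1_hz * t) + math.sin(2 * math.pi * f2_hz * t)


def carrier_times_envelope(f1_hz, f2_hz, t):
    """Same signal written as a fast carrier times a slow envelope.

    sin(A) + sin(B) = 2 sin((A + B) / 2) cos((A - B) / 2)
    """
    mean_hz = 0.5 * (f1_hz + f2_hz)
    half_diff_hz = 0.5 * (f2_hz - f1_hz)
    carrier = math.sin(2 * math.pi * mean_hz * t)
    envelope = 2 * math.cos(2 * math.pi * half_diff_hz * t)
    return carrier * envelope


print(beat_frequency(200e3, 210e3))  # difference frequency in Hz
```

The loudness follows the magnitude of the envelope, which repeats at the full difference frequency, so the 200 kHz and 210 kHz beams give a 10 kHz beat.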
There is an additional difficulty. You will need the sound to come out in a well-defined, narrow (collimated) beam, and this is not terribly easy to do. A typical speaker emits sound in all directions. There are many techniques for generating such beams, but one is to use a phased array.
How can you use this to make a person perceive the sound as coming from a specific point?
Sending Two Different Volumes to the Two Ears
What does it mean to perceive sound as coming from a specific location? Our ears are just microphones with cones which accept sound mostly from one direction (excepting low frequencies). A large part of the way we determine where the sound came from is just the relative volume in our two ears. So, you could use the interference effect described above with beams which are narrow enough that you can target each ear. By using two separate sets of beams targeting each ear with different volumes, you could make the person perceive the sound as coming from a specific location; at least as well as a 3D movie makes a person perceive images in 3D.
Hitting a Material Which Scattered the Sound Isotropically
The second method is to use the same interference effect, but this time combining the two beams at a point where a material scatters the sound waves in all directions. I'm going to be honest, I'm not sure how realistic such materials are, but let's assume they exist for now. If you did so, the two sound beams would be scattered with equal amplitude in all directions and the person you are trying to fool would perceive the sound as coming from this point. This method has the advantage of truly sounding to the person as if the sound came from that direction in all respects, including reflections, phasing, etc.
In summary, the idea is definitely possible (maybe there are more ways than I've given), but it isn't as simple as the passage in the book makes it out to be.
Update with a clearer answer:
Here's a plot of all the velocities involved with shock propagation through a stationary medium:
The x-axis is the Mach number of the shock wave and represents the strength of the shock wave. It could have been velocity, pressure ratio, or any other quantity that is monotonic with shock strength.
The y-axis is velocity relative to the still air.
- In solid red we have the velocity of the air entering the shock wave which in this reference frame is still, and thus 0.
- In solid blue we have the velocity of the shock wave.
- In solid green we have the velocity of the air after exiting the shock wave.
In dashed lines I've added to the graph the maximum and minimum velocities at which a sound wave could travel (moving with the shock and opposite the shock, respectively). The velocity of a sound wave is relative to the average velocity of the medium it is traveling through, so I've colored these lines according to the medium they are traveling through.
As noted by the OP and the quotation above, the velocity of the shock (in blue) is always higher than the velocity of sound in the entering air. However, it's always less than the forward velocity of sound in the exiting medium.
Thus a pressure wave generated by a plane increasing in velocity can propagate to catch up to the shock wave and push it to go even faster. Similarly, if the plane slows down the lower pressure wave can also catch up to the shock wave and slow it down. This is the same propagation mechanism as in longitudinal sound waves.
The fact that the shock wave is traveling faster than sound in the still medium isn't a problem because the shock wave is being generated and pushed forward by the exiting medium, and relative to the exiting medium the shock wave is traveling at less than the speed of sound.
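The ordering of these velocities can be checked numerically. The sketch below is not the script behind the plot. It assumes the textbook normal-shock relations for an ideal gas, with $\gamma = 1.4$ and a still-air sound speed of 340 m/s chosen purely for illustration, and all names are hypothetical.

```python
import math

GAMMA = 1.4   # assumed ratio of specific heats for air
A1 = 340.0    # assumed speed of sound in the still air, m/s


def shock_velocities(mach_shock, a1=A1, gamma=GAMMA):
    """Velocities in the still-air frame for a normal shock.

    Returns (shock speed, gas speed behind the shock,
    sound speed behind the shock), all in m/s.
    """
    m_sq = mach_shock ** 2
    shock_speed = mach_shock * a1
    # Gas velocity induced behind a shock moving into still gas.
    gas_speed = 2.0 * a1 / (gamma + 1.0) * (mach_shock - 1.0 / mach_shock)
    # Static temperature ratio across the shock sets the new sound speed.
    temp_ratio = ((2.0 * gamma * m_sq - (gamma - 1.0))
                  * ((gamma - 1.0) * m_sq + 2.0)
                  / ((gamma + 1.0) ** 2 * m_sq))
    sound_speed_behind = a1 * math.sqrt(temp_ratio)
    return shock_speed, gas_speed, sound_speed_behind


def shock_is_between(mach_shock):
    """True if the shock outruns sound ahead of it but not sound behind it."""
    shock, gas, a2 = shock_velocities(mach_shock)
    return A1 < shock < gas + a2


for mach in (1.5, 2.0, 5.0):
    shock, gas, a2 = shock_velocities(mach)
    print(f"M={mach}: sound ahead {A1:.0f}, shock {shock:.0f}, "
          f"fastest sound behind {gas + a2:.0f} m/s")
```

For every shock Mach number above 1, the shock speed should land between the sound speed in the entering air and the forward sound velocity in the exiting air, which is the blue line sitting between the red and green dashed lines.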
Change in speed of sound
The fact that the speed of sound changes across the shock wave is irrelevant to this analysis. It was accounted for in the creation of the graph as can be seen by the green dashed lines diverging. However, even if they had not diverged at all the shock wave would still be within the speed of sound in the exiting medium. Similarly, if the speed of sound of the exiting medium was applied to the entering medium, the shock will still fall outside that speed of sound. (Doing this doesn't make physical sense, but is just to demonstrate that the change of speed of sound is irrelevant to answering the question.)
Pseudo speeds of sound are dotted (sound velocities traveling in the opposite direction to the shock have been removed for clarity):
Old Answer
Sound waves travel at the speed of sound relative to the average velocity of the medium. In the case of a shock wave, the time-averaged velocity of the medium is different on the two sides of the shock wave.
Shock Wave's Perspective
In the frame of reference where the shock wave is stationary, the entering medium travels towards the shock wave at supersonic speeds, and the exiting medium travels away from the shock wave at subsonic speeds.
This is the usual frame of reference used to analyze shock waves and is used in shock tables.
Exiting Medium's Perspective
In the frame of reference of the exiting medium, the shock wave travels outward at subsonic speeds and the entering medium travels inward at supersonic speeds.
Entering Medium's Perspective
Finally, in the frame of reference of the entering medium, the shock wave travels inward at supersonic speed, and the exiting medium exits at a lesser supersonic speed.
This is the frame of reference used in the article as the entering fluid is the atmosphere that the plane is flying through and is thus the assumed rest frame.
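To put numbers on the three perspectives, here is a small sketch for an entering speed of Mach 5. It assumes the standard normal-shock relations for an ideal gas with $\gamma = 1.4$ (my assumption, not something stated above), and expresses every speed as a magnitude in units of the entering medium's speed of sound. The function and key names are hypothetical.

```python
import math

GAMMA = 1.4  # assumed ratio of specific heats for air


def density_ratio(mach, gamma=GAMMA):
    """Density ratio (exiting / entering) across a normal shock."""
    return (gamma + 1.0) * mach ** 2 / ((gamma - 1.0) * mach ** 2 + 2.0)


def downstream_mach(mach, gamma=GAMMA):
    """Exiting-flow Mach number in the shock frame, using the exiting
    medium's own (higher) speed of sound."""
    return math.sqrt(((gamma - 1.0) * mach ** 2 + 2.0)
                     / (2.0 * gamma * mach ** 2 - (gamma - 1.0)))


def speeds_by_frame(mach, gamma=GAMMA):
    """Speed magnitudes in each frame, in units of the entering sound speed."""
    entering = mach
    # Mass conservation in the shock frame: rho1 * u1 = rho2 * u2.
    exiting = mach / density_ratio(mach, gamma)
    return {
        "shock frame": {
            "shock": 0.0, "entering": entering, "exiting": exiting},
        "exiting-medium frame": {
            "shock": exiting, "entering": entering - exiting, "exiting": 0.0},
        "entering-medium frame": {
            "shock": entering, "entering": 0.0, "exiting": entering - exiting},
    }


for frame, speeds in speeds_by_frame(5.0).items():
    print(frame, speeds)
print("exiting Mach number in the shock frame:", downstream_mach(5.0))
```

Measured against the entering sound speed, the exiting flow in the shock frame comes out at about 1.0, which looks borderline. Measured against its own hotter sound speed, it is clearly subsonic, which is the sense in which the shock travels subsonically relative to the exiting medium.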
Conclusion
The shock wave travels at the speed of sound relative to a weighted average of medium velocity, and is thus not an exception to the rule that waves travel at the speed of sound relative to the average velocity of the medium.
Note that the speed of sound depends on temperature, $a=\sqrt{\gamma\,R\,T}$, and that the temperature changes across a shock wave. However, this effect is not as large as the velocity differences due to the change in reference frame. The figures provided above are to scale using Mach numbers for an entering speed of Mach 5, so those arrows ignore the change in speed of sound. However, if the changes in speed of sound were accounted for, my conclusion would still hold.
Additionally, for high Mach numbers the high temperature will cause deviation in the ratio of specific heats, resulting in a more complex formula for the speed of sound:
$$a = \sqrt{ R\,T \left(1 + \frac{\gamma - 1}{ 1 + (\gamma-1)\, \frac{(\theta/T)^2\, e^{\theta/T} }{\left(e^{\theta/T} -1\right)^2}} \right)}$$
This compensation will actually decrease the amount that the speed of sound is affected by the change in temperature.
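Here is a quick sketch comparing the two speed-of-sound formulas. The constants are my assumptions for air and are not given above: $\gamma = 1.4$, $R = 287\ \text{J/(kg K)}$, and $\theta \approx 3056\ \text{K}$. The function names are hypothetical.

```python
import math

GAMMA = 1.4      # assumed low-temperature ratio of specific heats for air
R_AIR = 287.0    # assumed specific gas constant for air, J/(kg K)
THETA = 3056.0   # assumed characteristic vibrational temperature, K


def sound_speed_perfect(temp_k):
    """a = sqrt(gamma * R * T), constant gamma."""
    return math.sqrt(GAMMA * R_AIR * temp_k)


def sound_speed_imperfect(temp_k):
    """Temperature-corrected formula from the text.

    The vibrational term (theta/T)^2 e^(theta/T) / (e^(theta/T) - 1)^2 is
    evaluated in the algebraically equivalent form
    y^2 e^(-y) / (1 - e^(-y))^2 to avoid overflow at low temperatures.
    """
    y = THETA / temp_k
    vib = y ** 2 * math.exp(-y) / math.expm1(-y) ** 2
    return math.sqrt(
        R_AIR * temp_k * (1.0 + (GAMMA - 1.0) / (1.0 + (GAMMA - 1.0) * vib)))


for temp_k in (300.0, 1000.0, 2000.0):
    print(temp_k, sound_speed_perfect(temp_k), sound_speed_imperfect(temp_k))
```

Under these assumptions the two formulas agree closely near room temperature, and the corrected value falls below $\sqrt{\gamma\,R\,T}$ as the temperature rises, so the speed of sound grows more slowly with temperature than the simple formula suggests.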
Best Answer
First, some fraction of incident sound power will pass thru any object, human body or not, so your title makes no sense. The real question is how much this body will attenuate the frequencies of interest. Below some attenuation, you either don't care or it's below the noise floor of the sensors to detect it. However, this depends on what you care about, how good your sensors are, and how much ambient noise there is, so there is no single yes/no answer.
The other part of what you are missing is that "blocking" by your hand isn't the issue. A human body, even just a hand, will likely attenuate to oblivion audio frequencies coupled air to body and then back out body to air. However, that won't prevent you hearing the sound in the setup you describe. That's because of diffraction around the edges of this body, even if you assume it absorbs everything incident upon it.
If you figure audio frequencies are from 20 Hz to 20 kHz, and sound propagates thru normal air at about 3 ms per meter, then the wavelengths are 17 m to 17 mm. Even at 17 mm, there will be easily audible diffraction that will go around the edges of the blocking object. You will be able to hear significant attenuation as you hold your hand between your ear and the source, but you will still be able to hear the original.
Let's say your hand is about 100 mm across. A 100 mm sound wavelength means a frequency of about 3.3 kHz. That's roughly the limit below which the hand will have diminishing effect. At 300 Hz, for example, you probably won't notice any effect from putting your hand in the way when it's at arm's length from your head and the sound source. When the obstacle is only about 1/10 of a wavelength across, the sound waves will mostly just go around it without getting attenuated.
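The wavelength arithmetic above can be reproduced with a few lines. This sketch takes the "3 ms per meter" figure at face value, which gives roughly 333 m/s. The function names are just for illustration.

```python
SPEED_OF_SOUND = 1.0 / 0.003  # m/s, from "about 3 ms per meter"


def wavelength_m(frequency_hz):
    """Wavelength in meters for a given frequency."""
    return SPEED_OF_SOUND / frequency_hz


def frequency_hz(wavelength):
    """Frequency in Hz for a given wavelength in meters."""
    return SPEED_OF_SOUND / wavelength


def size_in_wavelengths(size_m, freq_hz):
    """How many wavelengths across an obstacle is at a given frequency."""
    return size_m / wavelength_m(freq_hz)


print(wavelength_m(20.0), wavelength_m(20e3))  # ends of the audio band, m
print(frequency_hz(0.100))                     # frequency with a 100 mm wavelength
print(size_in_wavelengths(0.100, 300.0))       # 100 mm hand at 300 Hz
```

At 300 Hz the 100 mm hand works out to a bit under a tenth of a wavelength across, which is why it makes so little difference there.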