According to the American Meteor Society, the sonic boom of an asteroid or meteor (sometimes referred to as a 'fireball') is due to
If a very bright fireball, usually greater than magnitude -8, penetrates to the stratosphere, below an altitude of about 50 km (30 miles), and explodes as a bolide, there is a chance that sonic booms may be heard on the ground below. This is more likely if the bolide occurs at an altitude angle of about 45 degrees or so for the observer, and is less likely if the bolide occurs overhead (although still possible) or near the horizon.
And from CalTech's CoolCosmos page
When an object travels faster than the speed of sound in Earth's atmosphere, a shock wave can be created that can be heard as a sonic boom.
According to the article How the Falling Meteor Packed a Sonic Punch (Klotz, 2013), asteroids cause sonic booms in the lower atmosphere for the following reason:
Because the meteor is supersonic, the waves, which travel at the speed of sound, can’t get out of the way fast enough. The waves build up, compress and eventually become a single shock wave moving at the speed of sound.
A closer look at what a sonic boom is and how it occurs (using a jet as an example) is illustrated in the following diagram:
Image source
So, if a meteor or asteroid is travelling faster than the speed of sound for a particular part of the atmosphere, then a sonic boom will occur. Going back to the American Meteor Society's description of the likely cause of a sonic boom, they stated that if a meteor comes in
below an altitude of about 50 km (30 miles)
then a sonic boom is likely to occur. One of the reasons is that the speed of sound is lower at that height and below, due to the temperature of the atmosphere. Below is a graph showing how the speed of sound and the temperature vary with altitude:
Image source.
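As a rough illustration of the temperature dependence, the sketch below uses the ideal-gas relation $c = \sqrt{\gamma R T}$ for dry air. The constants, the sample temperatures, and the 15 km/s entry speed are illustrative assumptions, not values taken from the sources quoted above.

```python
import math

GAMMA = 1.4           # ratio of specific heats for dry air (assumed)
R_SPECIFIC = 287.05   # J/(kg K), specific gas constant of dry air (assumed)


def speed_of_sound(temperature_k):
    """Ideal-gas speed of sound in dry air, in m/s."""
    return math.sqrt(GAMMA * R_SPECIFIC * temperature_k)


def mach_number(speed_m_s, temperature_k):
    """Ratio of an object's speed to the local speed of sound."""
    return speed_m_s / speed_of_sound(temperature_k)


if __name__ == "__main__":
    # Sample temperatures are assumptions chosen for illustration only.
    for label, temp_k in [("warm air, 288 K", 288.15), ("cold air, 217 K", 216.65)]:
        print(label, round(speed_of_sound(temp_k), 1), "m/s")
    # Hypothetical meteor speed of 15 km/s in 217 K air.
    print("Mach number:", round(mach_number(15_000.0, 216.65), 1))
```

Colder air gives a lower speed of sound, so an object moving at a fixed speed has a higher Mach number there.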
It isn't possible to create an audio source in mid-air using the method you've described. This is because the two ultrasonic waves would create an audible source if the listener were standing at that spot, but those waves would continue to propagate in the same direction afterwards. You would need, as I point out below, some sort of medium which scattered the waves in all directions to make it seem as if the sound were coming from the point at which you interfered the two waves.
It is possible, however, to make the user perceive the sound as coming from a specific location, but it isn't as easy as the author makes it seem. I can think of two different ways. First of all, as described by @reirab, you can get audio frequencies by interfering two sound waves of high frequency. When they interfere they will generate a beat note whose frequency is the difference between the two frequencies. For example, if you send a sound beam with frequency $f_1=200\ \text{kHz}$ and another beam with $f_2=210\ \text{kHz}$, the frequency heard in the region where they combine will be $\Delta f=f_2-f_1=10\ \text{kHz}$, which is in the audio band of humans.
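The beat note can be checked numerically. The sketch below is only an illustration, with function names of my own choosing: it confirms that the sum of two equal-amplitude tones equals a fast carrier at the mean frequency multiplied by a slow envelope, whose magnitude repeats at the difference frequency.

```python
import math


def beat_frequency(f1_hz, f2_hz):
    """Difference frequency of two tones, in Hz."""
    return abs(f2_hz - f1_hz)


def summed_signal(f1_hz, f2_hz, t):
    """Sum of two unit-amplitude sine waves at time t (seconds)."""
    return math.sin(2 * math.pi * f1_hz * t) + math.sin(2 * math.pi * f2_hz * t)


def carrier_times_envelope(f1_hz, f2_hz, t):
    """Same signal written as carrier (mean frequency) times slow envelope."""
    f_mean = 0.5 * (f1_hz + f2_hz)
    f_half_diff = 0.5 * (f2_hz - f1_hz)
    carrier = math.sin(2 * math.pi * f_mean * t)
    envelope = 2 * math.cos(2 * math.pi * f_half_diff * t)
    return carrier * envelope


if __name__ == "__main__":
    f1, f2 = 200e3, 210e3
    print("beat frequency:", beat_frequency(f1, f2), "Hz")
    for t in (0.0, 1.3e-6, 2.7e-5, 9.1e-5):
        print(t, summed_signal(f1, f2, t), carrier_times_envelope(f1, f2, t))
```

The envelope term oscillates at half the difference frequency, but its magnitude (the loudness of the carrier) repeats at the full difference, 10 kHz in this example.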
There is an additional difficulty. You will need the sound to come out in a well-defined, narrow (collimated) beam, and this is not terribly easy to do. A typical speaker emits sound in all directions. There are many techniques for generating such beams, but one is to use a phased array.
How can you use this to make a person perceive the sound as coming from a specific point?
Sending Two Different Volumes to the Two Ears
What does it mean to perceive sound as coming from a specific location? Our ears are just microphones with cones which accept sound mostly from one direction (excepting low frequencies). A large part of the way we determine where the sound came from is just the relative volume in our two ears. So, you could use the interference effect described above with beams which are narrow enough that you can target each ear. By using two separate sets of beams targeting each ear with different volumes, you could make the person perceive the sound as coming from a specific location; at least as well as a 3D movie makes a person perceive images in 3D.
Hitting a Material Which Scattered the Sound Isotropically
The second method is to use the same interference effect, but this time combining the two beams at a point where a material scatters the sound waves in all directions. I'm going to be honest, I'm not sure how realistic such materials are, but let's assume they exist for now. If you did so, the two sound beams would be scattered with equal amplitude in all directions and the person you are trying to fool would perceive the sound as coming from this point. This method has the advantage of truly sounding to the person as if the sound came from that direction in all respects including reflections, phasing, etc.
In summary, the idea is definitely possible (maybe there are more ways than I've given), but it isn't as simple as the passage in the book makes it out to be.
Best Answer
First, I am going to provide a little background on equivalent pressures at different altitudes from Earth's surface.
Layers of Earth's Atmosphere
Troposphere to Mesosphere
At sea level, the neutral atmosphere of Earth has a pressure of ~$10^{5}$ Pa (or ~1000 mbars).
The below image from https://en.wikipedia.org/wiki/File:Comparison_US_standard_atmosphere_1962.svg shows the broad range of temperatures/pressures of Earth's atmosphere.
Ionosphere
The region where the atmosphere transitions from mostly neutral to mostly ionized gas (called a plasma) is called the ionosphere. The altitudes defining this region vary (due to solar variability), but are generally defined as ranging from ~60-1000 km. The free electron number density in the ionosphere varies greatly from ~$10^{3} - 10^{6}$ # $cm^{-3}$ (or number of particles per cubic centimeter). The temperature varies from a few 100 K to ~1500 K. Thus, if we treated it like an ideal gas, the thermal pressure of the charged particles would range from a few $10^{-12}$ Pa to a few $10^{-8}$ Pa.
Thus, the ratio of sea level pressure to the plasma constituents would be ~$10^{13} - 10^{17}$.
Plasmasphere
The region immediately surrounding the ionosphere is called the plasmasphere, whose outer boundary can lie at altitudes from as low as a few $R_{E}$ up to ~6 $R_{E}$. The density ranges from several 100 # $cm^{-3}$ down to ~10 # $cm^{-3}$ and temperatures vary greatly, from ~6000-35,000 K. Again, these ranges correspond to thermal pressures of $10^{-13}$ Pa to a few $10^{-11}$ Pa.
Thus, the ratio of sea level pressure to the plasma constituents would be ~$10^{16} - 10^{18}$.
Outer Magnetosphere
The "best" vacuum that we can easily access is the Earth's outer magnetosphere, which has a density ranging from ~0.01-1.0 particles per cubic centimeter. The temperatures in the outer magnetosphere can vary greatly from ~$10^{5}$ K to greater than $10^{9}$ K (i.e., if one converts radiation belt particle energies, e.g., 100s of keV, to an equivalent temperature). Thus, the range of equivalent ideal gas thermal pressures would be few $10^{-14}$ Pa to a few $10^{-8}$ Pa.
Thus, the ratio of sea level pressure to the plasma constituents would be ~$10^{13} - 10^{19}$.
Solar Wind
The solar wind - the supersonic flow of plasma from the sun's upper atmosphere - has densities and temperatures ranging from ~0.5-50 # $cm^{-3}$ and ~$10^{4}$ K up to ~$10^{6}$ K, respectively (for references, see https://physics.stackexchange.com/a/179057/59023). Thus, the range of equivalent ideal gas thermal pressures would be ~$10^{-14} - 10^{-10}$ Pa.
Thus, the ratio of sea level pressure to the plasma constituents would be ~$10^{15} - 10^{19}$.
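The pressure estimates above all follow from the ideal gas law, $P = n k_{B} T$. A minimal sketch of that arithmetic is given below. Pairing the lowest density with the lowest temperature (and highest with highest) is my assumption for bracketing each range, as is reading "a few 100 K" as 300 K.

```python
K_B = 1.380649e-23     # J/K, Boltzmann constant
SEA_LEVEL_PA = 1.0e5   # approximate sea-level pressure used in the text


def thermal_pressure_pa(density_per_cm3, temperature_k):
    """Ideal-gas thermal pressure P = n k_B T, with n given in cm^-3."""
    density_per_m3 = density_per_cm3 * 1.0e6  # cm^-3 -> m^-3
    return density_per_m3 * K_B * temperature_k


# (density in cm^-3, temperature in K) at the low and high end of each range.
REGIONS = {
    "ionosphere": ((1.0e3, 300.0), (1.0e6, 1500.0)),
    "outer magnetosphere": ((0.01, 1.0e5), (1.0, 1.0e9)),
    "solar wind": ((0.5, 1.0e4), (50.0, 1.0e6)),
}

if __name__ == "__main__":
    for name, (low, high) in REGIONS.items():
        p_low = thermal_pressure_pa(*low)
        p_high = thermal_pressure_pa(*high)
        print(f"{name}: {p_low:.1e} to {p_high:.1e} Pa, "
              f"sea-level ratio {SEA_LEVEL_PA / p_high:.0e} to {SEA_LEVEL_PA / p_low:.0e}")
```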
Answers
Short Answer
For all practical purposes, there are no regions of space completely devoid of some kind of sound.
Long/Detailed Answer
Interestingly, one can have the traditional sound wave (i.e., a longitudinal oscillation mediated by gas particle collisions) propagate into the upper atmosphere and even into the ionosphere. The point where such a sound wave would experience strong damping is where the collisional mean free path becomes too large to support the oscillations, i.e., this would occur when the average time between collisions becomes comparable to the wave period. Thus, the oscillations would have no restoring force and would damp out (for an electromagnetic analogy, see evanescent waves).
No: sound waves that start in space can propagate at almost zero pressure. Less than $10^{-14}$ Pa is as close to vacuum as one needs, since the best vacuum we can produce in a lab is ~$10^{-11}$ Pa.
There are multiple sound waves in space, including ion acoustic waves and magnetosonic waves. Ion acoustic waves have been seen all the way out at Saturn, where the solar wind density and temperature can be as low as ~0.1 # $cm^{-3}$ and less than ~$10^{4}$ K, or thermal pressures below $10^{-14}$ Pa (note that the ram or dynamic pressure is generally ~2-4 orders of magnitude higher owing to the high speed of the solar wind). These modes have been observed as far out as Neptune, throughout the solar wind, and as close in as ~0.3 AU.
There is no reason to expect that such modes would not exist in the interstellar medium, where densities and temperatures can be as low as ~0.1 # $cm^{-3}$ and ~$10^{3}$ K, corresponding to $10^{-15}$ Pa.
The intracluster medium (the gas between galaxies in a galaxy cluster) is even more tenuous but much hotter, with densities and temperatures as low as ~$10^{-4}$ # $cm^{-3}$ and ~$10^{7}$ K, corresponding to $10^{-14}$ Pa (e.g., see arXiv e-print 1406.4410). Again, the ubiquity of ion acoustic waves in the interplanetary medium suggests we should expect these in nearly all regions of space.
Update/Edit
Let me ask the question in a slightly different manner.
The intensity of sound decreases as $I\left( r \right) \propto r^{-2}$ while sound pressure decreases as $P\left( r \right) \propto r^{-1}$. The hearing threshold is a function of frequency, because the human ear does not have a flat frequency response, but it is generally accepted as being ~20 $\mu$Pa for 1 atmosphere and 25$^{\circ} C$ at 1000 Hz. The sound pressure level (measured in dB) is given by: $$ L_{p}\left( r \right) = 20 \ \log_{10} \left( \frac{ P\left( r \right) }{ P_{o} } \right) $$ where we set $P_{o}$ ~ 20 $\mu$Pa. Then a 100 dB source corresponds to ~2 Pa at the source. This would drop to $P_{o}$ at a distance of ~$10^{5}$ m, ignoring any acoustic impedance or losses and assuming the pressure and temperature are the same as the reference parameters for $P_{o}$.
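The two numbers quoted above (~2 Pa for 100 dB, and ~$10^{5}$ m to reach threshold) can be reproduced with a short sketch. It assumes lossless $1/r$ spreading with the source level defined at 1 m. The function names are illustrative.

```python
import math

P_REF = 20e-6  # Pa, reference (threshold) sound pressure


def spl_db(pressure_pa):
    """Sound pressure level in dB relative to P_REF."""
    return 20 * math.log10(pressure_pa / P_REF)


def pressure_from_spl(level_db):
    """Invert the level definition: pressure in Pa for a given level in dB."""
    return P_REF * 10 ** (level_db / 20)


def distance_to_threshold_m(level_db_at_1m):
    """Distance at which pressure falls to P_REF, assuming P ~ 1/r and no losses."""
    return 10 ** (level_db_at_1m / 20)


if __name__ == "__main__":
    print("pressure at 100 dB:", pressure_from_spl(100.0), "Pa")
    print("distance to threshold:", distance_to_threshold_m(100.0), "m")
```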
The reference sound intensity, $I_{o}$, depends upon the characteristic acoustic impedance, $z_{o}$, as $I_{o} = P_{o}^{2}/z_{o}$. We know that $z_{o} = \rho \ C_{s}$, where $\rho$ is the mass density and $C_{s}$ is the speed of sound. We can model $\rho = \rho \left( h \right)$ using a known atmospheric scale height and an exponential decay (which reproduces the orange line in the figure above) and take a set of values from the blue line in the figure above for $C_{s}$ (see table below). Then we find that $z_{o}$ ranges from ~0.003-416 Pa s/m from 0-100 km altitude. If we use the human hearing threshold for $P_{o}$, then $I_{o}$ ranges from ~$10^{-12} - 10^{-7}$ W $m^{-2}$.
Since $I_{o}$ increases as we increase altitude, the intensity of our source would have to increase as well to maintain its initial $L_{o}$ = 100 dB level (i.e., $I_{src}\left( h \right) = I_{o}\left( h \right) 10^{L_{o}/10}$). The intensity at the source, $I_{src}$, then ranges from ~0.01-1500 W $m^{-2}$.
Let's assume we use the same intensity as at sea level and bring the speaker up in altitude. Then the intensity level at the source drops with increasing altitude as: $$ L_{i,src}\left( h \right) = 10 \ \log_{10} \left( \frac{ I_{src}\left( 0 \right) }{ I_{o}\left( h \right) } \right) $$ Then $L_{i,src}$ varies from 100 dB at sea level to ~94 dB by 10 km, ~74 dB by 50 km, and ~48 dB by 100 km.
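A rough sketch of this estimate follows. The sea-level density, the sound speeds at each altitude, and the use of the ~8.5 km e-folding distance quoted in this answer are my assumptions, since the original table of values is not reproduced here. With these inputs the sketch gives about 94 dB at 10 km, about 74 dB at 50 km (which is the source level implied by the 50 km distance figures of ~65 dB at 3 m and ~54 dB at 10 m), and about 48 dB at 100 km.

```python
import math

RHO_0 = 1.225           # kg/m^3, sea-level air density (assumed)
SCALE_HEIGHT_KM = 8.5   # e-folding distance of density (assumed constant)
P_REF = 20e-6           # Pa, hearing-threshold reference pressure

# Approximate sound speeds in m/s at each altitude in km (assumed values).
SOUND_SPEED = {0: 340.0, 10: 299.0, 50: 330.0, 100: 280.0}


def impedance(h_km):
    """Characteristic acoustic impedance z_o = rho * C_s, in Pa s/m."""
    rho = RHO_0 * math.exp(-h_km / SCALE_HEIGHT_KM)
    return rho * SOUND_SPEED[h_km]


def reference_intensity(h_km):
    """Reference intensity I_o = P_o^2 / z_o, in W/m^2."""
    return P_REF ** 2 / impedance(h_km)


def source_level_db(h_km, sea_level_db=100.0):
    """Level of a fixed-intensity source relative to the local I_o(h)."""
    i_src_sea_level = reference_intensity(0) * 10 ** (sea_level_db / 10)
    return 10 * math.log10(i_src_sea_level / reference_intensity(h_km))


if __name__ == "__main__":
    for h in (0, 10, 50, 100):
        print(h, "km:", round(impedance(h), 4), "Pa s/m,",
              round(source_level_db(h), 1), "dB")
```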
We estimate the intensity level at a given distance away from the source as: $$ L_{r}\left( h, r \right) = L_{i,src}\left( h \right) + 20 \ \log_{10} \left( \frac{ 1 }{ r } \right) $$ where $r$ is measured in meters and we have used 1 m as the normalizing length that defines the position of the source. In the following, we examine the decrease in intensity levels with distance at three different altitudes: 10 km, 50 km, and 100 km.
If we move ~3 m away from the source, the intensity levels drop to ~85 dB, ~65 dB, and ~39 dB for 10 km, 50 km, and 100 km, respectively. At ~10 m away, these intensity levels drop to ~74 dB, ~54 dB, and ~28 dB, respectively. At ~50 m away, they drop to ~60 dB, ~40 dB, and ~14 dB, respectively. And at ~150 m away, they drop to ~51 dB, ~31 dB, and ~5 dB, respectively. For comparison, at sea level the intensity levels would be ~90 dB, ~66 dB, and ~56 dB at distances of ~3 m, ~50 m, and ~150 m, respectively.
Thus, at 100 km altitude one need only move a little over 100 m away from the source before the intensity level drops below the hearing threshold (i.e., ~5 dB for a 20 year old male at 1000 Hz).
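The distance dependence is simple enough to sketch directly. The example below assumes lossless spherical spreading from a source level defined at 1 m, and takes the ~48 dB source level at 100 km and the ~5 dB threshold from the text. The function names are illustrative.

```python
import math


def level_at_distance_db(source_level_db, r_m):
    """Level at distance r (meters), with the source level defined at 1 m."""
    return source_level_db + 20 * math.log10(1.0 / r_m)


def distance_for_level_m(source_level_db, target_level_db):
    """Distance (meters) at which the level has fallen to target_level_db."""
    return 10 ** ((source_level_db - target_level_db) / 20)


if __name__ == "__main__":
    for r in (3.0, 10.0, 50.0, 150.0):
        print(r, "m:", round(level_at_distance_db(48.0, r), 1), "dB")
    print("distance to reach 5 dB:",
          round(distance_for_level_m(48.0, 5.0)), "m")
```

With these inputs the 5 dB threshold is reached at roughly 140 m, in line with "a little over 100 m".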
The model only went to 100 km but even so, our source would become difficult to hear if we moved a little more than ~100 m from it. Given that the density decreases exponentially with an e-folding distance of only ~8.5 km (pressure does so similarly as well), if we extrapolate our estimates for $L_{i,src}\left( h \right)$ then the value drops to ~10 dB by ~177 km.
So by ~200 km a human probably could not hear a source ~1 m away that produced a 100 dB, 1000 Hz intensity level at sea level.
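The extrapolation can be sketched under one simplifying assumption of mine: the sound speed is held constant, so the acoustic impedance falls off exponentially with the same ~8.5 km e-folding distance as the density. The drop in source level is then linear in altitude.

```python
import math

SCALE_HEIGHT_KM = 8.5  # e-folding distance of density, from the text


def altitude_for_source_level_km(target_db, sea_level_db=100.0):
    """Altitude where the source level falls to target_db.

    Assumes constant sound speed, so the level drops by
    10 * log10(e) * h / H dB over an altitude h.
    """
    drop_db = sea_level_db - target_db
    return SCALE_HEIGHT_KM * math.log(10) * drop_db / 10


if __name__ == "__main__":
    print("source level reaches 10 dB near",
          round(altitude_for_source_level_km(10.0)), "km")
```

This lands within about a kilometer of the ~177 km quoted above. The small difference depends on the sound speed actually used at each altitude.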