This is definitely not an illusion, as many people report the same experience. I have usually lived miles away from any train station, so I rarely hear train horns during the day. At night, however, train horns can occasionally be heard. I hope someone can explain the physics of this effect; for example, does sound perhaps travel faster at low temperature?
Acoustics – Why Nighttime Increases Remote Train Horn Audibility
acoustics everyday-life
Related Solutions
It isn't possible to create an audio source in mid-air using the method you've described. This is because the two ultrasonic waves would create an audible source if the listener were standing at that spot, but those waves would continue to propagate in the same direction afterwards. You would need, as I point out below, some sort of medium which scattered the waves in all directions to make it seem as if the sound were coming from the point at which you interfered the two waves.
It is possible, however, to make the user perceive the sound as coming from a specific location, but it isn't as easy as the author makes it seem. I can think of two different ways. First of all, as described by @reirab, you can get audio frequencies by interfering two sound waves of high frequency. When they interfere, they generate a beat note whose frequency is the difference between the two frequencies. For example, if you send a sound beam with frequency $f_1=200\ \text{kHz}$ and another beam with $f_2=210\ \text{kHz}$, the frequency heard in the region where they combine will be $\Delta f=f_2-f_1=10\ \text{kHz}$, which is in the audio band of humans.
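The beat arithmetic above can be checked numerically. The following is a minimal sketch, assuming two equal-amplitude sine tones added linearly; the function names are hypothetical and the code does not model how air or the ear responds to the combined wave. It uses the identity $\sin a+\sin b=2\cos\frac{a-b}{2}\sin\frac{a+b}{2}$, whose slowly varying cosine factor produces a loudness envelope that repeats at the difference frequency.

```python
import math

def beat_frequency(f1_hz, f2_hz):
    """Difference frequency of two tones, in Hz."""
    return abs(f2_hz - f1_hz)

def superposed(t, f1_hz, f2_hz):
    """Linear sum of two unit-amplitude sine tones at time t (seconds)."""
    return math.sin(2 * math.pi * f1_hz * t) + math.sin(2 * math.pi * f2_hz * t)

def product_form(t, f1_hz, f2_hz):
    """Same signal written as a slow envelope times a fast carrier."""
    envelope = 2 * math.cos(math.pi * (f2_hz - f1_hz) * t)
    carrier = math.sin(math.pi * (f1_hz + f2_hz) * t)
    return envelope * carrier

f1, f2 = 200e3, 210e3  # the two ultrasonic frequencies from the text
print(beat_frequency(f1, f2))  # difference frequency in Hz

# The two forms agree at an arbitrary sample time.
t = 1.234e-5
print(abs(superposed(t, f1, f2) - product_form(t, f1, f2)) < 1e-9)
```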
There is an additional difficulty. You will need the sound to come out in a well-defined, narrow (collimated) beam, and this is not terribly easy to do. A typical speaker emits sound in all directions. There are many techniques for generating such beams, but one is to use a phased array.
How can you use this to make a person perceive the sound as coming from a specific point?
Sending Two Different Volumes to the Two Ears
What does it mean to perceive sound as coming from a specific location? Our ears are just microphones with cones which accept sound mostly from one direction (excepting low frequencies). A large part of the way we determine where the sound came from is just the relative volume in our two ears. So, you could use the interference effect described above with beams which are narrow enough that you can target each ear. By using two separate sets of beams targeting each ear with different volumes, you could make the person perceive the sound as coming from a specific location; at least as well as a 3D movie makes a person perceive images in 3D.
Hitting a Material Which Scattered the Sound Isotropically
The second method is to use the same interference effect, but this time combining the two beams at a point where a material scatters the sound waves in all directions. I'm going to be honest: I'm not sure how realistic such materials are, but let's assume they exist for now. If you did so, the two sound beams would be scattered with equal amplitude in all directions, and the person you are trying to fool would perceive the sound as coming from this point. This method has the advantage of truly sounding to the person as if the sound came from that direction in all respects, including reflections, phasing, etc.
In summary, the idea is definitely possible (maybe there are more ways than I've given), but it isn't as simple as the passage in the book makes it out to be.
If we ignore reverberation, sound intensity follows the inverse-square law and falls off with the square of distance. Since distance does not depend on direction, if both of you are talking at roughly the same level, then the speech "signal" at your ears will be roughly the same. The noise, however, is closer to the person in the noisy room and hence more intense there. This means the signal-to-noise ratio (SNR) is lower for the person closer to the noise source. Our ability to both hear sounds and understand speech depends critically on the SNR.
Consider the following worked example:
If both talkers are speaking such that the sound level 1 m away is 80 dB SPL, then the sound level 10 m away will be 60 dB SPL.
If the sound level of the noise in the room, 1 m from the center of the room is 80 dB SPL, then the sound level 10 m away will be 60 dB SPL.
Assuming the two talkers are 10 m apart, with one 1 m from the center of the room and the other 10 m from the center of the room, the signal level at each listener's ears will be 60 dB SPL. The noise level at the ears of the listener near the center will be 80 dB SPL, giving an SNR of -20 dB, which means the speech is inaudible and unintelligible. The noise level at the ears of the listener far from the center of the room will be 60 dB SPL, so the SNR will be 0 dB, which makes the speech both clearly audible and nearly perfectly intelligible.
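The arithmetic in this worked example can be reproduced with a short sketch. It assumes free-field inverse-square spreading (a drop of $20\log_{10}(r/r_\text{ref})$ dB) with no reverberation, as stated above; the function and variable names are hypothetical.

```python
import math

def level_at_distance(level_ref_db, r_ref_m, r_m):
    """Sound level (dB SPL) at distance r_m, given the level at r_ref_m,
    assuming inverse-square spreading and no reverberation."""
    return level_ref_db - 20.0 * math.log10(r_m / r_ref_m)

# Speech: 80 dB SPL at 1 m from the talker, heard 10 m away by either listener.
speech_db = level_at_distance(80.0, 1.0, 10.0)

# Noise: 80 dB SPL at 1 m from the center of the room.
noise_near_db = level_at_distance(80.0, 1.0, 1.0)   # listener 1 m from the center
noise_far_db = level_at_distance(80.0, 1.0, 10.0)   # listener 10 m from the center

snr_near_db = speech_db - noise_near_db  # listener inside the noisy room
snr_far_db = speech_db - noise_far_db    # listener far from the noise

print(speech_db, noise_near_db, noise_far_db)
print(snr_near_db, snr_far_db)
```

The two listeners receive the same speech level; only the noise term differs, which is why the SNR differs by 20 dB between them.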
Best Answer
There are two things to consider. The first is trivial: it is quieter at night, so you are more likely to hear the horn.
The second is physics: the speed of sound is proportional to the square root of the absolute temperature, so the refractive index is proportional to $T^{-1/2}$.
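As a rough numerical illustration of this scaling, here is a minimal sketch. It assumes dry air treated as an ideal gas, with a reference speed of about 331.3 m/s at 273.15 K; the constants and function names are assumptions for illustration only. Colder air gives a lower sound speed and therefore a higher refractive index.

```python
import math

C_REF_M_PER_S = 331.3  # assumed speed of sound in dry air at T_REF_K
T_REF_K = 273.15       # reference temperature (0 degrees Celsius) in kelvin

def speed_of_sound(t_kelvin):
    """Speed of sound in m/s, scaling as the square root of absolute temperature."""
    return C_REF_M_PER_S * math.sqrt(t_kelvin / T_REF_K)

def relative_index(t_kelvin, t0_kelvin=T_REF_K):
    """Acoustic refractive index relative to air at t0_kelvin:
    n(T)/n(T0) = c(T0)/c(T) = (T/T0)**(-1/2)."""
    return math.sqrt(t0_kelvin / t_kelvin)

for t_celsius in (-10.0, 0.0, 20.0):
    t_k = t_celsius + 273.15
    print(t_celsius, round(speed_of_sound(t_k), 1), round(relative_index(t_k), 4))
```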
At night it is quite possible to get a temperature inversion, such that air near the ground is colder than the air higher up. This would normally occur in still conditions, and I think it is more common in winter.
As the refractive index decreases with height it means that sound waves propagating upwards at some angle to the horizontal will be bent back towards the ground. The sound waves at some distance from the source will be more intense than you might expect if the waves propagated isotropically.
The contrast with the daytime situation would be enhanced by a more normal temperature gradient, where the temperature falls and the refractive index increases with height, so that sound is bent upwards, away from listeners on the ground.
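The bending can be made quantitative with a simple ray picture. The sketch below is an illustration under stated assumptions, not a full atmospheric model: still air, horizontal layers, and a linear temperature profile $T(z)=T_0+gz$. Snell's law for a layered medium keeps $\cos\theta(z)/c(z)$ constant along a ray, where $\theta$ is the angle above the horizontal. With $c\propto\sqrt{T}$, a ray launched at angle $\theta_0$ turns back towards the ground at the height where $T(z)=T_0/\cos^2\theta_0$, that is, $z=T_0\tan^2\theta_0/g$. The function name and the example gradients are hypothetical.

```python
import math

def turning_height(launch_angle_deg, t_ground_k, gradient_k_per_m):
    """Height (m) at which an upward-launched sound ray becomes horizontal
    and turns back down, for a linear profile T(z) = t_ground_k + gradient * z.

    Returns None when the temperature does not increase with height,
    because the ray is then bent upwards and never returns to the ground.
    """
    if gradient_k_per_m <= 0:
        return None
    tan_sq = math.tan(math.radians(launch_angle_deg)) ** 2
    return t_ground_k * tan_sq / gradient_k_per_m

# Night-time inversion (assumed): temperature rising by 1 K per 100 m.
print(turning_height(5.0, 280.0, 0.01))

# Daytime (assumed): temperature falling by 6.5 K per km, so no turning point.
print(turning_height(5.0, 280.0, -0.0065))
```

In this picture only rays launched at shallow angles are returned to the ground by a realistic inversion, since the turning height grows as $\tan^2\theta_0$.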
EDIT: For an excellent visualisation of this effect, see these animations produced by Daniel Russell (Penn State)