After much investigation, simulation and a deep literature search, I've figured out the true answer.
You perceive a chirp because you are being hit with the echoes of the sharp noise that generated the sound. The time between the arrivals of successive echoes decreases inversely with time, so it sounds like a tone whose fundamental frequency increases linearly in time, hence the chirp.
To get a feel for the phenomenon, consider a simulation:
![Simulated sound](https://i.stack.imgur.com/cIK3u.gif)
Above you see a slowed-down version of the simulated pressure wave inside a 2D racquetball court. I uploaded the generated sound to SoundCloud.
If you watch the simulation, pick a particular point, and watch the reflected sounds go by, you'll notice that the successive echoes arrive faster and faster as time goes on.
You can clearly hear the chirps in the generated sound, and if you listen closely you can hear secondary chirps as well. These are also visible in the spectrogram:
![Spectrogram of simulation](https://i.stack.imgur.com/wBD3q.png)
This phenomenon was studied and published recently by Kenji Kiyohara, Ken'ichi Furuya, and Yutaka Kaneda: "Sweeping echoes perceived in a regularly shaped reverberation room," J. Acoust. Soc. Am. Vol. 111, No. 2, 925-930 (2002).
In particular, they explain not only the main sweep but also the appearance of the secondary sweeps, using some number theory. Worth reading in full. Their analysis suggests that for the best sweep one should both clap and listen in the center of the room, though the sweeps should be generic at any location.
Simple geometric argument
Following the paper, we can give a simple geometric argument. If you imagine standing in the middle of a standard racquetball court, which is twice as long as it is tall or wide, and clap, your clap will start propagating and reflecting off the walls. A simple way to study the arrival times is with the method of images, so you imagine other claps generated by reflecting your clap across the walls, and then reflections of those claps and so on. This will generate a whole set of "image" claps, located at positions
$$ ( m, l, 2k) L $$
where $m,l,k$ are integers and $L$ is 20 feet for a racquetball court. The time for any particular clap to reach you is $t = d/c$, where $d$ is its distance from you and $c$ is the speed of sound, so we have
$$ t = \sqrt{m^2 + l^2 + 4k^2} \frac{L}{c} $$
for our arrival times. If we look at how these distribute in time:
![Arrival times](https://i.stack.imgur.com/lDAE8.png)
It becomes clear why we perceive a chirp. The various sets of missing bars, which themselves are spaced like a chirp, give rise to our perceived subchirps.
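The linear sweep can also be read directly off the arrival-time formula. Since $m^2 + l^2 + 4k^2$ is an integer $n$, the squared arrival times lie on a uniform grid, $t^2 = n (L/c)^2$, so two arrivals one grid step apart near time $t$ are separated by roughly

$$ \Delta t \approx \frac{1}{2t} \left(\frac{L}{c}\right)^2 \quad\Longrightarrow\quad f(t) = \frac{1}{\Delta t} \approx \frac{2 c^2}{L^2}\, t $$

Integers that cannot be written as $m^2 + l^2 + 4k^2$ (for example $n = 3$) are the missing bars. The short NumPy sketch below checks this numerically. It is only an illustration, not code from the paper or the notebook: the speed of sound `C = 343.0` m/s, the cutoff `n_max`, and the function names are my own assumptions.

```python
import numpy as np

C = 343.0          # assumed speed of sound in m/s (room temperature)
L = 20 * 0.3048    # 20 ft court width, converted to metres

def arrival_times(n_max=12, L=L, c=C):
    """Distinct arrival times for image claps at (m, l, 2k) * L.

    Returns (t, n), where n = m^2 + l^2 + 4 k^2 and t = sqrt(n) * L / c,
    keeping every image with 0 < n <= n_max^2.
    """
    r = np.arange(-n_max, n_max + 1)
    m, l, k = np.meshgrid(r, r, r, indexing="ij")
    n = m**2 + l**2 + 4 * k**2
    n = np.unique(n[(n > 0) & (n <= n_max**2)])
    return np.sqrt(n) * L / c, n

def spacing_ratio(t, L=L, c=C):
    """Observed 1/dt between neighbouring arrivals over the predicted 2 c^2 t / L^2.

    The ratio is 1 where neighbours are one grid step apart in t^2 and
    1/g where g - 1 consecutive integers are missing between them.
    """
    dt = np.diff(t)
    t_mid = 0.5 * (t[1:] + t[:-1])
    return (1.0 / dt) / (2.0 * c**2 * t_mid / L**2)

t, n = arrival_times()
ratio = spacing_ratio(t)
print("sweep rate 2 c^2 / L^2 in Hz per second:", 2 * C**2 / L**2)
print("fraction of gaps on the main sweep:", np.mean(np.isclose(ratio, 1.0)))
```

With these assumed values the predicted sweep rate comes out at roughly 6.3 kHz per second. Gaps with a ratio below 1 are the places where one or more integers are skipped, which is what the missing bars in the figure correspond to.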
Details of the 2D Simulation
For the simulation, I numerically solved the wave equation:
$$ \frac{\partial^2 p}{\partial t^2} = c^2 \nabla^2 p $$
and used impedance boundary conditions on the walls
$$ \nabla p \cdot \hat n = -c \eta \frac{\partial p}{\partial t} $$
I used a collocation method spatially, with a Chebyshev basis of order 64 in the short axis and 128 on the long axis, and used RK4 for the time integration.
I modeled the room as 20 feet by 40 feet and started it off with a Gaussian pressure pulse in one corner of the room. I listened near the back wall towards the top corner.
I put up an IPython notebook of my code, with the embedded audio and video. I recommend playing with it yourself. On my desktop it takes about a minute to do a full simulation of the sound.
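For readers who want to see the two ingredients (Chebyshev collocation in space, RK4 in time) without opening the notebook, here is a deliberately reduced sketch. It is not the notebook code. It is 1D rather than 2D, uses pressure-release ends ($p = 0$) instead of the impedance walls, and works in units where $c = 1$ on the interval $[-1, 1]$. The function names and the values `N = 32` and `dt = 0.005` are my own choices. The differentiation matrix follows the standard textbook construction on Chebyshev-Gauss-Lobatto points.

```python
import numpy as np

def cheb(N):
    """Chebyshev differentiation matrix D and nodes x on [-1, 1] (N >= 1)."""
    x = np.cos(np.pi * np.arange(N + 1) / N)
    c = np.ones(N + 1)
    c[0] = c[N] = 2.0
    c *= (-1.0) ** np.arange(N + 1)
    dX = x[:, None] - x[None, :]
    D = np.outer(c, 1.0 / c) / (dX + np.eye(N + 1))  # off-diagonal entries
    D -= np.diag(D.sum(axis=1))                      # diagonal: rows sum to zero
    return D, x

def rk4_step(f, y, dt):
    """One classical fourth-order Runge-Kutta step for y' = f(y)."""
    k1 = f(y)
    k2 = f(y + 0.5 * dt * k1)
    k3 = f(y + 0.5 * dt * k2)
    k4 = f(y + dt * k3)
    return y + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

def simulate_1d(N=32, c=1.0, dt=0.005, t_end=2.0):
    """Solve p_tt = c^2 p_xx on [-1, 1] with p = 0 at both ends.

    Starts from the standing wave p(x, 0) = cos(pi x / 2) at rest and
    returns the interior nodes and the pressure there at t_end.
    """
    D, x = cheb(N)
    D2 = (D @ D)[1:-1, 1:-1]       # drop boundary rows/cols to enforce p = 0
    xi = x[1:-1]
    n = xi.size
    y = np.concatenate([np.cos(np.pi * xi / 2), np.zeros(n)])  # state [p, dp/dt]

    def rhs(y):
        p, v = y[:n], y[n:]
        return np.concatenate([v, c**2 * (D2 @ p)])

    for _ in range(int(round(t_end / dt))):
        y = rk4_step(rhs, y, dt)
    return xi, y[:n]

xi, p = simulate_1d()
# Exact solution is cos(pi x / 2) * cos(pi c t / 2), i.e. -cos(pi x / 2) at t = 2.
print("max error at t = 2:", np.max(np.abs(p + np.cos(np.pi * xi / 2))))
```

The time step has to stay small because the largest eigenvalue of the Chebyshev second-derivative matrix grows like $N^4$, so the stable RK4 step shrinks like $1/N^2$. That restriction is one reason spectral simulations at the resolutions quoted above take a noticeable amount of time.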
Effect of listening location
I've updated the code to record the sound at multiple listening locations. I can't seem to embed audio on stackexchange, but if you click through to the IPython notebook view, you can listen to all of the generated sounds. What I can do here is show the spectrograms:
![Spectrograms at various locations](https://i.stack.imgur.com/bYd6f.png)
These are laid out in roughly their locations inside of the room. Here the noise was generated in the lower left, but the chirps should be generic for any listening and generation location.
Best Answer
What is probably taking place is this: when something hot is inside the thermos, the air trapped between the outside of the thermos liner and the inside of the plastic sleeve surrounding it begins expanding due to (slow) heat loss from the thermos liner. Where the liner meets the sleeve around the upper end of the assembly, there is a press-fit joint which probably 1) is not totally airtight and 2) has a little water, tea, coffee, etc. sitting in it. When the pressure inside this space rises to a certain level, it pushes air out through that joint against the surface tension of the water in the crevice; the fluid rebuilds the closure, gets pushed out again, and so on, creating that brief bubbling noise. Once the internal pressure falls below that threshold, the bubbling stops and the pressure builds up again.
You can test this idea by filling the thermos with something hot, capping it, and immersing it in a tub of water. Watch for bubbles coming from one of the joints.