Your specific question is about why uniform gas is a low entropy state for the universe. The reason is that you make entropy by allowing the gas to self-gravitate and compress, releasing heat to the environment in the process. The end result is a black hole where the gas is compressed maximally, and these are the maximum entropy gravitational states.
But the uniform gas comes from a nearly uniform inflaton field oscillating over all space after inflation. This inflaton decays into a uniform density of matter, which then becomes uniform baryons and hydrogen. Ultimately, it is the uniformity of the energy density in the inflaton field which is responsible for the low entropy of the initial conditions, and this is linked to the dynamics of inflation.
The dynamics of inflation produce low entropy initial conditions without fine tuning. This seems like a paradox, because low entropy is fine tuning by definition: don't you need to choose a special state to have low entropy? The answer in inflation is that the state is only special in that there is a large positive cosmological constant; it is otherwise generic, in that it is a maximum entropy state given the large cosmological constant.
The theory of inflation explains the specialness of the initial conditions completely. This was proposed by Davies in 1983, but it is rejected by cosmologists. The rest of this answer discusses arguments that support Davies' position.
deSitter space
If you consider a deSitter space with some mass density added, and you look in a causal patch (meaning what one observer can see), the mass density gives an additional curvature without (significant) pressure and deforms deSitter space toward a sphere. There is a continuous deformation of deSitter space into the Einstein static universe, which is obtained by making the density of matter as large as possible.
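The deformation endpoint can be made quantitative. For pressureless matter in a closed static universe, the two Friedmann equations with $\dot a = \ddot a = 0$ fix both the density and the radius in terms of $\Lambda$ (a standard textbook computation, in geometric units $G = c = 1$):

$$ \ddot a = 0 \;\Rightarrow\; \Lambda = 4\pi\rho, \qquad \dot a = 0 \;\Rightarrow\; a = \frac{1}{\sqrt{\Lambda}} $$

This is the sense in which the Einstein static universe has the density of matter "as large as possible": $\rho = \Lambda/4\pi$ is the maximal uniform dust density the given cosmological constant can hold static.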
Any matter you add reduces the area of the cosmological horizon, and this is true for black holes as well. If you consider the deSitter-Schwarzschild exact solution, for example, you can have an isolated black hole in deSitter space:
$$ ds^2 = - f(r)\, dt^2 + \frac{dr^2}{f(r)} + r^2\, d\Omega^2 $$
$$ f(r) = 1 - \frac{2m}{r} - \frac{\Lambda r^2}{3} $$
but there are two horizons, and the causal patch is the region between the black hole horizon and the cosmological horizon. It is easy to check that the total horizon area, cosmological plus black hole, is maximized at m=0. It is also easy to check that there is a value of m, namely $m = 1/(3\sqrt{\Lambda})$, where the black hole radius and the cosmological radius degenerate. At this degeneration, the proper distance between the black hole and cosmological horizons stays constant; they do not collide except in the bad r coordinate, and the space turns into the Nariai solution, $dS_2 \times S^2$.
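Both "easy to check" claims can be verified numerically. The sketch below (illustrative choice $\Lambda = 1$ in geometric units) finds the horizons as the positive roots of $f(r) = 0$ and checks that the total horizon area is largest at m=0 and shrinks as mass is added, up to the degeneration mass:

```python
import numpy as np

LAM = 1.0  # cosmological constant, illustrative value (geometric units)

def horizons(m, lam=LAM):
    """Positive real roots of f(r) = 1 - 2m/r - lam r^2/3 = 0,
    i.e. of (lam/3) r^3 - r + 2m = 0, sorted ascending."""
    roots = np.roots([lam / 3.0, 0.0, -1.0, 2.0 * m])
    r = roots[np.abs(roots.imag) < 1e-7].real
    return np.sort(r[r > 1e-12])

def total_area(m, lam=LAM):
    """Black hole horizon area plus cosmological horizon area, 4 pi sum r_i^2."""
    return 4.0 * np.pi * np.sum(horizons(m, lam) ** 2)

m_nariai = 1.0 / (3.0 * np.sqrt(LAM))  # degeneration (Nariai) mass

# total horizon area is maximal at m = 0: a single horizon at r = sqrt(3/lam)
assert np.isclose(total_area(0.0), 12.0 * np.pi / LAM)

# ... and it strictly decreases as mass is added
areas = [total_area(m) for m in np.linspace(0.0, 0.95 * m_nariai, 6)]
assert all(a > b for a, b in zip(areas, areas[1:]))

# near the Nariai mass the two horizon radii degenerate toward 1/sqrt(lam)
r_bh, r_cosmo = horizons(0.9999 * m_nariai)
print(r_bh, r_cosmo)  # both close to 1.0
```

The monotonic decrease of the total area is the de Sitter analogue of the statement that adding matter lowers the entropy of the causal patch.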
Nariai dynamics
Imagine starting near a Nariai solution with additional matter between the two horizons. Both horizons are still black hole horizons; neither one is singled out as the cosmological horizon, as you can see by adding more matter at uniform density until you approach the limit of the Einstein static universe with two antipodal black holes.
This is a physical configuration of the static cosmology. So if you start with an Einstein static universe, which is unstable to clumping, and evolve it forward in time, you will produce black holes, and they will merge and grow.
If you take all the matter in the static universe and push it into one of the black holes, this black hole area will increase past the Nariai limit and it will become the cosmological horizon. At this point, the singularity runs away to infinity. If you push the matter into another black hole, the other black hole will be the cosmological horizon. It's up to you.
So if you start with the Einstein static universe, the black holes compete for matter, until eventually the biggest black hole surrounds all the others and becomes the cosmological horizon.
The lessons are the following:
- Cosmological horizons are the same stuff as black hole horizons. Their other side is described by black hole complementarity, just as for black holes. It is wrong to think of the universe in a global picture.
- deSitter space is the maximum entropy configuration of a positive cosmological constant universe, everything else eventually thermalizes into deSitter space.
- The global picture of black holes is not particularly physical, because the singularity runs away to infinity in the Nariai limit. There are cases where the black hole interior structure degenerates.
Inflation Produces Low Entropy Initial Conditions
The second point answers your question, because the early universe is in a deSitter phase. So given a large positive value of the cosmological constant in the early universe, the maximum entropy state is a deSitter space with a cosmological horizon of small area, and this is necessarily a low entropy initial condition for later times, during which the cosmological horizon grows.
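To see just how low the entropy is, note that in Planck units the deSitter entropy is a quarter of the horizon area, with horizon radius $1/H$. Taking an illustrative GUT-scale inflationary Hubble rate $H \sim 10^{-5}$ (an assumption; the actual inflationary scale is not known) and comparing with today's $H_0 \sim 10^{-61}$:

$$ S = \frac{A}{4} = \frac{\pi}{H^2} \;\Rightarrow\; S_{\rm inflation} \sim 10^{10}, \qquad S_{\rm today} \sim 10^{122} $$

The enormous gap between these two numbers is the room the universe has had to generate entropy as the cosmological horizon grew.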
There is no further explanation required for the low-entropy initial conditions. This is the same explanation as for all the other miracles of inflation: the killing of fluctuations, the flatness condition, the monopole problem. The whole point of inflation is to produce a theory of low entropy initial conditions, including gravity, and it does so naturally, because a small deSitter space is the maximum entropy state given a large cosmological constant, while still having a low entropy in absolute terms. This answer was first given by Davies, and it is just plain correct.
This plain-as-the-nose-on-your-face idea is not accepted despite the nearly thirty years since Davies' paper. I should add that Tom Banks and Leonard Susskind both now say similar things, although I don't want to put words in their mouths.
Here is my answer. I should preface it by warning that this is a subject that can provoke intense discussion, and I'm sure there are physicists who would disagree. You should be aware that I'm an expert on thermodynamics but not on general relativity.
But basically, as far as I understand it, the process of converting matter into black-hole-stuff is an irreversible one, in the usual macroscopic sense. Throwing your boxes of salt into identical black holes is somewhat analogous to what would happen if you emptied them into two identical vats of water. You would end up with two identical vats of salty water, with the same mass, temperature, and salt concentration, and the same entropy.
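A minimal illustration of the salt analogy, using the ideal entropy of mixing (a crude model: it ignores NaCl dissociation and non-ideal solution effects): the equilibrium entropy is a function of composition alone, so it cannot encode whether the salt went in as crystals or as powder.

```python
import math

R = 8.314  # gas constant, J/(mol K)

def ideal_mixing_entropy(n_solute, n_solvent):
    """Ideal entropy of mixing, Delta S = -n R sum_i x_i ln x_i,
    which depends only on the mole fractions of the final mixture."""
    n = n_solute + n_solvent
    return -n * R * sum(x * math.log(x)
                        for x in (n_solute / n, n_solvent / n))

# two vats with the same amounts of salt and water, different initial
# salt form (crystal vs powder): the final mixing entropy is identical,
# because the macroscopic final state has forgotten the difference
vat_from_crystals = ideal_mixing_entropy(1.0, 55.5)  # 1 mol salt, ~1 L water
vat_from_powder   = ideal_mixing_entropy(1.0, 55.5)
assert vat_from_crystals == vat_from_powder
```

The point of the triviality is exactly the point of the analogy: once you describe the vat by composition and temperature alone, the crystal-versus-powder information is gone from your description, though not from the microstate.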
The no hair theorem for black holes is an asymptotic one. It says that if you throw some stuff into a black hole and wait long enough, the black hole will become an arbitrarily good approximation to an "ideal" black hole (which is to say, a black hole solution of Einstein's equations), which can be completely described by its mass, charge and spin. It also says (I believe) that this convergence happens rather rapidly. But converging towards something is not the same as ever actually reaching it. In reality, nothing can cross the event horizon as seen from an outside perspective (see my answer to this question), it just gets very hard to detect because its light is red-shifted to extremely long wavelengths.
So in my view the apparent loss of information comes from assuming that the black hole actually becomes an ideal one rather than just closely approximating it. It's very similar to the question of how the entropy of an isolated vat of salt+water can increase as the salt dissolves, even though on the microscopic level, the laws of physics seem to preserve information. The resolution is that when you switch to a macroscopic description (in terms of temperature, pressure etc.), you throw away some information about the microscopic state. After the salt has dissolved, the information about its previous state (crystal or powder) is still there, but it's hidden in fine correlations between the molecules' motions. When you choose to describe the final state as an equilibrium ensemble you're basically admitting that those fine correlations can never practically be measured, and therefore choosing to ignore them. Similarly, when you choose to approximate a real black hole as an ideal one, you're basically choosing to ignore any information about what kind of salt was thrown into it in the past, on the basis that there's no longer any practical way to recover it. In both cases, the fundamental reason for the increase in entropy is the same.
Note that I'm not saying the box's entropy increases as it passes the event horizon. I'm actually saying that the box never crosses the event horizon, as seen from an outside point of view. That would take an infinite amount of time. However, the outside observer would very rapidly find the box very hard to see due to the red-shifting. At some point you, as the observer, might decide as an approximation that the box might as well have crossed the event horizon, since you basically can't detect it anymore. When you do this, your approximation has a higher entropy than the "real" black hole, and that's where the entropy increase comes from.
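The fading is quantitatively dramatic. For a Schwarzschild black hole, the light from an infalling object redshifts exponentially with the surface-gravity time constant (a standard result; the solar-mass figure is an order-of-magnitude illustration):

$$ \frac{\nu_{\rm obs}(t)}{\nu_{\rm emit}} \sim e^{-\kappa t}, \qquad \kappa = \frac{c^3}{4GM} \approx \left(2\times 10^{-5}\,{\rm s}\right)^{-1} \ \text{for}\ M = M_\odot $$

So within a millisecond or so the box is, for all practical purposes, undetectable, even though formally it never crosses the horizon in the outside observer's time.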
That might seem like a weird concept. But in fact all increases in entropy are due to approximations of one kind or another. In principle you could always reverse the velocities of every particle making up a system and watch it "run backwards in time" to its initial state (unscrambling an egg or whatever). So the information about the initial conditions is always still there. We just treat things as irreversible (i.e. information-destroying or entropy-producing) because it's a very useful approximation that helps us make predictions about macroscopic systems.
Of course, the observer falling in with the box of salt would not want to make the same approximation as the outside observer. It would be a bad approximation from the infalling observer's point of view, because she can still see the box perfectly clearly. (If it's a big enough black hole it won't even get torn apart.) But that's ok - although we often treat it as an observer-independent physical quantity, entropy is actually observer-dependent, even for everyday things like gases. See this rather wonderful paper by Edwin Jaynes (Jaynes, E. T., 1992, "The Gibbs Paradox," in Maximum-Entropy and Bayesian Methods, G. Erickson, P. Neudorfer, and C. R. Smith (eds.), Kluwer, Dordrecht).
The low-entropy initial state of the universe is an open problem without a satisfactory answer. Your question is the first time I've heard the suggestion that the initial state should have been a crystal; you remind me that the quark-gluon plasma, which was the state of the universe while it was too hot for nucleons to be stable, has been found to be a nearly perfect fluid, with a viscosity-to-entropy ratio close to the conjectured quantum lower bound.
Sean Carroll wrote a nice book on the subject a couple of years ago, From Eternity to Here, which I think was an extension of this paper.