Assuming a normal distribution for the random effects is computationally convenient. However, it can be rather restrictive. In general, an incorrect distributional assumption for the random effects can adversely affect statistical inference; see e.g.
1) Agresti, A., Caffo, B. & Ohman-Strickland, P. (2004). Examples in which misspecification of a random effects distribution reduces efficiency, and possible remedies, Computational Statistics and Data Analysis, 47, 639-653.
2) Heagerty, P. J. & Kurland, B. F. (2001). Misspecified maximum likelihood estimates and generalised linear mixed models, Biometrika, 88, 973-985.
3) Litière, S., Alonso, A. & Molenberghs, G. (2007). Type I and type II error under random-effects misspecification in generalized linear mixed models, Biometrics, 63, 1038-1044.
This motivates the search for mixed models with more flexible random-effects distributions. For instance, there is a sizeable literature on non-parametric modeling, such as Dirichlet process models and stick-breaking processes.
Komárek, A. & Lesaffre, E. (2008), Computational Statistics and Data Analysis, 52, 3441-3458, have proposed a parametric extension, replacing the normal distribution in generalized linear mixed models with a penalized Gaussian mixture distribution.
Others have proposed the skew-normal distribution as an extension of the normal. For example, see Hosseini, F., Eidsvik, J. & Mohammadzadeh, M. (2011). Approximate Bayesian inference in spatial generalized linear mixed models with skew normal latent variables, Computational Statistics and Data Analysis, 55, 1791-1806.
Best Answer
The main reason the normal distribution is so popular is that it works (it is at least good enough in many situations). The reason it works is really the Central Limit Theorem. Rather than trying to look beyond the CLT, I think you (and others) would do better to appreciate the CLT itself (I have a cross-stitch of the CLT hanging on my wall as I type).
We usually teach and think about the CLT in terms of a sample mean (and that is a powerful use of it), but it extends much further than that. The CLT also means that any variable we measure that is the result of combining many effects (many relative to the degree of dependence among the pieces) will be approximately normal.
For example: a person's height is determined by many small effects, including genetics (several genes contribute to height), nutrition (not just good/bad, but what was actually eaten each day while the person was growing), environmental pollution (again, each day contributing a small effect), and other things. So heights (within sex/race combinations) are approximately normal.
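To see this concretely, here is a minimal simulation sketch (the exponential effect distribution and the counts are made-up illustrative assumptions, not a model of real height genetics): summing a couple hundred small, strongly skewed effects already yields a nearly symmetric, normal-looking total.

```python
import numpy as np
from scipy.stats import skew

rng = np.random.default_rng(0)

# Each simulated "person" is the sum of many small, independent,
# strongly skewed contributions (exponential draws stand in for
# the individual genetic / daily-environment effects).
n_people, n_effects = 50_000, 200
effects = rng.exponential(scale=0.5, size=(n_people, n_effects))
trait = effects.sum(axis=1)

# A single effect is very skewed (skewness 2 for an exponential),
# but the sum of 200 of them is nearly symmetric (~ 2 / sqrt(200)).
print(skew(rng.exponential(scale=0.5, size=n_people)))  # ~2.0
print(skew(trait))                                      # ~0.14
```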
Annual rainfall for a specific area is the sum of the daily rainfall over the year, and while daily rainfall is probably very far from normal (zero-inflated), when you add all those days together you get something much closer to normal.
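A small simulation makes the same point for rainfall; the 70% dry-day rate and the gamma wet-day amounts below are purely illustrative assumptions, not fitted parameters for any real location.

```python
import numpy as np
from scipy.stats import skew

rng = np.random.default_rng(1)

# Hypothetical zero-inflated daily rainfall: ~70% of days are dry,
# and wet-day amounts follow a right-skewed gamma distribution.
n_years, n_days = 20_000, 365
wet = rng.random((n_years, n_days)) < 0.3
amounts = rng.gamma(shape=2.0, scale=5.0, size=(n_years, n_days))
daily = np.where(wet, amounts, 0.0)   # mostly zeros, long right tail
annual = daily.sum(axis=1)            # one total per simulated year

print(skew(daily.ravel()))  # strongly skewed: far from normal
print(skew(annual))         # near 0: annual totals look much more normal
```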
Binomial distributions are just sums of Bernoullis, and a Poisson distribution can be written as the sum of smaller Poissons, so it should not be a surprise that either can be approximated by a normal (if enough pieces are added together).
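For instance, a quick sketch (the choices n = 100 and p = 0.3 are arbitrary) compares the exact binomial probabilities with the normal density the CLT suggests:

```python
import numpy as np
from scipy.stats import binom, norm

# A Binomial(n, p) variable is a sum of n Bernoulli(p) trials, so for
# moderately large n the CLT gives the familiar N(np, np(1-p)) curve.
n, p = 100, 0.3
k = np.arange(15, 46)
exact = binom.pmf(k, n, p)
approx = norm.pdf(k, loc=n * p, scale=np.sqrt(n * p * (1 - p)))

# The pointwise discrepancy is already tiny at n = 100.
print(np.max(np.abs(exact - approx)))  # on the order of 1e-3
```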
Most exceptions come when common values are close to a natural boundary (rainfall in the desert, test scores where many students get 100% or close to it, etc.) or when there is a single very strong contributor, or a small number of them (height when both sexes are included, or across a spread of ages while kids are still growing). Otherwise there are many things that can be approximated using the normal distribution (and things become even more normal when you average them together from a sample).
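To see the boundary effect concretely, re-running the rainfall sketch above with a desert-like 2% wet-day rate (again a purely illustrative number) leaves the annual totals visibly skewed:

```python
import numpy as np
from scipy.stats import skew

rng = np.random.default_rng(2)

# "Desert" variant of the rainfall sketch: only ~2% of days are wet, so
# annual totals hug their natural boundary at zero and the CLT kicks in
# much more slowly (these parameters are again purely illustrative).
n_years, n_days = 20_000, 365
wet = rng.random((n_years, n_days)) < 0.02
amounts = rng.gamma(shape=2.0, scale=5.0, size=(n_years, n_days))
annual = np.where(wet, amounts, 0.0).sum(axis=1)

print(skew(annual))          # noticeably positive: still visibly skewed
print((annual == 0).mean())  # a small fraction of years have no rain at all
```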
So why do we need any more justification than the CLT? (Not to take away from the other great answers.)
dismount soapbox
Addition
Since it appears that at least 2 people want to see the cross-stitch (based on comments below), here is a picture:
[image: cross-stitch of the Central Limit Theorem]
I also have cross-stitches of Bayes' theorem and the mean value theorem for integrals, but they are off topic for this question.