I constantly see examples that pair continuous random variables with the normal/Gaussian distribution. So my question is: do all continuous random variables have a Gaussian distribution?
Solved – Are all continuous random variables normally distributed
mathematical-statistics, normal-distribution
Related Solutions
It's important to make the distinction between a sum of normal random variables and a mixture of normal random variables.
As an example, consider independent random variables $X_1\sim N(\mu_1,\sigma_1^2)$, $X_2\sim N(\mu_2,\sigma_2^2)$, $\alpha_1\in\left[0,1\right]$, and $\alpha_2=1-\alpha_1$.
Let $Y=X_1+X_2$. $Y$ is the sum of two independent normal random variables. What's the probability that $Y$ is less than or equal to zero, $P(Y\leq0)$? It's simply the probability that a $N(\mu_1+\mu_2,\sigma_1^2+\sigma_2^2)$ random variable is less than or equal to zero because the sum of two independent normal random variables is another normal random variable whose mean is the sum of the means and whose variance is the sum of the variances.
Let $Z$ be a mixture of $X_1$ and $X_2$ with respective weights $\alpha_1$ and $\alpha_2$. Notice that $Z\neq \alpha_1X_1+\alpha_2X_2$. The fact that $Z$ is defined as a mixture with those specific weights means that the CDF of $Z$ is $F_Z(z)=\alpha_1F_1(z)+\alpha_2F_2(z)$, where $F_1$ and $F_2$ are the CDFs of $X_1$ and $X_2$, respectively. So what is the probability that $Z$ is less than or equal to zero, $P(Z\leq0)$? It's $F_Z(0)=\alpha_1F_1(0)+\alpha_2F_2(0)$.
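The distinction above is easy to check numerically. The following sketch (with arbitrary illustrative parameters $\mu_1=1$, $\sigma_1=2$, $\mu_2=-0.5$, $\sigma_2=1$, $\alpha_1=0.3$) computes $P(Y\leq 0)$ for the sum and $P(Z\leq 0)$ for the mixture, and verifies the mixture CDF by Monte Carlo, assuming NumPy and SciPy are available:

```python
import numpy as np
from scipy.stats import norm

# Illustrative parameters (not from the original text)
mu1, s1 = 1.0, 2.0
mu2, s2 = -0.5, 1.0
a1 = 0.3
a2 = 1 - a1

# Sum: Y = X1 + X2 ~ N(mu1 + mu2, s1^2 + s2^2)
p_sum = norm.cdf(0, loc=mu1 + mu2, scale=np.sqrt(s1**2 + s2**2))

# Mixture: F_Z(0) = a1 * F1(0) + a2 * F2(0)
p_mix = a1 * norm.cdf(0, loc=mu1, scale=s1) + a2 * norm.cdf(0, loc=mu2, scale=s2)

# Monte Carlo check of the mixture: draw from X1 with prob a1, else from X2
rng = np.random.default_rng(0)
n = 200_000
comp = rng.random(n) < a1
z = np.where(comp, rng.normal(mu1, s1, n), rng.normal(mu2, s2, n))

print(p_sum, p_mix, (z <= 0).mean())
```

Note that the simulated mixture draws the *component* at random and then samples from it; that is what makes it different from the weighted sum $\alpha_1X_1+\alpha_2X_2$, which would again be normal.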
The calculation of such probabilities has been studied extensively by communications engineers under the name $M$-ary orthogonal signaling, where the model is that one of $M$ equal-energy, equally likely orthogonal signals is transmitted and the receiver attempts to decide which one was sent by examining the outputs of $M$ filters matched to the signals. Conditioned on the identity of the transmitted signal, the sampled outputs of the matched filters are (conditionally) independent unit-variance normal random variables. The sampled output of the filter matched to the transmitted signal is a $N(\mu,1)$ random variable, while the outputs of all the other filters are $N(0,1)$ random variables.
The conditional probability of a correct decision (which in the present context is the event $C = \{X_0 > \max_i X_i\}$) conditioned on $X_0 = \alpha$ is $$P(C \mid X_0 = \alpha) = \prod_{i=1}^n P\{X_i < \alpha \mid X_0 = \alpha\} = \left[\Phi(\alpha)\right]^n$$ where $\Phi(\cdot)$ is the cumulative distribution function of a standard normal random variable, and hence the unconditional probability is $$P(C) = \int_{-\infty}^{\infty}P(C \mid X_0 = \alpha)\, \phi(\alpha-\mu)\,\mathrm d\alpha = \int_{-\infty}^{\infty}\left[\Phi(\alpha)\right]^n \phi(\alpha-\mu)\,\mathrm d\alpha$$ where $\phi(\cdot)$ is the standard normal density function. There is no closed-form expression for the value of this integral, which must be evaluated numerically.

Engineers are also interested in the complementary event -- that the decision is in error -- but do not like to compute it as $$P\{X_0 < \max_i X_i\} = P(E) = 1-P(C)$$ because this requires very careful evaluation of the integral for $P(C)$ to an accuracy of many significant digits, and such evaluation is both difficult and time-consuming. Instead, the integral for $1-P(C)$ can be integrated by parts to get $$P\{X_0 < \max_i X_i\} = \int_{-\infty}^{\infty} n \left[\Phi(\alpha)\right]^{n-1}\phi(\alpha)\, \Phi(\alpha - \mu)\,\mathrm d\alpha.$$ This integral is easier to evaluate numerically, and its value as a function of $\mu$ is graphed and tabulated (though unfortunately only for $n \leq 20$) in Chapter 5 of Telecommunication Systems Engineering by Lindsey and Simon, Prentice-Hall 1973, Dover Press 1991.

Alternatively, engineers use the union bound or Bonferroni inequality $$\begin{align*} P\{X_0 < \max_i X_i\} &= P\left\{(X_0 < X_1)\cup (X_0 < X_2) \cup \cdots \cup (X_0 < X_n)\right\}\\ &\leq \sum_{i=1}^{n}P\{X_0 < X_i\}\\ &= nQ\left(\frac{\mu}{\sqrt{2}}\right) \end{align*}$$ where $Q(x) = 1-\Phi(x)$ is the complementary cumulative normal distribution function.
From the union bound, we see that the desired value $0.01$ for $P\{X_0 < \max_i X_i\}$ is bounded above by $60\cdot Q(\mu/\sqrt{2})$, a bound whose value is $0.01$ at $\mu = 5.09\ldots$. This is slightly larger than the more exact value $\mu = 4.919\ldots$ obtained by @whuber via numerical integration.
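For readers who want to reproduce these numbers, here is a minimal sketch of the numerical evaluation, assuming SciPy; $n = 60$ and $\mu = 4.919$ are the values from the discussion above:

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

n, mu = 60, 4.919  # n filters; mu is the value @whuber reported for P(E) = 0.01

# Error probability via the integrated-by-parts form:
# P(E) = integral of n * Phi(a)^(n-1) * phi(a) * Phi(a - mu) da
p_err, _ = quad(
    lambda a: n * norm.cdf(a)**(n - 1) * norm.pdf(a) * norm.cdf(a - mu),
    -np.inf, np.inf,
)

# Union (Bonferroni) bound: n * Q(mu / sqrt(2)), with Q = norm.sf
union_bound = n * norm.sf(mu / np.sqrt(2))

print(p_err, union_bound)  # p_err should come out near 0.01, below the bound
```

The integrand $n\Phi(\alpha)^{n-1}\phi(\alpha)$ is just the density of the maximum of $n$ standard normals, so the integral is well behaved and `quad` handles the infinite limits directly.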
More discussion and details about $M$-ary orthogonal signaling can be found on pp. 161-179 of my lecture notes for a class on communication systems.
Best Answer
No.
Lots of real-life variables have distributions that are better described by other distributions. $t$-distributions (heavier tails) are common, as are various skewed distributions: for example, many real measurements must be non-negative, but can have a long tail of high values. Quite a lot of real-world data consists of counts or similar integer data, which is often better described by a Poisson distribution.
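As a quick illustration of these points (with arbitrary parameters chosen for the sketch, assuming NumPy), skewed positive "measurements" can be simulated with a lognormal and count data with a Poisson, and simple summaries already show how poorly a normal description would fit:

```python
import numpy as np

rng = np.random.default_rng(1)

# Positive, right-skewed "measurements" (lognormal, illustrative parameters)
meas = rng.lognormal(mean=0.0, sigma=1.0, size=100_000)

# Non-negative integer counts (Poisson, illustrative rate)
counts = rng.poisson(lam=3.0, size=100_000)

print(meas.min() > 0)               # no negative values, unlike any normal
print(np.mean(meas > meas.mean()))  # well below 0.5: the long right tail
                                    # pulls the mean above the median
print(counts.min() >= 0)            # counts are non-negative integers
```

For a symmetric (e.g. normal) distribution the fraction of values above the mean would be about 0.5; for the skewed sample it is noticeably smaller.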
In my personal experience, in epidemiology, bio-medicine, and sociology, genuinely 'normal' distributions, that is, real data best described by a normal distribution, are uncommon, but it does depend on the field you work in and exactly what data you are looking at.