Okay, a few things:
1) A two-sample t-test does not assume the distributions of groups A and B are the same under the null hypothesis, even if both underlying distributions are normal. That would only follow if you also assumed the standard deviations are equal, which is a hefty assumption to make. The two-sample t-test tests whether, under the null, the means of the two groups are the same. But yes, the classical two-sample t-test assumes the underlying data are normally distributed. This is because you need not only the numerator to be normally distributed, but also the variance estimate to be (a scaled version of) a $\chi^2$. That said, the t-test is fairly robust to violations of normality. See here.
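To make the equal-variance point concrete, here is a minimal sketch using scipy (the group sizes, means, and spreads are made up for illustration). `equal_var=True` gives the classical pooled t-test; `equal_var=False` gives Welch's t-test, which drops the equal-standard-deviation assumption:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Two normal groups with equal means but very different spreads (assumed values).
a = rng.normal(loc=10.0, scale=1.0, size=40)
b = rng.normal(loc=10.0, scale=8.0, size=40)

# Classical (pooled) t-test: assumes equal variances.
t_pooled, p_pooled = stats.ttest_ind(a, b, equal_var=True)
# Welch's t-test: does not assume equal variances.
t_welch, p_welch = stats.ttest_ind(a, b, equal_var=False)

print(p_pooled, p_welch)
```

When the spreads really do differ, Welch's version is generally the safer default.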
2) It is true that, with a large enough sample, the sampling distribution of the mean of each group is going to be approximately normal. How good that approximation is depends on the underlying distribution of each group.
The general idea is this. If $X$ and $Y$ are independent, with $X$ having mean $\mu_X$ and standard deviation $\sigma_X$, and $Y$ having mean $\mu_Y$ and standard deviation $\sigma_Y$, and the respective samples $X_1,\dots,X_n$ and $Y_1,\dots,Y_m$ are large, then you can conclude that
$$
\frac{\bar{X}-\bar{Y}-(\mu_X-\mu_Y)}{\sqrt{\frac{\sigma^2_X}{n}+\frac{\sigma^2_Y}{m}}}$$
is approximately normal with mean 0 and standard deviation 1. So the critical values $z_{\alpha/2}$ can be used for testing. Moreover, $t_{\alpha/2,\nu}$ is close to $z_{\alpha/2}$ when $\nu$ is large (which happens when the sample sizes are large). So for large enough sample sizes, a t-test can be used.
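You can see how quickly the t critical values approach the z critical value with a quick scipy check (the degrees-of-freedom values below are arbitrary):

```python
from scipy import stats

alpha = 0.05
z = stats.norm.ppf(1 - alpha / 2)  # z_{alpha/2}, about 1.96
for nu in (5, 30, 100, 1000):
    t_val = stats.t.ppf(1 - alpha / 2, df=nu)  # t_{alpha/2, nu}
    print(nu, round(t_val, 4), round(t_val - z, 4))  # gap shrinks as nu grows
```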
There are ways to check this. (The standard rule of thumb is that each group should have a sample size of 30 or larger, but I am generally against such rules because there are plenty of cases where they fail.) One way to check it (sort of) is to create a bootstrap distribution of the mean and see how close to normal it looks.
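A minimal sketch of that bootstrap check, with a deliberately skewed made-up sample standing in for one group's data:

```python
import numpy as np
from scipy.stats import skew

rng = np.random.default_rng(42)
# A skewed sample (assumed for illustration) standing in for one group.
data = rng.exponential(scale=2.0, size=50)

# Bootstrap: resample with replacement and record the mean each time.
boot_means = np.array([
    rng.choice(data, size=data.size, replace=True).mean()
    for _ in range(10_000)
])

# Compare skewness of the raw data vs. the bootstrap distribution of the mean.
print(skew(data), skew(boot_means))
```

Plotting a histogram of `boot_means` (or comparing skewness, as above) gives a rough sense of whether the normal approximation for the mean is reasonable at your sample size.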
3) You can do better than approximate tests, though. When you are testing whether the means differ, your real question is often whether the locations differ. A test whose validity does not depend on normality is the Mann–Whitney U test. It does not test whether the means differ; under a location-shift assumption it tests whether the medians differ (more generally, whether values from one group tend to be larger than values from the other). In other words, it again tests whether one location differs from another. It may be a better option, and it has fairly high power overall.
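A quick sketch of the Mann–Whitney U test in scipy (the two skewed groups and their shift are made up for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Two skewed groups whose locations differ (assumed log-normal data).
x = rng.lognormal(mean=0.0, sigma=1.0, size=60)
y = rng.lognormal(mean=0.7, sigma=1.0, size=60)

# Rank-based test: no normality assumption on x or y.
u, p = stats.mannwhitneyu(x, y, alternative="two-sided")
print(u, p)
```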
To understand this, you first need to state a version of the Central Limit Theorem. Here's the "typical" statement of the central limit theorem:
Lindeberg–Lévy CLT. Suppose $X_1, X_2, \dots$ is a sequence of i.i.d. random variables with $E[X_i] = \mu$ and $\operatorname{Var}[X_i] = \sigma^2 < \infty$. Let $S_n := \frac{X_1+\cdots+X_n}{n}$. Then as $n$ approaches infinity, the random variables $\sqrt{n}(S_n - \mu)$ converge in distribution to a normal $N(0,\sigma^2)$, i.e.
$$\sqrt{n}\left(\left(\frac{1}{n}\sum_{i=1}^{n}X_i\right)-\mu\right)\ \xrightarrow{d}\ N\left(0,\sigma^2\right).$$
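A short simulation makes the statement concrete; here the i.i.d. draws are Exp(1) (a made-up choice with $\mu = 1$ and $\sigma^2 = 1$), so the standardized quantity should look like $N(0,1)$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 200, 20_000

# Heavily skewed i.i.d. data: Exp(1), so mu = 1 and sigma^2 = 1.
samples = rng.exponential(scale=1.0, size=(reps, n))

# sqrt(n) * (S_n - mu) should be approximately N(0, sigma^2) = N(0, 1).
standardized = np.sqrt(n) * (samples.mean(axis=1) - 1.0)
print(standardized.mean(), standardized.std())
```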
So, how does this differ from the informal description, and what are the gaps? There are several differences between your informal description and this description, some of which have been discussed in other answers, but not completely. So, we can turn this into three specific questions:
- What happens if the variables are not identically distributed?
- What if the variables have infinite variance, or infinite mean?
- How important is independence?
Taking these one at a time,
**Not identically distributed.** The best general results are the Lindeberg and Lyapunov versions of the central limit theorem. Basically, as long as the standard deviations don't grow too wildly, you can get a decent central limit theorem out of it.
Lyapunov CLT. Suppose $X_1, X_2, \dots$ is a sequence of independent random variables, each with finite expected value $\mu_i$ and variance $\sigma_i^2$. Define $s_n^2 = \sum_{i=1}^{n}\sigma_i^2$. If for some $\delta > 0$ Lyapunov's condition
$$\lim_{n\to\infty}\frac{1}{s_n^{2+\delta}}\sum_{i=1}^{n}\operatorname{E}\left[|X_i-\mu_i|^{2+\delta}\right]=0$$
is satisfied, then $\frac{1}{s_n}\sum_{i=1}^{n}(X_i-\mu_i)$ converges in distribution to a standard normal random variable as $n$ goes to infinity:
$$\frac{1}{s_n}\sum_{i=1}^{n}\left(X_i-\mu_i\right)\ \xrightarrow{d}\ N(0,1).$$
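A sketch of this in action, using independent but non-identically distributed uniforms whose spreads grow slowly (the growth rate is an arbitrary assumption chosen so Lyapunov's condition holds):

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 500, 10_000

# Independent but NOT identically distributed: X_i ~ Uniform(-a_i, a_i),
# with half-widths a_i growing slowly (assumed rate, for illustration).
a = np.arange(1, n + 1) ** 0.2
X = rng.uniform(-a, a, size=(reps, n))

sigma2 = a**2 / 3.0              # Var(Uniform(-a, a)) = a^2 / 3
s_n = np.sqrt(sigma2.sum())

# Each mu_i = 0, so (1/s_n) * sum_i (X_i - mu_i) should be approx N(0, 1).
standardized = X.sum(axis=1) / s_n
print(standardized.mean(), standardized.std())
```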
**Infinite variance.** Theorems similar to the central limit theorem exist for variables with infinite variance, but the conditions are significantly narrower than for the usual central limit theorem. Essentially, the tail of the probability distribution must be asymptotic to $|x|^{-\alpha-1}$ for $0 < \alpha < 2$. In this case, appropriately scaled summands converge to a Lévy alpha-stable distribution.
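The standard Cauchy distribution (tails $\sim |x|^{-2}$, i.e. $\alpha = 1$, with no mean or variance) shows how the ordinary CLT fails here; a small simulation illustrates that the sample mean's spread does not shrink as $n$ grows:

```python
import numpy as np

rng = np.random.default_rng(0)
# Cauchy: no mean, no variance, so the usual CLT does not apply.
for n in (100, 10_000, 100_000):
    means = rng.standard_cauchy(size=(100, n)).mean(axis=1)
    # Typical size of the sample mean does NOT shrink with n.
    print(n, np.median(np.abs(means)))
```

In fact the sample mean of i.i.d. standard Cauchy draws is itself standard Cauchy for every $n$.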
**Importance of independence.** There are many different central limit theorems for non-independent sequences of $X_i$, and they are all highly contextual. As Batman points out, there is one for martingales. This is an ongoing area of research, with many, many variations depending on the specific context of interest. This question on Math Stack Exchange is another post related to this question.
Best Answer
Confirming whuber's comment, this is not what the central limit theorem says. The distribution of the data does not get less skewed as the sample size increases. All you get is a more and more accurate picture of the shape of the true distribution in the population (just as you get more accurate estimates of the mean, the SD, etc.).
What the central limit theorem says (amongst other things) is that the sampling distribution of the mean gets closer to normal as the sample size gets bigger. The sampling distribution is the distribution of the means of the samples: in other words, if you took lots of samples of 50,000 items and plotted the means of those samples as a new distribution in their own right, that histogram would tend to normality, regardless of the shape of the original distribution. It is this that allows you to carry out a t-test regardless of the normality of the original distribution, when the sample size is large enough, and there can surely be no doubt that 50,000 is going to be 'large enough' in this context.