I'll leave out the biological details and experiments and describe just the problem at hand and what I have done statistically. I would like to know if it's right, and if not, how to proceed. If the data (or my explanation) isn't clear enough, I'll try to explain better by editing.
Suppose I have two groups/observations, X and Y, with sizes $N_x=215$ and $N_y=40$. I would like to know if the means of these two groups are equal. My first question is:

If the assumptions are satisfied, is it appropriate to use a parametric two-sample t-test here? I ask because my understanding is that it's usually applied when the sample size is small.

I plotted histograms of both X and Y, and neither was normally distributed, which violates one of the assumptions of a two-sample t-test. My confusion is that I consider them to be two populations, which is why I checked for normality. But then I am about to perform a two-SAMPLE t-test… Is this right?

From the central limit theorem, I understand that if you sample repeatedly (with or without replacement, depending on your population size) and compute the average of each sample, those averages will be approximately normally distributed. And the mean of these averages will be a good estimate of the population mean. So I decided to do this on both X and Y, 1000 times each, and recorded the mean of each sample as a data point. The plot was very much normally distributed. The means for X and Y were 4.2 and 15.8 (the same as the population values, give or take 0.15) and the variances were 0.95 and 12.11.
I performed a t-test on these two sets of sample means (1000 data points each), assuming unequal variances because they are very different (0.95 and 12.11). The null hypothesis was rejected.
Does this make sense at all? Is this a correct and meaningful approach, would a two-sample z-test be sufficient, or is it totally wrong?
I also performed a nonparametric Wilcoxon test just to be sure (on original X and Y) and the null hypothesis was convincingly rejected there as well. In the event that my previous method was utterly wrong, I suppose doing a nonparametric test is good, except for statistical power maybe?
In both cases, the means were significantly different. However, I would like to know if either or both of the approaches are faulty or totally wrong and, if so, what the alternative is.
Best Answer
The idea that the t-test is only for small samples is a historical holdover. Yes, it was originally developed for small samples, but there is nothing in the theory that distinguishes small from large. In the days before computers were common for doing statistics, t-tables often only went up to around 30 degrees of freedom, and the normal was used beyond that as a close approximation to the t distribution. This was a convenience to keep the t-table's size reasonable. Now, with computers, we can do t-tests for any sample size (though for very large samples the difference between the results of a z-test and a t-test is very small). The main idea is to use a t-test when the sample is used to estimate the standard deviations, and a z-test if the population standard deviations are known (very rare).
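To make that distinction concrete, here are the two statistics in their unequal-variance form. The symbols are notation introduced here, not from the question: $\bar{x}, \bar{y}$ are the sample means, $s_x, s_y$ the sample standard deviations, and $\sigma_x, \sigma_y$ the (usually unknown) population standard deviations.

$$
t = \frac{\bar{x} - \bar{y}}{\sqrt{s_x^2/N_x + s_y^2/N_y}},
\qquad
z = \frac{\bar{x} - \bar{y}}{\sqrt{\sigma_x^2/N_x + \sigma_y^2/N_y}}
$$

The only difference is whether the standard deviations are estimated or known. The first is compared to a t distribution (with Welch-Satterthwaite degrees of freedom in this unequal-variance form), the second to the standard normal, and the t distribution approaches the normal as the degrees of freedom grow.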
The Central Limit Theorem lets us use normal-theory inference (t-tests in this case) even if the population is not normally distributed, as long as the sample sizes are large enough. This does mean that your test is approximate (but with your sample sizes, the approximation should be very good).
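One way to get a feel for how good the approximation is would be a small simulation. The sketch below is only an illustration under assumptions that are not from the question: both groups are drawn from the same skewed (exponential) population, so the null hypothesis is true, and the unequal-variance t statistic is compared to the normal cutoff 1.96. The function names `welch_t` and `false_positive_rate` are made up for this example.

```python
import math
import random
import statistics


def welch_t(x, y):
    """Return (t, df) for the unequal-variance (Welch) two-sample t statistic."""
    vx = statistics.variance(x) / len(x)  # squared standard error of mean(x)
    vy = statistics.variance(y) / len(y)  # squared standard error of mean(y)
    t = (statistics.fmean(x) - statistics.fmean(y)) / math.sqrt(vx + vy)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df


def false_positive_rate(n_x=215, n_y=40, reps=2000, seed=0):
    """Fraction of simulated data sets in which a true null is rejected.

    ASSUMPTION: both groups come from the same exponential population
    (skewed, mean 1); the real data may behave differently.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        x = [rng.expovariate(1.0) for _ in range(n_x)]
        y = [rng.expovariate(1.0) for _ in range(n_y)]
        t, _df = welch_t(x, y)
        # 1.96 is the two-sided 5% normal cutoff; a t cutoff based on _df
        # would be slightly larger.
        if abs(t) > 1.96:
            rejections += 1
    return rejections / reps


print(false_positive_rate())
```

If the approximation is working, the printed rate should land near the nominal 5%. A rate far from that under a population shaped like yours would be a reason to be more cautious.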
The Wilcoxon test is not a test of means (unless you know that the populations are perfectly symmetric and other unlikely assumptions hold). If the means are the main point of interest, then the t-test is probably the better one to quote.
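A quick way to see this is to build two groups whose population means are equal but whose shapes differ, and look at what the rank-sum statistic does. The sketch below uses invented populations (exponential with mean 1 versus normal with mean 1 and standard deviation 0.1) and a hand-rolled normal approximation that assumes no ties; `rank_sum_z` is a name made up for this example, not a library function.

```python
import bisect
import math
import random
import statistics


def rank_sum_z(x, y):
    """Normal-approximation z for the Wilcoxon rank-sum (Mann-Whitney) statistic.

    U counts the pairs with x < y. ASSUMPTION: no ties (continuous data).
    """
    xs = sorted(x)
    u = sum(bisect.bisect_left(xs, v) for v in y)  # pairs with x < y
    n, m = len(x), len(y)
    mean_u = n * m / 2                          # expected U under the null
    sd_u = math.sqrt(n * m * (n + m + 1) / 12)  # null standard deviation of U
    return (u - mean_u) / sd_u


rng = random.Random(0)
x = [rng.expovariate(1.0) for _ in range(2000)]  # skewed, population mean 1
y = [rng.gauss(1.0, 0.1) for _ in range(2000)]   # symmetric, population mean 1

print(statistics.fmean(x), statistics.fmean(y), rank_sum_z(x, y))
```

The two sample means come out close to each other, yet the z value is far from zero, because most of the exponential values fall below 1 while the normal values cluster around it. The rank-sum test is responding to that shift in ordering, not to a difference in means.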
Given that your standard deviations are so different, and the shapes are non-normal and possibly different from each other, the difference in the means may not be the most interesting thing going on here. Think about the science and what you want to do with your results. Are decisions being made at the population level or the individual level? Consider this example: you are comparing 2 drugs for a given disease. On drug A, half the sample died immediately and the other half recovered in about a week; on drug B, all survived and recovered, but the time to recovery was longer than a week. In this case, would you really care about which mean recovery time was shorter? Or replace the half dying on A with that half just taking a really long time to recover (longer than anyone in the B group). When deciding which drug to take, I would want the full information, not just which was quicker on average.