You don't compare the individual points to conclude a treatment effect. You see whether the lines for the treatment and control are different.
In some circumstances, the fitted lines might be parallel, and just the difference in intercept is of interest. In others, both the intercept and slope might differ, and any difference would be of interest.
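One common way to formalize "are the two lines different?" is to fit a single model with a group indicator and a group-by-x interaction, then test both extra terms at once with an F-test. The sketch below assumes ordinary least squares, normal errors, and a common error variance in both groups; the function name `compare_lines` and its arguments are made up for illustration.

```python
import numpy as np
from scipy import stats

def compare_lines(x, y, group):
    """F-test of 'one common line' against 'separate intercepts and slopes'.

    group is 0 for control and 1 for treatment (hypothetical coding).
    Assumes iid normal errors with the same variance in both groups.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    g = np.asarray(group, dtype=float)
    n = x.size

    # Full model: intercept, slope, intercept shift, slope shift.
    X_full = np.column_stack([np.ones(n), x, g, g * x])
    # Reduced model: a single line shared by both groups.
    X_red = X_full[:, :2]

    def rss(X):
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ coef
        return resid @ resid

    rss_full, rss_red = rss(X_full), rss(X_red)
    df_num, df_den = 2, n - 4  # two extra parameters in the full model
    f_stat = ((rss_red - rss_full) / df_num) / (rss_full / df_den)
    p_value = stats.f.sf(f_stat, df_num, df_den)
    return f_stat, p_value
```

If only a shift in intercept is of interest (parallel lines), you would drop the `g * x` column and test the single `g` coefficient instead.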
Testing point vs line in ordinary regression (not errors-in-variables, which is more complicated):
It's not correct to check whether data values from another sample fall inside the confidence interval for the line, because that interval only reflects uncertainty in the fitted mean, while the data values themselves have noise.
Call the first sample $(\underline{x}_1,\underline{y}_1)$, and the second one $(\underline{x}_2,\underline{y}_2)$. Your model for the first sample is $y_{1,i} = \alpha_1 + \beta_1 x_{1,i} + \varepsilon_i$, with the usual iid $N(0,\sigma^2)$ assumption on the errors.
You want to see if a particular point $(x_{2,j},y_{2,j})$ is consistent with the first sample. Equivalently, you want to check whether an interval for $y_{2,j} - \left(\alpha_1 + \beta_1 x_{2,j}\right)$ includes 0 (notice the point is from the second sample, the line from the first).
The usual way to obtain such a CI would be to construct a pivotal quantity, though one could simulate or bootstrap as well.
However, since in this illustration we're doing it for a single point, under normal assumptions and with ordinary regression conditions, we can save some effort: this is a solved problem. It corresponds to (assuming sample 1 and sample 2 have a common population variance) checking whether one of the sample 2 observations lies within a prediction interval based on sample 1, rather than a confidence interval.
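As a rough sketch of that check, the function below computes the standard prediction interval for a new observation at `x_new` from a simple linear regression fitted to sample 1. It assumes normal iid errors and the same error variance in both samples; `prediction_interval` and the toy data are illustrative only, not anything from the original question.

```python
import numpy as np
from scipy import stats

def prediction_interval(x1, y1, x_new, level=0.95):
    """Prediction interval at x_new from an OLS line fitted to (x1, y1)."""
    x1 = np.asarray(x1, dtype=float)
    y1 = np.asarray(y1, dtype=float)
    n = x1.size
    xbar = x1.mean()
    sxx = np.sum((x1 - xbar) ** 2)

    # Least-squares estimates of slope and intercept.
    beta = np.sum((x1 - xbar) * (y1 - y1.mean())) / sxx
    alpha = y1.mean() - beta * xbar

    # Residual variance estimate on n - 2 degrees of freedom.
    resid = y1 - (alpha + beta * x1)
    s2 = np.sum(resid ** 2) / (n - 2)

    # The leading "1 +" is the noise in the new observation itself;
    # without it this would be a confidence interval for the line.
    se_pred = np.sqrt(s2 * (1 + 1 / n + (x_new - xbar) ** 2 / sxx))
    tcrit = stats.t.ppf(0.5 + level / 2, df=n - 2)
    fit = alpha + beta * x_new
    return fit - tcrit * se_pred, fit + tcrit * se_pred

# Toy usage with made-up numbers: is a sample-2 point consistent with sample 1?
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y1 = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 11.9])
x2_j, y2_j = 3.5, 9.0
lower, upper = prediction_interval(x1, y1, x2_j)
consistent = lower <= y2_j <= upper
```

Note this is valid for one pre-specified point; checking many sample-2 points this way raises a multiple-comparisons issue.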
Bootstrap sample your training data many times (let's say, N times) and train a model from each bootstrapped sample (giving N models). Calculate a prediction on your test set using each model (giving N predictions for each point in your test set). This will allow you to calculate a confidence band for each test point's prediction. For plotting purposes, you may find it useful to fit a LOESS curve (or some other type of smoothing method) to the plot of insect size vs. CI for each element of your test set, giving an estimated mean confidence as a function of insect size.
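A minimal sketch of that bootstrap procedure follows. The answer doesn't say what kind of model is being trained, so a straight-line fit via `np.polyfit` stands in for "the model" here; that choice, the percentile method for the band, and the name `bootstrap_confidence_band` are all assumptions for illustration.

```python
import numpy as np

def bootstrap_confidence_band(x_train, y_train, x_test,
                              n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap band for the predictions at each test point."""
    rng = np.random.default_rng(seed)
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    x_test = np.asarray(x_test, dtype=float)
    n = x_train.size

    preds = np.empty((n_boot, x_test.size))
    for b in range(n_boot):
        # Resample training rows with replacement.
        idx = rng.integers(0, n, size=n)
        # Stand-in model: a straight line (assumption, swap in your own).
        slope, intercept = np.polyfit(x_train[idx], y_train[idx], deg=1)
        preds[b] = intercept + slope * x_test

    # Band limits are quantiles of the N predictions at each test point.
    tail = (1 - level) / 2
    lower, upper = np.quantile(preds, [tail, 1 - tail], axis=0)
    return lower, upper
```

The band describes uncertainty in the fitted prediction, not the spread of new observations around it; smoothing `upper - lower` against the predictor would give the kind of plot described above.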
Regarding your comment "I'm simply not used to the visual aspects of graphing smaller data sets": here is your data with 3, 4, 5, and finally all 6 data points. Notice the progressively tighter confidence intervals. All graphs have the same scales. This is effectively a visual illustration of Peter Flom's advice in the comments.
Three Data Points:
Four Data Points:
Five Data Points:
Six Data Points: