In a 2 x 2 design, it is fairly easy to run a bootstrap test of the interaction. Let us define the four conditions as $a$, $b$, $c$ and $d$. Conditions $a$ and $b$ are the two levels of the question factor (Q1 and Q2) under treatment, and $c$ and $d$ are the same two levels under control. Writing $M_a$, $M_b$, $M_c$ and $M_d$ for the four condition means, the mean interaction contrast (MIC) is given by
$\mathrm{MIC} = (M_a + M_d) - (M_b + M_c).$
It quantifies the amount of non-additivity in the dataset. If MIC is zero, the effects are purely additive: there may be a main effect of questions (an increase, or decrease, from Q1 to Q2) and a main effect of conditions (an increase, or decrease, from treatment to control), but no interaction. In that case, the mean in $b$ is a few points above the mean in $a$, and the mean in $d$ is the same number of points above the mean in $c$. The same holds across conditions: the control means differ from the treatment means by a constant amount. Defining the first increment as $d_1$ and the second as $d_2$, the means are thus
$
\left(\begin{matrix} M_a & M_b \\ M_c & M_d \end{matrix}\right) =
\left(\begin{matrix} M_a & M_a+d_1 \\ M_c & M_c+d_1 \end{matrix}\right) =
\left(\begin{matrix} M_a & M_a+d_1 \\ M_a+d_2 & M_a+d_2+d_1 \end{matrix}\right)
$
so that
$\mathrm{MIC} = (M_a+(M_a+d_2+d_1)) - ((M_a+d_1)+(M_a+d_2)) = 0$.
Thus, to obtain a bootstrap estimate, resample each of the four groups with replacement (keeping each group's original size) and compute MIC on the resampled data. Repeat this a large number of times (say 5,000). Finally, find the interval that contains the central 95% of the resulting MIC values, i.e. from the 2.5th to the 97.5th percentile. If this interval includes 0, the interaction is not significantly different from zero at the 5% level.
This reasoning works for a fully between-group design. In a mixed design, resample subjects rather than individual scores, so that each subject's two measures stay paired, and then compute MIC.
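The procedure above can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not a reference implementation: the function names (`mic`, `percentile_interval`, `bootstrap_mic_ci`, `bootstrap_mic_ci_mixed`) and the scores in the usage example are made up, and the mixed-design variant assumes the question factor is the within-subject one, with each subject stored as a `(Q1, Q2)` pair.

```python
import random
import statistics


def mic(a, b, c, d):
    """Mean interaction contrast: (M_a + M_d) - (M_b + M_c)."""
    return (statistics.fmean(a) + statistics.fmean(d)) - (
        statistics.fmean(b) + statistics.fmean(c)
    )


def percentile_interval(values, level=0.95):
    """Central `level` share of the values (simple percentile method)."""
    ordered = sorted(values)
    n = len(ordered)
    tail = (1.0 - level) / 2.0
    lo = ordered[int(tail * n)]
    hi = ordered[min(n - 1, int((1.0 - tail) * n))]
    return lo, hi


def bootstrap_mic_ci(a, b, c, d, n_boot=5000, level=0.95, seed=0):
    """Fully between-group design: resample each group independently."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        # Draw with replacement, keeping each group at its original size.
        resampled = [rng.choices(g, k=len(g)) for g in (a, b, c, d)]
        stats.append(mic(*resampled))
    return percentile_interval(stats, level)


def bootstrap_mic_ci_mixed(treat_pairs, ctrl_pairs, n_boot=5000, level=0.95, seed=0):
    """Mixed design: resample subjects, keeping each (Q1, Q2) pair intact."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        t = rng.choices(treat_pairs, k=len(treat_pairs))
        c = rng.choices(ctrl_pairs, k=len(ctrl_pairs))
        a, b = zip(*t)        # treatment: Q1 scores, Q2 scores
        c_q1, d = zip(*c)     # control: Q1 scores, Q2 scores
        stats.append(mic(a, b, c_q1, d))
    return percentile_interval(stats, level)


if __name__ == "__main__":
    # Made-up scores, for illustration only.
    a = [12, 15, 11, 14, 13, 16]
    b = [17, 19, 16, 20, 18, 21]
    c = [11, 13, 12, 14, 10, 15]
    d = [13, 12, 14, 15, 11, 16]
    low, high = bootstrap_mic_ci(a, b, c, d)
    print("95% bootstrap interval for MIC:", (low, high))
    print("interaction differs from zero:", not (low <= 0 <= high))
```

The percentile method used here is the simplest bootstrap interval; other interval types (for example bias-corrected ones) would follow the same resampling loop and only change how the interval is read off the sorted values.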
Best Answer
Just ignoring missing data (i.e. analyzing only the observed data) assumes that the observed data are completely representative of the missing data, which requires that the missingness has no connection whatsoever with the outcomes you are interested in (this is called "missing completely at random", MCAR). This is very rarely the case. Additionally, even if MCAR did hold, so that analyzing only the complete cases were valid, it would not be the most efficient analysis (a mixed model assuming MAR, see below, is usually more efficient).
Doing a mixed effects model that implicitly imputes the missing values assumes that missingness can be explained by randomness, the model covariates, and the observed values (this is called "missing at random", MAR). An analysis valid under MAR is also valid under MCAR (MCAR being a special case of MAR). There are also other options besides a mixed model, e.g. some kind of multiple imputation (possibly with more variables in the imputation model than in the analysis model) followed by an analysis by time point.
You can actually distinguish MAR from MCAR based on your data, but you cannot tell from the data whether, instead of one of these two situations, you have a "missing not at random" (MNAR) situation, in which neither of the two analysis options mentioned above would be valid. With people joining late, you will have to think about whether it seems plausible that this has very little to do with the missing outcomes (or perhaps these people have different observed characteristics, but you think it's plausible that that's the main difference).
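To make the MCAR/MAR distinction concrete, here is a small simulation sketch in standard-library Python. Everything in it is an assumption chosen for illustration: the data are simulated, the names `simulate`, `baseline_gap` and `complete_case_mean` are hypothetical, and the missingness probabilities (0.5 under MCAR; 0.8 or 0.2 depending on the sign of the baseline under MAR) are arbitrary. It shows one informal way the observed data can separate MAR from MCAR (comparing a fully observed baseline between subjects with and without a missing outcome), and how a complete-case mean can drift away from the true mean of 0 when missingness depends on observed values.

```python
import random
import statistics


def simulate(mechanism, n=5000, seed=1):
    """Return (baseline, outcome-or-None) rows under "MCAR" or "MAR"."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)        # baseline, always observed
        y = x + rng.gauss(0.0, 1.0)    # outcome, true mean is 0
        if mechanism == "MCAR":
            p_missing = 0.5            # unrelated to x or y
        else:
            # MAR: missingness depends only on the observed baseline.
            p_missing = 0.8 if x > 0 else 0.2
        rows.append((x, None if rng.random() < p_missing else y))
    return rows


def baseline_gap(rows):
    """Mean baseline of subjects with a missing outcome minus the rest."""
    missing = [x for x, y in rows if y is None]
    observed = [x for x, y in rows if y is not None]
    return statistics.fmean(missing) - statistics.fmean(observed)


def complete_case_mean(rows):
    """Mean outcome using only subjects whose outcome was observed."""
    return statistics.fmean([y for _, y in rows if y is not None])


if __name__ == "__main__":
    for mechanism in ("MCAR", "MAR"):
        rows = simulate(mechanism)
        print(mechanism,
              "baseline gap:", round(baseline_gap(rows), 3),
              "complete-case mean:", round(complete_case_mean(rows), 3))
```

A clear baseline gap is evidence against MCAR. The absence of one does not establish MCAR, and no comparison of this kind can rule out MNAR, since that would require the unobserved outcomes themselves.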