Correlation between two independent variables is not necessarily a sign of troublesome collinearity. The guru of collinearity, David Belsley, has shown this in his books: Conditioning Diagnostics: Collinearity and Weak Data in Regression and Regression Diagnostics: Identifying Influential Data and Sources of Collinearity.
In the comments, @whuber points out that collinearity is not always a problem that has to be dealt with, and that your maximum condition index indicates that, here, it is not a problem at all.
At the other extreme, it is also possible to have very high collinearity without any high correlations. One example is having 10 IVs, 9 of which are mutually independent while the 10th is the sum of the other 9: the design is then perfectly collinear even though none of the pairwise correlations is large (a quick sketch of this follows below).
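To make that concrete, here is a minimal Python sketch (simulated data, not part of the original answer): nine mutually independent predictors plus their sum give an exactly singular design, yet no pairwise correlation exceeds roughly 0.33.

```python
# Minimal sketch (simulated data): 9 independent IVs plus their sum.
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 9))        # 9 mutually independent IVs
X = np.column_stack([X, X.sum(axis=1)])   # 10th IV is the sum of the other 9

corr = np.corrcoef(X, rowvar=False)
print("largest pairwise |correlation|:",
      np.abs(corr - np.eye(10)).max().round(2))     # roughly 0.33

# Simple condition index: ratio of largest to smallest singular value
# of the design after scaling each column to unit length.
Xs = X / np.linalg.norm(X, axis=0)
sv = np.linalg.svd(Xs, compute_uv=False)
print("condition index: %.1e" % (sv[0] / sv[-1]))   # enormous -- X is singular
```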
In addition to condition indexes, and developed after Belsley's books were written, there is the perturb package in R, which examines collinearity by adding small amounts of random noise to the input data and seeing what happens. One of the problems collinearity can cause is that small changes to the input data lead to huge changes in the regression results; in one of his books, Belsley gives an example where changing the data in the third or fourth significant digit reverses the signs of regression coefficients.
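The perturb package itself is R code; purely as an illustration of the underlying idea (and not of that package's actual interface), here is a small Python sketch: a nearly collinear design is refit after noise in roughly the third significant digit is added to the inputs, and the coefficients swing wildly from fit to fit.

```python
# Sketch of the perturbation idea only (not the perturb package's R interface).
import numpy as np

rng = np.random.default_rng(1)
n = 200
x1 = rng.standard_normal(n)
x2 = x1 + 0.001 * rng.standard_normal(n)        # nearly collinear with x1
y = 1.0 + 2.0 * x1 + rng.standard_normal(n)

def fit(a, b):
    """OLS coefficients for y on an intercept, a and b."""
    X = np.column_stack([np.ones(n), a, b])
    return np.linalg.lstsq(X, y, rcond=None)[0].round(1)

print("original fit:  ", fit(x1, x2))
for _ in range(3):
    # noise in roughly the 3rd-4th significant digit of the inputs
    print("perturbed fit: ", fit(x1 + 0.001 * rng.standard_normal(n),
                                 x2 + 0.001 * rng.standard_normal(n)))
```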
This is an interesting question, and actually quite subtle. It gets at the core of the definition of interactions, as well as the assumptions underlying dummy coding of categorical variables.
OP's question is, in essence: when we test for interactions between the levels as given above, why do we (seemingly!) not bother to take into account interactions between one variable and the other variable's reference level?
The short answer is that we actually are taking these into account, as I'll explain here. First, to simplify subsequent notation, let $Y_{AE}$ denote the random variable $Y|\{X_1=A\} \cap \{X_2=E\}$, that is, $Y$ conditional on observing levels $A$ and $E$ in variables $X_1$ and $X_2$, respectively.
Now, recall that if we claim that there is no interaction between $X_1$ and $X_2$, this is equivalent to asserting that
$$
E[Y_{BE}]-E[Y_{AE}] = E[Y_{BF}]-E[Y_{AF}] = E[Y_{BG}]-E[Y_{AG}] \qquad (Eq. 1)
$$
that is, the expected difference between $Y|X_1=B$ and $Y|X_1=A$ does not depend on the level at which $X_2$ is held constant. Similarly, no interaction means that
$$ E[Y_{AE}]-E[Y_{AF}] = \cdots = E[Y_{DE}]-E[Y_{DF}]. $$
Now, OP correctly states that given our choice of reference levels $A$ and $E$, the definition of $\beta_1$ is the expected difference between $Y$ when $X_1=B$ versus $X_1=A$, given that $X_2$ is held constant at $E$. In other words,
$$ \beta_1 = E[Y_{BE}] - E[Y_{AE}], $$
but if there is no interaction between $X_1$ and $X_2$, it follows immediately from equation 1 above that
$$
\beta_1 = E[Y_{BE}]-E[Y_{AE}] = E[Y_{BF}]-E[Y_{AF}] = E[Y_{BG}]-E[Y_{AG}].
$$
In other words, if there is no interaction, then $\beta_1$ alone captures the difference in $Y$ between $X_1=B$ and $X_1=A$ at every level of $X_2$. This is why the test for an interaction leaves $\beta_1$ in the model and tests whether the interaction terms are zero.
The full answer to your question involves a simultaneous derivation for $\beta_1$ through $\beta_5$, but from this example you hopefully get the idea of what's going on.
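For readers who want to see this in practice, here is a hedged Python/statsmodels sketch on made-up data (the variable names mirror the notation above but are otherwise hypothetical): with the default treatment coding the reference levels are $A$ and $E$, the coefficient on C(X1)[T.B] plays the role of $\beta_1$, and the interaction is tested by comparing the full model against the main-effects-only model.

```python
# Hedged sketch on hypothetical data: dummy-coded main effects plus interaction.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(2)
n = 400
df = pd.DataFrame({
    "X1": rng.choice(list("ABCD"), n),
    "X2": rng.choice(list("EFG"), n),
})
# True model has main effects only, so the interaction should test as null here.
effect1 = df["X1"].map({"A": 0.0, "B": 1.0, "C": 2.0, "D": 3.0})
effect2 = df["X2"].map({"E": 0.0, "F": 0.5, "G": 1.0})
df["y"] = effect1 + effect2 + rng.standard_normal(n)

# Default treatment coding uses reference levels A and E, so the coefficient on
# C(X1)[T.B] in the full model is exactly beta_1 = E[Y_BE] - E[Y_AE] from above.
full = smf.ols("y ~ C(X1) * C(X2)", data=df).fit()
reduced = smf.ols("y ~ C(X1) + C(X2)", data=df).fit()
print(full.params.filter(like="T.B"))   # B main effect and B:... interaction terms
print(anova_lm(reduced, full))          # F-test: are the interaction terms zero?
```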
Yes, it affects the power in three ways.
First, adding $X_2X_3$ to the model changes the true value of $\beta_1$ unless $X_2X_3$ is uncorrelated with $X_1$. In some designed experiments it would be natural for these to be uncorrelated, but in other sorts of data there typically isn't a reason to expect them to be uncorrelated. The coefficient could change by a large or small amount in either direction, so the power could go up or down (the simulation sketch at the end of this answer illustrates the comparison). The power might even become not very well defined if the new value of $\beta_1$ were 0.
Second, again if $X_2X_3$ is correlated with $X_1$, putting it in the model will affect the variance of $\hat\beta_1$, because the variance of $\hat\beta_1$ is inversely proportional to the variance of $X_1$ conditional on everything else in the model. This effect will tend to reduce the power: the variance of $X_1$ conditional on $X_2X_3$ is smaller than the variance not conditional on it, so the standard error of $\hat\beta_1$ goes up. [*]
Third, adding $X_2X_3$ to the model will tend to reduce the residual variance (if its coefficient is not zero), and so reduce the standard error of $\hat\beta_1$ and increase the power.
[*] I'm being loose with language: 'conditional' here is about linear projections rather than true conditional expectations.
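Purely as an illustration (hypothetical coefficients and sample sizes, not from the answer), the following Python sketch estimates the rejection rate for the test of $\beta_1$ with and without the $X_2X_3$ term in the fitted model, in a setup where $X_2X_3$ is correlated with $X_1$; which version ends up with more power depends on how the three effects above balance out.

```python
# Simulation sketch with hypothetical numbers: rejection rate for the test of
# beta_1 when the fitted model does or does not include the X2*X3 term.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)

def rejection_rate(include_interaction, n=100, reps=2000, alpha=0.05):
    hits = 0
    for _ in range(reps):
        x2, x3 = rng.standard_normal(n), rng.standard_normal(n)
        x23 = x2 * x3
        x1 = 0.5 * x23 + rng.standard_normal(n)    # X1 correlated with X2*X3
        y = 0.3 * x1 + 0.5 * x23 + rng.standard_normal(n)
        cols = [x1, x2, x3, x23] if include_interaction else [x1, x2, x3]
        X = sm.add_constant(np.column_stack(cols))
        fit = sm.OLS(y, X).fit()
        hits += fit.pvalues[1] < alpha              # p-value for the X1 coefficient
    return hits / reps

print("rejection rate without X2*X3 term:", rejection_rate(False))
print("rejection rate with    X2*X3 term:", rejection_rate(True))
```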