Yes. You can easily verify this by carrying out the following steps:
First, express the means of groups $A$, $B$, and $C$ in terms of the model with the specified contrast:
\begin{eqnarray*}
1\hat{\beta}_{0}-2\hat{\beta}_{1}+0\hat{\beta}_{2} & = & \hat{\mu}_{A}=E(Y_{A})\\
1\hat{\beta}_{0}+1\hat{\beta}_{1}-1\hat{\beta}_{2} & = & \hat{\mu}_{B}=E(Y_{B})\\
1\hat{\beta}_{0}+1\hat{\beta}_{1}+1\hat{\beta}_{2} & = & \hat{\mu}_{C}=E(Y_{C})
\end{eqnarray*}
Here, each $\hat{\mu}_i$ denotes the estimated mean of group $i$, $i=A, B, C$.
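As a quick sanity check of the forward mapping, here is a minimal Python sketch that plugs arbitrary (made-up, purely illustrative) coefficient values into the three equations above:

```python
# Forward mapping: given contrast-coded coefficients, recover the three
# group means. The coefficient values are arbitrary illustrations, not
# estimates from any data set.
b0, b1, b2 = 10.0, 1.5, 0.5

mu_A = 1*b0 - 2*b1 + 0*b2   # row for group A
mu_B = 1*b0 + 1*b1 - 1*b2   # row for group B
mu_C = 1*b0 + 1*b1 + 1*b2   # row for group C

print(mu_A, mu_B, mu_C)           # 7.0 11.0 12.0
print((mu_A + mu_B + mu_C) / 3)   # 10.0 -- equals b0
```

Note that the simple average of the three means already comes back as the intercept, which is what the derivation below establishes in general.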
Next, collect the coefficients into a matrix augmented with the
means on the right, and put the matrix in reduced row echelon form
using Gauss-Jordan elimination:
\begin{eqnarray*}
\begin{bmatrix}1 & -2 & 0 & | & \hat{\mu}_{A}\\
1 & 1 & -1 & | & \hat{\mu}_{B}\\
1 & 1 & 1 & | & \hat{\mu}_{C}
\end{bmatrix} & \sim & \begin{bmatrix}1 & -2 & 0 & | & \hat{\mu}_{A}\\
0 & 3 & -1 & | & \hat{\mu}_{B}-\hat{\mu}_{A}\\
0 & 3 & 1 & | & \hat{\mu}_{C}-\hat{\mu}_{A}
\end{bmatrix}\\
& \sim & \begin{bmatrix}1 & -2 & 0 & | & \hat{\mu}_{A}\\
0 & 3 & -1 & | & \hat{\mu}_{B}-\hat{\mu}_{A}\\
0 & 0 & 2 & | & \left(\hat{\mu}_{C}-\hat{\mu}_{A}\right)-\left(\hat{\mu}_{B}-\hat{\mu}_{A}\right)
\end{bmatrix}\\
& \sim & \begin{bmatrix}1 & -2 & 0 & | & \hat{\mu}_{A}\\
0 & 3 & -1 & | & \hat{\mu}_{B}-\hat{\mu}_{A}\\
0 & 0 & 1 & | & \frac{1}{2}\left[\left(\hat{\mu}_{C}-\hat{\mu}_{A}\right)-\left(\hat{\mu}_{B}-\hat{\mu}_{A}\right)\right]
\end{bmatrix}\\
& \sim & \begin{bmatrix}1 & -2 & 0 & | & \hat{\mu}_{A}\\
0 & 1 & 0 & | & \frac{1}{3}\left\{ \left(\hat{\mu}_{B}-\hat{\mu}_{A}\right)+\frac{1}{2}\left[\left(\hat{\mu}_{C}-\hat{\mu}_{A}\right)-\left(\hat{\mu}_{B}-\hat{\mu}_{A}\right)\right]\right\} \\
0 & 0 & 1 & | & \frac{1}{2}\left[\left(\hat{\mu}_{C}-\hat{\mu}_{A}\right)-\left(\hat{\mu}_{B}-\hat{\mu}_{A}\right)\right]
\end{bmatrix}\\
& \sim & \begin{bmatrix}1 & 0 & 0 & | & \hat{\mu}_{A}+\frac{2}{3}\left\{ \left(\hat{\mu}_{B}-\hat{\mu}_{A}\right)+\frac{1}{2}\left[\left(\hat{\mu}_{C}-\hat{\mu}_{A}\right)-\left(\hat{\mu}_{B}-\hat{\mu}_{A}\right)\right]\right\} \\
0 & 1 & 0 & | & \frac{1}{3}\left\{ \left(\hat{\mu}_{B}-\hat{\mu}_{A}\right)+\frac{1}{2}\left[\left(\hat{\mu}_{C}-\hat{\mu}_{A}\right)-\left(\hat{\mu}_{B}-\hat{\mu}_{A}\right)\right]\right\} \\
0 & 0 & 1 & | & \frac{1}{2}\left[\left(\hat{\mu}_{C}-\hat{\mu}_{A}\right)-\left(\hat{\mu}_{B}-\hat{\mu}_{A}\right)\right]
\end{bmatrix}
\end{eqnarray*}
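The elimination can also be checked numerically. The sketch below (with made-up group means) solves the same $3\times 3$ system with `numpy.linalg.solve` and compares the result with the back-substituted expressions from the reduced matrix above:

```python
import numpy as np

# Contrast matrix C maps (b0, b1, b2) to the group means (mu_A, mu_B, mu_C).
C = np.array([[1.0, -2.0,  0.0],
              [1.0,  1.0, -1.0],
              [1.0,  1.0,  1.0]])

mu_A, mu_B, mu_C = 4.0, 7.0, 13.0   # hypothetical group means

beta = np.linalg.solve(C, np.array([mu_A, mu_B, mu_C]))

# Expressions read off from the reduced matrix:
b2 = ((mu_C - mu_A) - (mu_B - mu_A)) / 2
b1 = ((mu_B - mu_A) + b2) / 3
b0 = mu_A + 2 * ((mu_B - mu_A) + b2) / 3

print(beta)  # [8. 2. 3.] -- matches (b0, b1, b2)
```

The intercept comes out as $8 = (4+7+13)/3$, the simple mean of the three group means.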
From the first pivot row, we now know that:
\begin{eqnarray*}
\hat{\beta}_{0} & = & \hat{\mu}_{A}+\frac{2}{3}\left\{ \left(\hat{\mu}_{B}-\hat{\mu}_{A}\right)+\frac{1}{2}\left[\left(\hat{\mu}_{C}-\hat{\mu}_{A}\right)-\left(\hat{\mu}_{B}-\hat{\mu}_{A}\right)\right]\right\} \\
 & = & \hat{\mu}_{A}+\frac{2}{3}\left(\hat{\mu}_{B}-\hat{\mu}_{A}\right)+\frac{1}{3}\left(\hat{\mu}_{C}-\hat{\mu}_{B}\right)\\
 & = & \frac{1}{3}\hat{\mu}_{A}+\frac{1}{3}\hat{\mu}_{B}+\frac{1}{3}\hat{\mu}_{C}\\
& = & \frac{\hat{\mu}_{A}+\hat{\mu}_{B}+\hat{\mu}_{C}}{3}
\end{eqnarray*}
The final expression shows that $\hat{\beta}_{0}$, the intercept,
is the simple mean of the group means.
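The same result holds when fitting by least squares on raw data, even with unequal group sizes, because the model is saturated and reproduces each group's sample mean exactly. A minimal sketch with made-up observations:

```python
import numpy as np

# Fit by least squares with contrast codes A=(-2, 0), B=(1, -1), C=(1, 1).
# Data values are invented for illustration; group sizes are deliberately
# unequal to show the intercept is the UNWEIGHTED mean of group means.
y_A = [3.0, 5.0, 4.0]            # sample mean 4.0
y_B = [6.0, 8.0]                 # sample mean 7.0
y_C = [12.0, 14.0, 13.0, 13.0]   # sample mean 13.0

rows = ([[1, -2,  0]] * len(y_A) +
        [[1,  1, -1]] * len(y_B) +
        [[1,  1,  1]] * len(y_C))
X = np.array(rows, dtype=float)
y = np.array(y_A + y_B + y_C)

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
grand = (np.mean(y_A) + np.mean(y_B) + np.mean(y_C)) / 3
print(beta[0], grand)  # both 8.0
```

With these numbers the intercept is $8.0 = (4+7+13)/3$, not the overall mean of all nine observations.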
No, there is nothing wrong with doing this. It is sometimes called the 'flat' approach to factorial ANOVA (though I don't know how common that phrasing is). It is sometimes used when there are problems with the data, such as combinations of factor levels with no observations. As @Schortchi notes, you should get the same overall $F$-value / test from both models.
Best Answer
Not exactly true. If the categorical predictor and its dummy variables are the only terms in the model, the intercept is the mean of the reference group (the level whose dummy is omitted), and the other coefficients are the gaps between their respective groups and that reference mean.
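A small numeric sketch of that claim, with made-up data and $A$ as the reference group:

```python
import numpy as np

# Dummy (treatment) coding with A as the reference level. The intercept
# recovers the reference group's mean; each slope is that group's gap
# from the reference. Data values are invented for illustration.
y_A = [3.0, 5.0, 4.0]     # mean 4.0  (reference)
y_B = [6.0, 8.0, 7.0]     # mean 7.0
y_C = [12.0, 14.0, 13.0]  # mean 13.0

X = np.array([[1, 0, 0]] * 3 +   # group A: both dummies zero
             [[1, 1, 0]] * 3 +   # group B dummy
             [[1, 0, 1]] * 3,    # group C dummy
             dtype=float)
y = np.array(y_A + y_B + y_C)

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta)  # [4. 3. 9.] -> mean(A), mean(B)-mean(A), mean(C)-mean(A)
```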
In GLMs the same holds, but for models with a non-identity link function (logistic, Poisson, ...) the coefficients are not on the same scale as the original response variable. You would need to compute the prediction for each group and transform it through the inverse link (the mean function) to see that this is a model for group means; you cannot transform the coefficients directly.
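To illustrate with a logistic model: in the saturated two-group case the MLE reproduces each group's sample proportion exactly, so the closed-form coefficients are just logits. The proportions below are made up; this is a sketch of the link-scale point, not a general fitting routine.

```python
from math import log, exp

# Saturated logistic model with one dummy (groups A and B). Coefficients
# live on the log-odds scale and must be pushed through the inverse link
# to recover group means. Proportions are hypothetical.
p_A, p_B = 0.25, 0.60   # sample proportions of successes in each group

def logit(p):
    return log(p / (1 - p))

def inv_logit(x):
    return 1 / (1 + exp(-x))

b0 = logit(p_A)         # intercept: log-odds in the reference group
b1 = logit(p_B) - b0    # slope: a log-odds ratio, NOT a mean difference

# Only after applying the inverse link do we get the group means back:
print(inv_logit(b0), inv_logit(b0 + b1))  # 0.25 0.6
```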
Once you add further variables to the model, the coefficients come to represent adjusted differences between groups, so you now have adjusted group means: the group differences are adjusted for the other variables in the model. If the categorical variable is unrelated to those other variables, the coefficients should be relatively stable in GLMs with an identity link (as in linear regression) or a log link (as in Poisson regression); the citation below documents this. If the categorical variable is related to the other variables, and those variables are related to the outcome, the coefficients will change noticeably.
By "related to other variables" I mean: is there a relationship between the group an individual falls into and the other variables in the model? For example, if the categorical predictor is age group and another variable in the model is wealth, then age group is related to wealth, and the coefficients for the age groups should change when wealth is included in the model. This assumes wealth is related to the outcome variable.
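The age-group/wealth scenario can be sketched numerically. The data below are entirely made up so that the group indicator is correlated with wealth and wealth drives the outcome; the group coefficient then shrinks sharply once wealth is adjusted for:

```python
import numpy as np

# Hypothetical illustration: an "old" group whose members are also
# wealthier, with an outcome driven mostly by wealth. The group
# coefficient changes once wealth enters the model, because the
# unadjusted gap partly reflects wealth.
group  = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
wealth = np.array([1, 2, 1, 2, 4, 5, 4, 5], dtype=float)  # related to group
y = 2.0 + 1.0 * group + 3.0 * wealth                      # wealth drives y

X1 = np.column_stack([np.ones(8), group])            # group only
X2 = np.column_stack([np.ones(8), group, wealth])    # group + wealth
b_unadj, *_ = np.linalg.lstsq(X1, y, rcond=None)
b_adj,   *_ = np.linalg.lstsq(X2, y, rcond=None)

print(b_unadj[1], b_adj[1])  # unadjusted gap 10.0 vs adjusted gap 1.0
```

The unadjusted group gap (10.0) is mostly a wealth effect in disguise; the adjusted coefficient (1.0) recovers the group effect used to generate the data.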
Gail, M. H., Wieand, S., &amp; Piantadosi, S. (1984). Biased estimates of treatment effect in randomized experiments with nonlinear regressions and omitted covariates. Biometrika, 71(3), 431–444. https://doi.org/10.1093/biomet/71.3.431