The sum
$$\sum_{g\in \mathbf{G}(n,d)} \mathbf{X}_g
$$
is what you get when you expand a multinomial ($d$-nomial) raised to the $n^\text{th}$ power without assuming commutativity.
The sum that appears in your application doesn't arise from the expansion of a power, but is otherwise similar. I think the smallest non-trivial case ($d=2$, $n=2$) illustrates what needs to be understood. On the left we have a sum
$$
\sum_{g\in {\bf \large G}(n,d)} {\bf A}_g^* {\bf A}_{g}=A_1^*A_1^*A_1A_1
+A_1^*A_2^*A_1A_2
+A_2^*A_1^*A_2A_1
+A_2^*A_2^*A_2A_2,
$$
in which the order of the operators in each product (in both the left half of the product, containing the operators $A_i*$, and in the right half, containing the operators $A_i$) is not necessarily lexicographic. That is, $A_2$ sometimes occurs to the left of $A_1$ (and $A_2^*$ to the left of $A_1^*$). On the right we have a sum
$$
\sum_{|\alpha|=n}\frac{n!}{\alpha!}{{\bf A}^*}^{\alpha}{\bf A}^{\alpha}=\frac{2!}{2!\,0!}A_1^*A_1^*A_1A_1+\frac{2!}{1!\,1!}A_1^*A_2^*A_1A_2+\frac{2!}{0!\,2!}A_2^*A_2^*A_2A_2,
$$
in which the order of the operators in each product (again treating the half containing the $A_i^*$ separately from the half containing the $A_i$) is lexicographic. That is, $A_2$ always occurs to the right of $A_1$ (and $A_2^*$ to the right of $A_1^*$). We get from one side to the other by "combining like terms". It is commutativity of the operators that allows us to do this. The numerical coefficients in the sum on the right tell us how many "like terms" on the left contributed to each term on the right.
What needs to be shown is that these numerical coefficients have been chosen correctly, that is, that the number of functions from $\{1,2,\ldots,n\}$ to $\{1,2,\ldots,d\}$ in which $k\in\{1,2,\ldots,d\}$ occurs as the image of some $j\in\{1,2,\ldots,n\}$ exactly $\alpha_k$ times is given by $\frac{n!}{\alpha_1!\,\alpha_2!\,\ldots\,\alpha_d!}$.
Added: Since you ask about the case $d=3$, let's use that as an example. If
$$
(X_1+X_2+X_3)^n=(X_1+X_2+X_3)(X_1+X_2+X_3)\ldots(X_1+X_2+X_3)\quad\text{($n$ factors)}
$$
is expanded, without assuming commutativity of the $X_i$ we get $3^n$ terms. These terms are in one-to-one correspondence with the functions from $\{1,2,\ldots,n\}$ to $\{1,2,3\}$. So, for example, if $n=4$, one of the terms is $X_2X_3X_2X_3$. It corresponds to the function $1\mapsto2$, $2\mapsto3$, $3\mapsto2$, $4\mapsto3$.
If we use commutativity, $X_2X_3X_2X_3$ equals $X_2^2X_3^2$. But $X_2X_3X_2X_3$ is not the only term that equals $X_2^2X_3^2$: any term with two factors $X_2$ and two factors $X_3$ also equals $X_2^2X_3^2$. In the notation of the proof you adapted, $\alpha_2=\alpha_3=2$ and $\alpha_1=\alpha_4=0$ for the term $X_2X_3X_2X_3$. Furthermore, because of commutativity, $X_2X_3X_2X_3=\mathbf{X}^\alpha=X_1^0X_2^2X_3^2X_4^0=X_2^2X_3^2$. So $X_2X_3X_2X_3$ is one of the terms on the left that contributes to the term on the right containing $X_2^2X_3^2$. Other terms that contribute are other terms with $\alpha_2=\alpha_3=2$ and $\alpha_1=\alpha_4=0$ such as $X_2X_2X_3X_3$, $X_3X_2X_2X_3$, and so on. There are six such terms in all since there are $6=\frac{4!}{2!\,2!}$ ways to arrange the factors of $X_2X_3X_2X_3$.
The main difference between this standard multinomial expansion and your application is that your left-hand side cannot be written as an $n$-fold product. Instead, your left-hand side is defined directly in terms of the functions from $\{1,2,\ldots,n\}$ to $\{1,2,3\}$. Most everything else works as before. So one of the terms on the left will be $A_2^*A_3^*A_2^*A_3^*A_2A_3A_2A_3$. It can be rearranged using commutativity to get $(A_2^*)^2(A_3^*)^2A_2^2A_3^2$. There are again six terms in total on the left that contribute to $(A_2^*)^2(A_3^*)^2A_2^2A_3^2$.
Second addition: This is a response to the request in the comments for a completely written-out proof and for an explanation of where $\alpha_k$ is used in the proof.
Your proof is complete. The issue may be that the notation used is rather concise, and may require some "mathematical maturity" to decode. In your proof, $\alpha_k$ is used when you write "And because $X_iX_j=X_jX_i$ for all $i,j\in\{1,\cdots,n\}$ then $\mathbf{X}_g=\mathbf{X}^\alpha$ for some $\alpha\in \mathbb{N}^d$ such that $|\alpha|=n$." To translate this to my example, the term $X_2X_3X_2X_3$ corresponds to the function $g$ defined by $g(1)=2$, $g(2)=3$, $g(3)=2$, $g(4)=3$. The statement $\alpha_2=\alpha_3=2$, $\alpha_1=\alpha_4=0$ in my answer follows from your definition of $\alpha_k$, since there are two elements of $\{1,2,3,4\}$ such that $g(j)=2$ (namely $j=1$ and $j=3$), two elements such that $g(j)=3$ (namely $j=2$ and $j=4$), and no elements such that $g(j)=1$ or $g(j)=4$. So your statement that I quoted is correct: $$\mathbf{X}_g=X_{g(1)}X_{g(2)}X_{g(3)}X_{g(4)}=X_2X_3X_2X_3=X_2^2X_3^2=X_1^0X_2^2X_3^2X_4^0=X^{\alpha(1)}X^{\alpha(2)}X^{\alpha(3)}X^{\alpha(4)}=\mathbf{X}^\alpha.$$
I might suggest some changes in wording. In the line I quote from your proof, I would replace the phrase "for some $\alpha\in \mathbb{N}^d$ such that $|\alpha|=n$" with "for $\alpha$ as defined above, which satisfies $\alpha\in \mathbb{N}^d$ and $|\alpha|=n$", and in the next line of your proof, I would replace "But for each $\alpha$" with "But for each $\alpha$ satisfying these conditions".
Yes, they are related. The coefficients in the case bracket account for the fact that you require $p\leqslant q \leqslant r$, whereas the usual formula for the multinomial theorem makes no such assumption about $\alpha_1$, $\alpha_2$ and $\alpha_3$.
If $p$, $q$ and $r$ are pairwise distinct, the coefficient $\binom{n}{p,q,r}$ would appear $3! = 6$ times in the multinomial formula, corresponding to the ways of assigning the values of $p$, $q$ and $r$ to $\alpha_1$, $\alpha_2$ and $\alpha_3$.
Similarly, if $p=q\neq r$, the coefficient $\binom{n}{p,q,r}$ would appear $3$ times in the multinomial formula, corresponding to the ways of assigning the values of $p$, $q$ and $r$ to $\alpha_1$, $\alpha_2$ and $\alpha_3$.
Finally, if they are all the same, the coefficient appears only once, as there is a signle way to assign the values.
So, aside from the restriction that $r$ be odd, $3!$ times your expression would be another representation of the expression you get from the multinomial theorem for $(1+1+1)^n$.
Put another way, your expression equals
$$
\frac1{3!}\sum_{\genfrac{}{}{0pt}{1}{p+q+r =n}{\max\{p,q,r\} \text{ is odd}}}
\binom{n}{p,q,r}
$$
Best Answer
The reason why there is a factor $\frac{n!}{\alpha!}$ is that with the multi-index notation, the order with which the operator components are composed is undistinguished. For instance, if $d=2$, then since $T_iT_j=T_jT_i$ for all $i,j$, we have $$T_1T_2=T^{(1,1)}=T_2T_1 $$ Therefore, you have $$\|T^n\|^2=\sum_{|\alpha|=n}c_{\alpha}\|T^{\alpha}\|^2 $$ where $c_{\alpha}$ is the number of possible permutations of a composition of $n$ components of $T=(T_1,\dots,T_d)$ which correspond to the multi-index $\alpha$.
\begin{align*}c_{\alpha}&=\#\left\{(j_1,\dots,j_n):T_{j_1}\dots T_{j_n}=T^{\alpha},\,j_k\in \left\{0,\dots,d\right\}\right\} = \\ &=\#\left\{(j_1,\dots,j_n):(\#\left\{j_k=1\right\}=\alpha_1 \land \dots \land \#\left\{j_k=d\right\}=\alpha_d)\right\}\\ \end{align*} To compute the cardinality of the above set in an intuitive way, we may write out the multi-index $\alpha$ in the following way $$\underbrace{1,\dots, 1}_{\alpha_1\text{ times }},\underbrace{2,\dots, 2}_{\alpha_2\text{ times }},\dots, \underbrace{d,\dots, d}_{\alpha_d\text{ times }} $$ Notice that the total amount of numbers written in the above line is $\alpha_1+\dots+\alpha_d=n$. Then $c_{\alpha}$ is simply the number of possible permutations of this list, where copies of the same number are undistinguished. And this is just $$ c_{\alpha}=\frac{n!}{\alpha_1!\cdot \dots \cdot \alpha_d!}=\frac{n!}{\alpha!}$$