If we divide through by $|x_1 - x_2|$, we get
$$\left|\frac{f(x_1) - f(x_2)}{x_1 - x_2}\right| \leq |x_1-x_2|,$$
that is, the slope of the secant line between any two points is at most the distance between them. Fixing $x_1 = x$, taking $x_2 = x+h$ and letting $h$ approach zero shows that $f$ is differentiable at $x$ and $f'(x) = 0$. That is, $f' \equiv 0$, so by the Mean Value Theorem $f$ is constant.
The proof goes through with the right-hand side of your inequality replaced by anything that is $o(|x_1-x_2|)$; in particular, it applies whenever there are $\alpha > 1$ and $C > 0$ such that $|f(x_1) - f(x_2)| \leq C |x_1 - x_2|^{\alpha}$ for all $x_1,x_2 \in \mathbb{R}$. If instead we take $\alpha = 1$ we get a Lipschitz continuous function, and if we take $\alpha \in (0,1)$ we get a Hölder continuous function. Such functions need not be constant, but they are still very nice.
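To see the rate at which the secant slopes collapse, here is a small numeric sketch (my own illustration, not part of the argument, with an assumed bound $C=1$, $\alpha=2$):

```python
# If |f(x1) - f(x2)| <= C * |x1 - x2|**alpha with alpha > 1, then the secant
# slope |f(x1) - f(x2)| / |x1 - x2| is at most C * |x1 - x2|**(alpha - 1),
# which tends to 0 as the two points approach each other.
C, alpha = 1.0, 2.0
slope_bounds = [C * h ** (alpha - 1) for h in (1e-1, 1e-3, 1e-5)]
# the bounds shrink linearly in h, forcing f'(x) = 0 in the limit
```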
And now, an anecdote: last summer my department held a "mock AMS conference" in which all summer-supported graduate students presented short talks, the more senior of them tending to talk about their thesis work in progress. One student gave an exceptionally clear and audience-friendly talk about her work on convex subsets satisfying certain smoothness conditions on the boundary. She mentioned the prospect of proving a result for Hölder continuous boundary for a certain class of exponents $\alpha \leq 1$. Casting about for a question, I decided to ask about the case of $\alpha > 1$...at which point her thesis adviser, who was sitting next to me in the audience, very politely explained the facts of life about Hölder continuous functions with exponent $\alpha > 1$. Oops!
The sum
$$\sum_{g\in \mathbf{G}(n,d)} \mathbf{X}_g
$$
is what you get when you expand a multinomial ($d$-nomial) raised to the $n^\text{th}$ power without assuming commutativity.
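As a quick sanity check of this claim, here is a sketch in Python using SymPy's non-commuting symbols (my own illustration, for $d=3$, $n=2$; not part of the original argument):

```python
import sympy as sp
from itertools import product

d, n = 3, 2
X = sp.symbols('X1:4', commutative=False)  # non-commuting X1, X2, X3

# left: sum over all functions g: {1,...,n} -> {1,...,d} of X_{g(1)}...X_{g(n)}
lhs = sp.Add(*(sp.Mul(*(X[i] for i in g)) for g in product(range(d), repeat=n)))

# right: (X1 + ... + Xd)^n expanded without assuming commutativity
rhs = sp.expand(sp.Add(*X) ** n)

assert sp.expand(lhs - rhs) == 0  # the two sums agree term by term
```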
The sum that appears in your application doesn't arise from the expansion of a power, but is otherwise similar. I think the smallest non-trivial case ($d=2$, $n=2$) illustrates what needs to be understood. On the left we have a sum
$$
\sum_{g\in {\bf \large G}(n,d)} {\bf A}_g^* {\bf A}_{g}=A_1^*A_1^*A_1A_1
+A_1^*A_2^*A_1A_2
+A_2^*A_1^*A_2A_1
+A_2^*A_2^*A_2A_2,
$$
in which the order of the operators in each product (in both the left half of the product, containing the operators $A_i^*$, and in the right half, containing the operators $A_i$) is not necessarily lexicographic. That is, $A_2$ sometimes occurs to the left of $A_1$ (and $A_2^*$ to the left of $A_1^*$). On the right we have a sum
$$
\sum_{|\alpha|=n}\frac{n!}{\alpha!}{{\bf A}^*}^{\alpha}{\bf A}^{\alpha}=\frac{2!}{2!\,0!}A_1^*A_1^*A_1A_1+\frac{2!}{1!\,1!}A_1^*A_2^*A_1A_2+\frac{2!}{0!\,2!}A_2^*A_2^*A_2A_2,
$$
in which the order of the operators in each product (again treating the half containing the $A_i^*$ separately from the half containing the $A_i$) is lexicographic. That is, $A_2$ always occurs to the right of $A_1$ (and $A_2^*$ to the right of $A_1^*$). We get from one side to the other by "combining like terms". It is commutativity of the operators that allows us to do this. The numerical coefficients in the sum on the right tell us how many "like terms" on the left contributed to each term on the right.
What needs to be shown is that these numerical coefficients have been chosen correctly, that is, that the number of functions from $\{1,2,\ldots,n\}$ to $\{1,2,\ldots,d\}$ in which $k\in\{1,2,\ldots,d\}$ occurs as the image of some $j\in\{1,2,\ldots,n\}$ exactly $\alpha_k$ times is given by $\frac{n!}{\alpha_1!\,\alpha_2!\,\ldots\,\alpha_d!}$.
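This count can be verified by brute force for small $n$ and $d$; here is a minimal Python check (my own sketch, with $n=4$, $d=3$):

```python
from itertools import product
from math import factorial
from collections import Counter

n, d = 4, 3

# For each function g: {1,...,n} -> {1,...,d}, record alpha_k = #{j : g(j) = k}
counts = Counter()
for g in product(range(1, d + 1), repeat=n):
    alpha = tuple(g.count(k) for k in range(1, d + 1))
    counts[alpha] += 1

# Each count should equal the multinomial coefficient n!/(alpha_1!...alpha_d!)
for alpha, c in counts.items():
    coef = factorial(n)
    for a in alpha:
        coef //= factorial(a)
    assert c == coef
```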
Added: Since you ask about the case $d=3$, let's use that as an example. If
$$
(X_1+X_2+X_3)^n=(X_1+X_2+X_3)(X_1+X_2+X_3)\ldots(X_1+X_2+X_3)\quad\text{($n$ factors)}
$$
is expanded, without assuming commutativity of the $X_i$ we get $3^n$ terms. These terms are in one-to-one correspondence with the functions from $\{1,2,\ldots,n\}$ to $\{1,2,3\}$. So, for example, if $n=4$, one of the terms is $X_2X_3X_2X_3$. It corresponds to the function $1\mapsto2$, $2\mapsto3$, $3\mapsto2$, $4\mapsto3$.
If we use commutativity, $X_2X_3X_2X_3$ equals $X_2^2X_3^2$. But $X_2X_3X_2X_3$ is not the only term that equals $X_2^2X_3^2$: any term with two factors $X_2$ and two factors $X_3$ also equals $X_2^2X_3^2$. In the notation of the proof you adapted, $\alpha_2=\alpha_3=2$ and $\alpha_1=0$ for the term $X_2X_3X_2X_3$. Furthermore, because of commutativity, $X_2X_3X_2X_3=\mathbf{X}^\alpha=X_1^0X_2^2X_3^2=X_2^2X_3^2$. So $X_2X_3X_2X_3$ is one of the terms on the left that contributes to the term on the right containing $X_2^2X_3^2$. The other contributing terms are those with the same exponents $\alpha_2=\alpha_3=2$, $\alpha_1=0$, such as $X_2X_2X_3X_3$, $X_3X_2X_2X_3$, and so on. There are six such terms in all, since there are $6=\frac{4!}{2!\,2!}$ ways to arrange the factors of $X_2X_3X_2X_3$.
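The six terms can be listed explicitly; a minimal Python enumeration (my own sketch, identifying each term with the tuple $(g(1),g(2),g(3),g(4))$):

```python
from itertools import product

# All length-4 words in X_1, X_2, X_3 with exactly two factors X_2 and two X_3;
# each word is recorded as the tuple of its subscripts.
terms = [g for g in product((1, 2, 3), repeat=4)
         if g.count(2) == 2 and g.count(3) == 2]
# terms contains the 6 = 4!/(2! 2!) arrangements, e.g. (2, 3, 2, 3)
```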
The main difference between this standard multinomial expansion and your application is that your left-hand side cannot be written as an $n$-fold product. Instead, your left-hand side is defined directly in terms of the functions from $\{1,2,\ldots,n\}$ to $\{1,2,3\}$. Almost everything else works as before. So one of the terms on the left will be $A_2^*A_3^*A_2^*A_3^*A_2A_3A_2A_3$. It can be rearranged using commutativity to get $(A_2^*)^2(A_3^*)^2A_2^2A_3^2$. There are again six terms in total on the left that contribute to $(A_2^*)^2(A_3^*)^2A_2^2A_3^2$.
Second addition: This is a response to the request in the comments for a completely written-out proof and for an explanation of where $\alpha_k$ is used in the proof.
Your proof is complete. The issue may be that the notation used is rather concise, and may require some "mathematical maturity" to decode. In your proof, $\alpha_k$ is used when you write "And because $X_iX_j=X_jX_i$ for all $i,j\in\{1,\cdots,n\}$ then $\mathbf{X}_g=\mathbf{X}^\alpha$ for some $\alpha\in \mathbb{N}^d$ such that $|\alpha|=n$." To translate this to my example, the term $X_2X_3X_2X_3$ corresponds to the function $g$ defined by $g(1)=2$, $g(2)=3$, $g(3)=2$, $g(4)=3$. The statement $\alpha_2=\alpha_3=2$, $\alpha_1=0$ in my answer follows from your definition of $\alpha_k$, since there are two elements $j$ of $\{1,2,3,4\}$ such that $g(j)=2$ (namely $j=1$ and $j=3$), two elements such that $g(j)=3$ (namely $j=2$ and $j=4$), and no elements such that $g(j)=1$. So your statement that I quoted is correct: $$\mathbf{X}_g=X_{g(1)}X_{g(2)}X_{g(3)}X_{g(4)}=X_2X_3X_2X_3=X_2^2X_3^2=X_1^0X_2^2X_3^2=X_1^{\alpha_1}X_2^{\alpha_2}X_3^{\alpha_3}=\mathbf{X}^\alpha.$$
I might suggest some changes in wording. In the line I quote from your proof, I would replace the phrase "for some $\alpha\in \mathbb{N}^d$ such that $|\alpha|=n$" with "for $\alpha$ as defined above, which satisfies $\alpha\in \mathbb{N}^d$ and $|\alpha|=n$", and in the next line of your proof, I would replace "But for each $\alpha$" with "But for each $\alpha$ satisfying these conditions".
Yes, they are related. The coefficients in the case bracket account for the fact that you require $p\leqslant q \leqslant r$, whereas the usual formula for the multinomial theorem makes no such assumption about $\alpha_1$, $\alpha_2$ and $\alpha_3$.
If $p$, $q$ and $r$ are pairwise distinct, the coefficient $\binom{n}{p,q,r}$ would appear $3! = 6$ times in the multinomial formula, corresponding to the ways of assigning the values of $p$, $q$ and $r$ to $\alpha_1$, $\alpha_2$ and $\alpha_3$.
Similarly, if $p=q\neq r$, the coefficient $\binom{n}{p,q,r}$ would appear $3$ times in the multinomial formula, corresponding to the ways of assigning the values of $p$, $q$ and $r$ to $\alpha_1$, $\alpha_2$ and $\alpha_3$.
Finally, if they are all the same, the coefficient appears only once, as there is a single way to assign the values.
So, aside from the restriction that $r$ be odd, $3!$ times your expression would be another representation of the expression you get from the multinomial theorem for $(1+1+1)^n$. Put another way, your expression equals
$$ \frac1{3!}\sum_{\substack{p+q+r=n \\ \max\{p,q,r\}\ \text{odd}}} \binom{n}{p,q,r} $$
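The three orbit counts above (6, 3, and 1) can be checked directly; a minimal Python sketch (my own illustration, with representative triples):

```python
from itertools import permutations

def orbit_size(p, q, r):
    """Number of distinct orderings of the triple (p, q, r),
    i.e. how often the coefficient binom(n; p,q,r) appears in the
    unrestricted multinomial sum."""
    return len(set(permutations((p, q, r))))

# pairwise distinct -> 6 orderings; exactly two equal -> 3; all equal -> 1
cases = [orbit_size(1, 2, 3), orbit_size(2, 2, 3), orbit_size(2, 2, 2)]
```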