In Example 1.1 of Eisenbud's Commutative Algebra he writes that the symmetric group $\Sigma = S_r$ acts on the polynomial ring $S = k[x_1, \dots, x_r]$ by
$$\sigma(f)(x_1, \dots, x_r) = f(x_{\sigma^{-1}(1)}, \dots, x_{\sigma^{-1}(r)})$$
for $\sigma \in \Sigma$ and $f \in S$. However, if I let $\sigma, \tau \in \Sigma$ and $f \in S$, I obtain
$$(\sigma\tau)(f)(x_1, \dots, x_r) = f(x_{\tau^{-1}(\sigma^{-1}(1))}, \dots, x_{\tau^{-1}(\sigma^{-1}(r))})$$
but
$$\sigma(\tau(f))(x_1, \dots, x_r) = f(x_{\sigma^{-1}(\tau^{-1}(1))}, \dots, x_{\sigma^{-1}(\tau^{-1}(r))}).$$
Is there a mistake somewhere?
Action of symmetric group on polynomial ring
abstract-algebra, commutative-algebra, group-actions, invariant-theory, polynomials
Related Solutions
Let $G\subset S_n$ be the permutation group in question. Let $A[\mathbf{t}]:=A[t_1,\dots,t_n]$ as a shorthand.
General results:
(a) The invariant ring $A[\mathbf{t}]^G$ is generated by the invariants of degree at most $\max(n,n(n-1)/2)$, a result usually attributed to Manfred Göbel (see here), although it was actually anticipated by Leopold Kronecker (see section 12 of his paper Grundzüge einer arithmetischen Theorie der algebraischen Grössen, Crelle, Journal für die reine und angewandte Mathematik 92:1-122, 1881, reprinted in Werke, vol. 2, 237–387).
(b) If the coefficient ring $A$ is a field of characteristic not dividing the group order $|G|$, then $A[\mathbf{t}]^G$ is free as a module over the subring generated by any homogeneous system of parameters (equivalently, $A[\mathbf{t}]^G$ is Cohen-Macaulay). This result is not specific to permutation groups -- it is a consequence of the Hochster-Eagon theorem. (Though again it happens that Kronecker proved it in the case of a permutation group and a field of characteristic 0.) Then any homogeneous system of parameters for $A[\mathbf{t}]^G$ is called a set of primary invariants, and a module basis over the subring they generate is a set of secondary invariants. There are algorithms based on Gröbner bases to compute primary and secondary invariants, again not specific to permutation groups; see the book by Derksen and Kemper. However, in the case of permutation groups, the elementary symmetric polynomials provide a uniform choice for the primary invariants, and there is a method due to Nicolas Borie that aims for more effective computability of the secondary invariants (see here).
(c) There is also a method due to Garsia and Stanton that produces secondary invariants from a shelling of a certain cell complex (specifically, the quotient of the barycentric subdivision of the boundary of an $(n-1)$-simplex by the $G$ action on the simplex's vertices), when such a shelling exists (see here). When this shelling exists, the assumption that $A$ be a field of characteristic not dividing $|G|$ becomes superfluous, i.e. the secondary invariants produced by the method give a module basis for $A[\mathbf{t}]^G$ over the subring generated by the elementary symmetric polynomials, entirely regardless of $A$. It is not an easy problem to find the shelling in general, but it has been done in specific cases (the original paper by Garsia and Stanton handles the Young subgroups $Y\subset S_n$ [i.e., direct products of smaller symmetric groups acting on disjoint sets of indices], work of Vic Reiner handles alternating subgroups $Y^+\subset S_n$ of Young subgroups $Y$, and diagonally embedded Young subgroups $Y \hookrightarrow Y\times Y \hookrightarrow S_n\times S_n\subset S_{2n}$, and work of Patricia Hersh handles the wreath product $S_2\wr S_n\subset S_{2n}$). There is a detailed development of Garsia and Stanton's shelling result in my thesis, sections 2.5 and 2.8, along with a discussion of its connection to Göbel's work (see last paragraph) and some speculation about generalizations.
(d) From (b) you can see that $A[\mathbf{t}]^G$ has a nice structure of free-module-over-polynomial-subring when $A$ is a field of characteristic not dividing $|G|$, but from (c) you can see that sometimes this nice structure still exists even when $A$ doesn't satisfy this (e.g. perhaps it is $\mathbb{Z}$, or else a field whose characteristic does divide $|G|$). There is a characterization, due to myself and Sophie Marques, of which groups $G\subset S_n$ have the property that this structure in $A[\mathbf{t}]^G$ exists regardless of $A$. It turns out to be the groups generated by transpositions, double transpositions, and 3-cycles.
(Our paper is framed in the language of Cohen-Macaulay rings and is focused on the situation that $A$ is a field. To see that my claim about "any $A$" in the previous paragraph follows, one shows that if for a given $G$, the described structure obtains for $A$ any field, then it also obtains with $A=\mathbb{Z}$ -- this is supposedly well-known, but "just in case", it is written down carefully in section 2.4.1 of my thesis -- and then one notes that a free module basis of $\mathbb{Z}[\mathbf{t}]^G$ over the subring generated by the elementary symmetric polynomials will also be a free module basis of $A[\mathbf{t}]^G$, just by base changing to $A$. See this MSE question for why the base change doesn't mess anything up.)
(e) As lisyarus stated, the special case of $G=A_n$ is well-understood: the invariant ring is generated by the elementary symmetric polynomials and the Vandermonde determinant. Actually this requires the hypothesis that $2$ is a unit in $A$, as you note in comments. If $2$ is not a unit in $A$, one can still generate the invariant ring with the elementary symmetric polynomials and the sum of the positive terms in the Vandermonde determinant (or, the sum of the negative terms). Certain other cases, e.g. $D_4$, also have explicit descriptions coming from Galois theory. The classical material usually assumes $A$ is a field, but see sections 5.4 and 5.5 in Owen Biesel's thesis for $A_n$ and $D_4$; Biesel is working over general $A$.
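To make the role of the Vandermonde determinant in (e) concrete, here is a small numerical spot check in Python. It is only a sketch: it evaluates at one integer point (the point and the helper names are my own choices for illustration), so it illustrates the invariance claims rather than proving them, and it says nothing about generation of the invariant ring.

```python
from itertools import permutations
from math import prod

def sign(p):
    """Sign of a permutation given as a tuple, computed by counting inversions."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def vandermonde(x):
    """Product of (x_j - x_i) over all pairs i < j."""
    n = len(x)
    return prod(x[j] - x[i] for i in range(n) for j in range(i + 1, n))

def positive_part(x):
    """Sum of the Leibniz terms of det(x_i^j) that carry sign +1."""
    n = len(x)
    return sum(
        prod(x[i] ** p[i] for i in range(n))
        for p in permutations(range(n))
        if sign(p) == 1
    )

def permute(x, p):
    """Substitute x_{p(i)} for x_i (0-based indices)."""
    return tuple(x[p[i]] for i in range(len(x)))

# Arbitrary sample point with distinct entries (an assumption for illustration).
POINT = (2, 3, 5, 7)
PERMS = list(permutations(range(len(POINT))))

# The Vandermonde product picks up the sign of the permutation ...
sign_rule_holds = all(
    vandermonde(permute(POINT, p)) == sign(p) * vandermonde(POINT) for p in PERMS
)
# ... while the positive part is unchanged by even permutations.
positive_part_is_even_invariant = all(
    positive_part(permute(POINT, p)) == positive_part(POINT)
    for p in PERMS
    if sign(p) == 1
)
```

Both flags come out true at this point, which matches the statement that the Vandermonde determinant and the sum of its positive terms are fixed by $A_n$. Working with integers keeps the check exact, and the positive part involves no division by $2$.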
I am not sure what you're hoping for in terms of a classification theorem.
(This was really probably more of an MO than an MSE question in the end.)
The problem is that the notation $g^{-1}(x_i)$ looks unambiguous but it's actually not. There are two things you might interpret this to mean and they give two different results.
As usual, things are cleaner if we first don't allow ourselves to work in coordinates. Let $V$ be a finite-dimensional vector space over $k$. There are two polynomial rings you can construct out of $V$, and $GL(V)$ acts on both of them:
- the symmetric algebra $S(V) = \bigoplus_{n \ge 0} S^n(V)$, where $S^n(V)$ is the quotient of the tensor power $V^{\otimes n}$ by the action of $S_n$. The action of $GL(V)$ on $S(V)$ extends the tautological action of $GL(V)$ on $V$ by addition and multiplication.
- the symmetric algebra $S(V^{\ast}) = \bigoplus_{n \ge 0} S^n(V^{\ast})$, whose elements can be interpreted as polynomial functions on $V$ ($S(V)$ is then polynomial functions on $V^{\ast}$). The action of $GL(V)$ on $S(V^{\ast})$ extends the dual action of $GL(V)$ on $V^{\ast}$.
Now we pick a basis $v_1, \dots, v_n \in V$. This induces a dual basis $f_1, \dots, f_n \in V^{\ast}$ defined by $f_i(v_j) = \delta_{ij}$, and these bases allow us to identify our symmetric algebras as polynomial algebras
$$S(V) \cong k[v_1, \dots, v_n]$$ $$S(V^{\ast}) \cong k[f_1, \dots, f_n].$$
If we now use the basis $v_i$ to identify $GL(V)$ with $GL_n(k)$, then $g \in GL_n(k)$ acts on a sum $\sum c_i v_i \in V$ via matrix multiplication of $g$ on the column vector with entries $c_i$ in the usual way. It also acts on a sum $\sum r_i f_i \in V^{\ast}$, but now via multiplication by the matrix $(g^T)^{-1}$ (note the transpose); said another way, it acts via the action of $g^{-1}$ on the row vector with entries $r_i$.
So, explicitly, if $g^{-1} = \left[ \begin{array}{cc} a_{11} & a_{12} \\ a_{21} & a_{22} \end{array} \right]$, then there are two ways to interpret the meaning of $g^{-1}(x_1)$, depending on whether you're thinking of $x_1$ as $v_1$ or as $f_1$:
- As $v_1$: then we identify $x_1$ with the column vector $\left[ \begin{array}{c} 1 \\ 0 \end{array} \right]$, which gives $g^{-1}(x_1) = a_{11} x_1 + a_{21} x_2$.
- As $f_1$: then we identify $x_1$ with the row vector $\left[ \begin{array}{cc} 1 & 0 \end{array} \right]$, which gives $g^{-1}(x_1) = a_{11} x_1 + a_{12} x_2$.
If you choose the covariant interpretation ($x_1 = v_1$), which is implied by "general linear group of $\text{span}(x_1, \dots x_n)$," then you don't need to put an inverse in the definition of the action, but you do need to be careful about how you define it. Safest to say the coordinate-invariant thing.
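The two readings of $g^{-1}(x_1)$ in the $2 \times 2$ example can be checked with a few lines of Python. This is a minimal sketch: the numeric entries standing in for $a_{11}, a_{12}, a_{21}, a_{22}$ are made up for illustration, and the helper names are mine.

```python
def mat_vec(m, v):
    """Column-vector action: the matrix m times the column vector v."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def vec_mat(v, m):
    """Row-vector action: the row vector v times the matrix m."""
    return [sum(v[i] * m[i][j] for i in range(len(v))) for j in range(len(m[0]))]

# Hypothetical entries of g^{-1}: a11 = 2, a12 = 3, a21 = 5, a22 = 7.
g_inv = [[2, 3],
         [5, 7]]

e1 = [1, 0]  # x_1, read either as the column vector v_1 or the row vector f_1

# Coefficients of (x_1, x_2) in g^{-1}(x_1) under each interpretation.
as_v1 = mat_vec(g_inv, e1)  # first column of g^{-1}: [a11, a21]
as_f1 = vec_mat(e1, g_inv)  # first row of g^{-1}:    [a11, a12]
```

The two results agree only when $a_{12} = a_{21}$, which is why the notation needs a stated convention.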
Best Answer
I remember being confused about this too. The map that Eisenbud defines is indeed not a left action of $S_r$ on $k[x_1,\dots,x_r]$, since as noted in your question,
$$\sigma(\tau(f))(x_1,\dots,x_r)=f\bigl(x_{\sigma^{-1}(\tau^{-1}(1))},\dots,x_{\sigma^{-1}(\tau^{-1}(r))}\bigr) \, ,$$
which is $(\tau\sigma)(f)$ rather than $(\sigma\tau)(f)$. To prove this identity, let $g=\tau(f)$. Then
$$\sigma(g)(x_1,\dots,x_r)=g\bigl(x_{\sigma^{-1}(1)},\dots,x_{\sigma^{-1}(r)}\bigr) \, .$$
Writing $y_i$ for $x_{\sigma^{-1}(i)}$, we see that
$$g(y_1,\dots,y_r)=\tau(f)(y_1,\dots,y_r)=f\bigl(y_{\tau^{-1}(1)},\dots,y_{\tau^{-1}(r)}\bigr) \, ,$$
and since $y_{\tau^{-1}(i)}=x_{\sigma^{-1}(\tau^{-1}(i))}$ for all $i$, the result follows. One way to repair this issue is by instead setting
$$\sigma(f)(x_1,\dots,x_r)=f(x_{\sigma(1)},\dots,x_{\sigma(r)}) \, .$$
This defines a left action of $S_r$ on $k[x_1,\dots,x_r]$. Alternatively, we can view Eisenbud's map as a right action on $k[x_1,\dots,x_r]$.
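For anyone who wants to see the left/right distinction by brute force, here is a small Python check over all of $S_3$. It is a sketch under a few assumptions of mine: indices are 0-based, products of permutations are composed right to left (so $\sigma\tau$ means apply $\tau$ first), and the test polynomial and evaluation point are arbitrary choices that happen to distinguish all six permutations.

```python
from itertools import permutations

R = 3  # number of variables; indices are 0-based in this sketch

def compose(s, t):
    """Return the permutation 's after t', i.e. i -> s[t[i]]."""
    return tuple(s[t[i]] for i in range(R))

def inverse(s):
    """Return the inverse permutation of s."""
    inv = [0] * R
    for i, si in enumerate(s):
        inv[si] = i
    return tuple(inv)

def act_eisenbud(s, f):
    """Eisenbud's rule: substitute x_{s^{-1}(i)} for x_i."""
    s_inv = inverse(s)
    return lambda x: f(tuple(x[s_inv[i]] for i in range(R)))

def act_repaired(s, f):
    """The repaired rule: substitute x_{s(i)} for x_i."""
    return lambda x: f(tuple(x[s[i]] for i in range(R)))

def f(x):
    """A test polynomial with no symmetry (chosen for illustration)."""
    return x[0] + 10 * x[1] + 100 * x[2]

POINT = (1, 2, 3)  # arbitrary evaluation point with distinct entries
GROUP = list(permutations(range(R)))

def is_left_action(act):
    """Check s(t(f)) == (st)(f) at POINT for all s, t."""
    return all(
        act(s, act(t, f))(POINT) == act(compose(s, t), f)(POINT)
        for s in GROUP for t in GROUP
    )

def is_right_action(act):
    """Check s(t(f)) == (ts)(f) at POINT for all s, t."""
    return all(
        act(s, act(t, f))(POINT) == act(compose(t, s), f)(POINT)
        for s in GROUP for t in GROUP
    )
```

With these definitions, `is_left_action(act_repaired)` and `is_right_action(act_eisenbud)` hold, while `is_left_action(act_eisenbud)` fails, in line with the computation above. Evaluating at a single point is enough here only because this particular `f` and `POINT` separate the six permutations.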