Lemma: If $z$ is a root of $p(z)=\sum_{i=0}^n a_i z^i$ where $z$ is non-zero, then $z^{-1}$ is a root of $q(z) = \sum_{i=0}^n a_{n-i}z^i$.
Proof:
Suppose $z$ is a root of $p$, $$\sum_{i=0}^n a_iz^i=0,$$ we divide by $z^n$, then we have
$$\sum_{i=0}^n a_iz^{i-n}=\sum_{i=0}^n a_i(z^{-1})^{n-i}=0.$$
Let $n-i=j$, $$\sum_{j=0}^n a_{n-j}(z^{-1})^{j}=0$$
That is $z^{-1}$ is a root of $q$.
What the lemma says is given a polynomial with $z$ as a non-zero root, then $z^{-1}$ is a root for the polynomial obtained by reversing the coefficient order. Let's apply this result on our problem.
$$f(z)=z^{2021}+z-1$$
If $z$ is a root of $f$, then $z+1$ is a root of
$$g(z)= f(z-1)=(z-1)^{2021}+(z-1)-1=\sum_{i=0}^{2021}a_iz^i$$
where $a_0=-3, a_{1}=2021+1=2022$.
Hence, $$\sum \frac1{z_i+1}=-\frac{a_1}{a_0}=\frac{2022}{3}=674$$
Fixing $z_1$ and $z_2$, the locus of $z_3$ for $L \stackrel{def}{=}|z_1-z_2|^2 + |z_2 - z_3|^2 + |z_3 - z_1|^2 = \text{constant}$ are circles centered at mid-point of $z_1,z_2$. For $z_3$ on the circle $|z| = r_3$ to maximize $L$, the locus circle need to tangent to circle $|z| = r_3$ at $z_3$. This implies the origin $0$ lies on the median of $\triangle z_1z_2 z_3$ (a line passing through $z_3$ and mid-point of $z_1,z_2$).
Apply same argument to other two pairs $z_2,z_3$ and $z_3,z_1$,
we find for the configuration where $L$ is maximized, the origin $0$ lies on the intersection of all three medians of $\triangle z_1z_2z_3$. If $\triangle z_1z_2z_3$ is non-degenerate (ie. $z_1$, $z_2$, $z_3$ are not collinear), the intersection is a single point and $0$ is centroid of $\triangle z_1z_2z_3$. In this case, $z_1 + z_2 + z_3 = 0$.
Update
To summarize, in order to maximize $L$, there are only two possible scenarios:
$z_1, z_2, z_3$ are collinear. In this case, it is trivial to see
$L$ is maximized when $z_1,z_2$ is on opposite side of $z_3$ with respect to origin. A configuration that maximize $L$ is $(z_1,z_2,z_3) = (r_1,r_2,-r_3)$ with corresponding
$$\begin{align}
L = L_{1} &\stackrel{def}{=} (r_1+r_3)^2 + (r_2+r_3)^2 + (r_1-r_2)^2\\
&= 3(r_1^2+r_2^2+r_3^2) - (r_1+r_2-r_3)^2\end{align}
$$
Otherwise, $z_1, z_2, z_3$ are not collinear and $z_1 + z_2 + z_3 = 0$ when $L$ is maximized. By a little bit of algebra, we have
$$L = L_{2} \stackrel{def}{=} 3(r_1^2+r_2^2+r_3^2)$$
The million dollar questions is
which scenarios give us the true maximum?
Notice $r_1 < r_2 < r_3$. When $r_3 > r_1 + r_2$. It is impossible to find $z_1, z_2, z_3$ on the circles to satisfy $z_1 + z_2 + z_3 = 0$. This rule out the $2^{nd}$ scenario and maximum $L$ is $L_1$.
On the other direction, let's say $r_3 < r_1 + r_2$. Among all configuration where $z_1, z_2, z_3$ is collinear, $(z_1,z_2,z_3) = (r_1,r_2,-r_3)$
remains to be a configuration that maximizes $L$.
Consider what happens if we keep $z_3$ fixed, increase the imaginary part of $z_1$ and $-z_2$ by a small $\epsilon$. To second order of $\epsilon$, the real part of $z_1$, $z_2$ will be decreased by amounts
$\frac{\epsilon^2}{2r_1}$ and $\frac{\epsilon^2}{2r_2}$ respectively. Plug this into expression of $L$. To second order of $\epsilon$ again, $L$ will be changed by an amount:
$$\begin{align}
\delta L &=
\underbrace{2(r_3+r_1)\frac{-\epsilon^2}{2r_1}}_{\delta((x_1-x_3)^2)}
+
\underbrace{2(r_3+r_2)\frac{-\epsilon^2}{2r_2}}_{\delta((x_2-x_3)^2)}
+
\underbrace{2(r_2-r_1)\left(\frac{\epsilon^2}{r_1} - \frac{\epsilon^2}{r_2}\right)}_{\delta((x_1-x_2)^2)}
+ 6\epsilon^2 + o(\epsilon^2)\\
&= \frac{(r_1+r_2-r_3)(r_2+r_1)}{r_1r_2} \epsilon^2 + o(\epsilon^2)
\end{align}
$$
When $r_1 + r_2 > r_3$, the coefficient of $\epsilon^2$ in $\delta L$ is positive. This means the "maximum" from $1^{st}$ scenario cannot be the true maximum. The true maximum is $L_2$, the one from $2^{nd}$ scenario.
Combine all these, we have
$$\max L = 3(r_1^2+r_2^2+r_3^2) -
\begin{cases}(r_1 + r_2 - r_3)^2, & r_3 > r_1 + r_2\\0, &\text{ otherwise}
\end{cases}$$
Best Answer
Zero-set
You have correctly identified two transformations which send roots of $f$ to roots of $f$. Explicitly, setting $u(z)=z^2$ and $v(z)=(z-1)^2$, we have $u(S)\subset S$ and $v(S)\subset S$. Let's examine what this tells us about $S$.
First, note that $u(S)\subset S$ implies that $|z|\in\{0,1\}$ for every $z\in S$. Suppose that $S$ contains roots with absolute value in $(0,1)$ and let $w$ be a root with the smallest absolute value in $(0,1)$. Then $|w|^2<|w|$. The contradiction implies that no $w\in S$ has absolute value in $(0,1)$. Analogous argument shows that $S$ does not contain any elements with absolute value greater than $1$.
Second, note that the first fact above together with $v(S)\subset S$ imply that $S\subset\{0,1\}$. We have $v(v(S))\subset S$ and $$ ((z-1)^2-1)^2=(z^2-2z+1-1)^2=z^2(z-2)^2.\tag1 $$ However, the unit circle has diameter two, so the only $z$ with $|z|\in\{0,1\}$ that also satisfy $|z|\cdot|z-2|\in\{0,1\}$ are $z=0$ and $z=1$. Therefore, $S\subset\{0,1\}$.
Leading coefficient
Let $a$ denote the leading coefficient of $f(x)$. The functional equation implies that $a+a^2=0$. Therefore, $a=-1$.
Solution
The facts above imply that $f$ is of the form $f(x)=-x^n(x-1)^m$. Substituting into our functional equation we obtain $$ \begin{align} f(x^2)+f(x)f(x+1)=0\tag2\\ -x^{2n}(x^2-1)^m+x^n(x-1)^m(x+1)^nx^m=0\tag3\\ -x^{2n}(x-1)^m(x+1)^m+x^{n+m}(x-1)^m(x+1)^n=0\tag4\\ \end{align} $$ which implies that $n=m$. Therefore, the set of polynomial solutions of the functional equation consists of $f(x)\equiv 0$ and $f(x)=-x^n(x-1)^n$ for $n=0,1,2,\dots$.