$T^{ij}$ is nothing more or less than the flow of $i$-momentum across surfaces of constant $j$.^{1} As a result, the force exerted across a surface $S$ with unit normal one-form components $n_j$ has components
$$ F_{(S)}^i = \int\limits_S T^{ij} n_j \,\mathrm{d}^{d-1}x $$
in $d$ dimensions.

The argument for symmetry is not that the cube is static. The argument is that the cube cannot have infinite angular acceleration as its size shrinks. That is, because we are dealing with a continuous fluid, it should be well behaved as we take our region to be arbitrarily small.

Consider a two-dimensional example of a square covering the area $x_0-L/2 < x < x_0+L/2$, $y_0-L/2 < y < y_0+L/2$. The left surface $x = x_0-L/2$ has $n^\mathrm{left}_j = \delta_j^x$ (sign chosen to correspond to flow of momentum into the square), and so the force on our square due to interactions across the left face has components $F_\mathrm{left}^i \approx L T_\mathrm{left}^{ix}$, where $T_\mathrm{left}^{ix}$ is $T^{ix}$ evaluated at the midpoint $(x_0-L/2,y_0)$. On the opposite surface, $n^\mathrm{right}_j = -\delta_j^x$, so $F_\mathrm{right}^i \approx -L T_\mathrm{right}^{ix}$. Similarly, $F_\mathrm{bottom}^i \approx L T_\mathrm{bottom}^{iy}$ and $F_\mathrm{top}^i \approx -L T_\mathrm{top}^{iy}$.

Torque is a $(d-2)$-form: $\tau = {}^*(\tilde{r} \wedge \tilde{F})$, with $\tilde{r}$ and $\tilde{F}$ the one-forms corresponding to displacement $\vec{r}$ (measured from the center of the square) and force $\vec{F}$. In 2D this reduces to the scalar $\tau = \tfrac{1}{2} \epsilon_{ij} (r^i F^j - r^j F^i) = \epsilon_{ij} r^i F^j = r^x F^y - r^y F^x$. If force components $F_\mathrm{left}^i$ are applied at $(x_0-L/2,y_0)$, then $r_\mathrm{left}^i = -(L/2) \delta^i_x$ and $\tau_\mathrm{left} \approx -(1/2) L^2 T_\mathrm{left}^{yx}$. You can also check that $\tau_\mathrm{right} \approx -(1/2) L^2 T_\mathrm{right}^{yx}$, $\tau_\mathrm{bottom} \approx (1/2) L^2 T_\mathrm{bottom}^{xy}$, and $\tau_\mathrm{top} \approx (1/2) L^2 T_\mathrm{top}^{xy}$.

As we shrink the square down, the midpoints at which we evaluate $T^{ij}$ approach one another and we find the total torque is $\tau \approx L^2 (T_\mathrm{center}^{xy} - T_\mathrm{center}^{yx})$. However, the moment of inertia for a square of surface density $\sigma$ is $\sigma L^4/6$. Thus angular acceleration is
$$ \alpha = \lim_{L\to0} \frac{6(T^{xy}-T^{yx})}{\sigma L^2} $$
at any point in the fluid. Thus we must have $T^{xy} = T^{yx}$, in order to avoid $\alpha \to \infty$. Note that this argument holds in higher dimensions, in more general settings than fluids, and for more general geometries/spacetimes.^{2}
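As a quick numerical check of the torque bookkeeping, here is a minimal Python sketch. It assumes NumPy, a stress tensor that is uniform over the square (that is, already evaluated at the center), and the inward-normal sign convention used above. The function names `net_torque` and `angular_acceleration` are hypothetical, chosen only for illustration.

```python
import numpy as np

def net_torque(T, L):
    """Net torque about the center of a square of side L.

    T is a 2x2 array with T[i, j] = T^{ij}, taken as uniform over the square.
    """
    # (displacement of the face midpoint from the center, inward normal one-form n_j)
    faces = [
        (np.array([-L / 2, 0.0]), np.array([1.0, 0.0])),   # left
        (np.array([L / 2, 0.0]), np.array([-1.0, 0.0])),   # right
        (np.array([0.0, -L / 2]), np.array([0.0, 1.0])),   # bottom
        (np.array([0.0, L / 2]), np.array([0.0, -1.0])),   # top
    ]
    torque = 0.0
    for r, n in faces:
        F = L * (T @ n)                      # F^i = L T^{ij} n_j
        torque += r[0] * F[1] - r[1] * F[0]  # tau = r^x F^y - r^y F^x
    return torque

def angular_acceleration(T, L, sigma):
    """Torque divided by the moment of inertia sigma * L**4 / 6 of the square."""
    return net_torque(T, L) / (sigma * L**4 / 6)

# An asymmetric stress tensor: T^{xy} = 2.0, T^{yx} = 0.5
T_asym = np.array([[1.0, 2.0],
                   [0.5, 3.0]])
for L in (1.0, 0.1, 0.01):
    print(L, net_torque(T_asym, L), angular_acceleration(T_asym, L, sigma=1.0))
```

The torque should track $L^2 (T^{xy} - T^{yx})$ while the angular acceleration grows like $1/L^2$; symmetrizing `T_asym` makes both vanish.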

The argument does not, however, carry over to linear acceleration. For example, the net $x$-force has terms like $L (T_\mathrm{left}^{xx} - T_\mathrm{right}^{xx})$ and $L (T_\mathrm{bottom}^{xy} - T_\mathrm{top}^{xy})$. Even though the mass is $\sigma L^2$, which would seem to imply that linear acceleration goes as $1/L$, each pair of stress tensor components cancels to leading order, because in the limit all of them are evaluated at the same point ($T_\mathrm{left}, T_\mathrm{right}, T_\mathrm{bottom}, T_\mathrm{top} \to T_\mathrm{center}$). What remains of each difference is of order $L$ times a gradient of $T^{ij}$, so the net force scales as $L^2$ and the acceleration stays finite. No constraint on $T^{ij}$ follows from this consideration.
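To make the scaling of the net force explicit, one can Taylor-expand the face values about the center of the square. This is a sketch using the same sign conventions as in the square example and assuming $T^{ij}$ is differentiable:

$$ F^x \approx L\,(T_\mathrm{left}^{xx} - T_\mathrm{right}^{xx}) + L\,(T_\mathrm{bottom}^{xy} - T_\mathrm{top}^{xy}) \approx -L^2 \left( \partial_x T^{xx} + \partial_y T^{xy} \right), \qquad a^x = \frac{F^x}{\sigma L^2} \approx -\frac{1}{\sigma}\, \partial_j T^{xj} $$

This is finite as $L \to 0$. It is the local momentum balance rather than a restriction on the components of $T^{ij}$.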

^{1}I avoid saying "in the $j$-direction," since we are really interested in surfaces, and these are characterized by one-forms, not vectors. This is more apparent in non-Cartesian (better still, non-diagonal) coordinate systems.

^{2}This symmetry always holds, for any material or field, as long as momentum is conserved. You occasionally see reference to the antisymmetric part of the stress tensor, but this comes from splitting the physics into separate domains, and pretending that momentum is lost when going from one to another (e.g. torques can transfer angular momentum from bulk flow into particle spins, and we choose to treat the latter as some momentum-conservation-violating sink as far as the continuum-modeled fluid is concerned).

The usual picture you see on Wikipedia and in other sources is indeed oversimplified. Viscosity resists more general velocity gradients, not just pure shear flows with $\nabla_x v_y\neq 0$. For example, if you have a compressible fluid undergoing a non-isotropic scaling expansion
$$
\vec{v} = (\alpha x,\beta y,\gamma z)
$$
then shear viscosity will try to equalize the expansion rates $\alpha\simeq \beta \simeq \gamma$. Bulk viscosity will resist the overall expansion of the fluid.

You can view your example
$$
\vec{v} = (x,0,0) = \frac{1}{3} \left( (2x,-y,-z)+(x,y,z) \right)
$$
as a linear combination of anisotropic shear flow, and pure expansion. Shear viscosity counteracts the first, and bulk viscosity the second term.
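The same split can be done numerically for any linear flow. The sketch below assumes NumPy and a constant velocity-gradient matrix `grad_v[i, j]` $= \partial v_i / \partial x_j$; the helper name `split_velocity_gradient` is hypothetical. It separates the symmetric part of the gradient into a traceless piece (acted on by shear viscosity) and an isotropic piece (acted on by bulk viscosity).

```python
import numpy as np

def split_velocity_gradient(grad_v):
    """Return (traceless symmetric part, isotropic part) of a velocity gradient."""
    grad_v = np.asarray(grad_v, dtype=float)
    d = grad_v.shape[0]
    sym = 0.5 * (grad_v + grad_v.T)                  # strain-rate tensor
    expansion = (np.trace(sym) / d) * np.eye(d)      # isotropic expansion
    shear = sym - expansion                          # traceless, volume-preserving
    return shear, expansion

# v = (x, 0, 0) has gradient diag(1, 0, 0)
shear, expansion = split_velocity_gradient(np.diag([1.0, 0.0, 0.0]))
print(np.diag(shear))      # diagonal of the traceless part
print(np.diag(expansion))  # diagonal of the isotropic part
```

For $\vec{v} = (x,0,0)$ the two diagonals are $\tfrac{1}{3}(2,-1,-1)$ and $\tfrac{1}{3}(1,1,1)$, matching the decomposition written out above.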

## Best Answer

The definition of the viscous stress tensor for an incompressible Newtonian fluid is (in index notation):

$$ \tau_{ij} = \mu \left(\frac{\partial u_i}{\partial x_j}+\frac{\partial u_j}{\partial x_i} \right)$$

So, if you look at a diagonal component $\tau_{ii}$ (no sum over $i$), you get $2\mu\frac{\partial u_i}{\partial x_i}$. That's really where the factor of 2 comes from, not from "averaging" over any fluid element or anything like that. It's just due to the symmetry of the tensor.
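A minimal sketch of this in Python, assuming NumPy and a velocity-gradient matrix `grad_u[i, j]` $= \partial u_i / \partial x_j$ (the function name `viscous_stress` and the sample numbers are illustrative only):

```python
import numpy as np

def viscous_stress(grad_u, mu):
    """tau_ij = mu * (du_i/dx_j + du_j/dx_i) for a given velocity gradient."""
    grad_u = np.asarray(grad_u, dtype=float)
    return mu * (grad_u + grad_u.T)

# Sample 2D gradient with zero trace (incompressible): du_x/dx = 2, du_y/dy = -2
grad_u = np.array([[2.0, 1.0],
                   [3.0, -2.0]])
tau = viscous_stress(grad_u, mu=0.5)
print(tau)
```

On the diagonal the two terms in the definition coincide, so `tau[0, 0]` equals `2 * mu * grad_u[0, 0]`, while the off-diagonal entries combine two different derivatives.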