Update with a clearer answer:
Here's a plot of all the velocities involved in shock propagation through a stationary medium:
![shock wave velocities relative to incoming air vs. shock mach number](https://i.stack.imgur.com/eFITH.png)
The x-axis is the Mach number of the shock wave and represents the strength of the shock wave. It could just as well have been velocity, pressure ratio, or any other quantity that is monotonic with shock strength.
The y-axis is velocity relative to the still air.
- In solid red we have the velocity of the air entering the shock wave which in this reference frame is still, and thus 0.
- In solid blue we have the velocity of the shock wave.
- In solid green we have the velocity of the air after exiting the shock wave.
In dashed lines I've added to the graph the maximum and minimum velocities at which a sound wave could travel (moving with the shock and opposite the shock, respectively). The velocity of a sound wave is relative to the average velocity of the medium it is traveling through, so I've colored these lines according to the medium they are traveling through.
As noted by the OP and the quotation above, the velocity of the shock (in blue) is always higher than the velocity of sound in the entering air. However, it's always less than the forward velocity of sound in the exiting medium.
Thus a pressure wave generated by a plane increasing in velocity can propagate to catch up to the shock wave and push it to go even faster. Similarly, if the plane slows down the lower pressure wave can also catch up to the shock wave and slow it down. This is the same propagation mechanism as in longitudinal sound waves.
The fact that the shock wave is traveling faster than sound in the still medium isn't a problem, because the shock wave is being generated and pushed forward by the exiting medium, and relative to the exiting medium the shock wave is traveling at less than the speed of sound.
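The curves in the plot can be approximated with the standard normal-shock relations. The following is only a minimal sketch, assuming an ideal gas with constant $\gamma = 1.4$; the function name `shock_velocities` and the choice to express every velocity in units of the sound speed of the still air are illustrative, not taken from the original plot.

```python
import math

GAMMA = 1.4  # assumed ratio of specific heats (ideal diatomic gas)


def shock_velocities(mach_shock, gamma=GAMMA):
    """Velocities relative to the still air, in units of its sound speed a1."""
    m = mach_shock
    # Gas velocity induced behind a normal shock moving into still gas.
    gas = 2.0 / (gamma + 1.0) * (m - 1.0 / m)
    # Static temperature ratio T2/T1 across the shock.
    t_ratio = ((2.0 * gamma * m**2 - (gamma - 1.0))
               * ((gamma - 1.0) * m**2 + 2.0)
               / ((gamma + 1.0) ** 2 * m**2))
    a2 = math.sqrt(t_ratio)  # sound speed in the exiting air, a2/a1
    return {
        "shock": m,                # solid blue line
        "gas": gas,                # solid green line
        "a2": a2,
        "sound_fwd": gas + a2,     # upper dashed green line
        "sound_back": gas - a2,    # lower dashed green line
    }


for mach in (1.5, 2.0, 3.0, 5.0):
    v = shock_velocities(mach)
    print(f"M={mach}: shock={v['shock']:.3f}, gas={v['gas']:.3f}, "
          f"forward sound={v['sound_fwd']:.3f}")
```

Under these assumptions the shock speed sits between the sound speed of the still air (1 in these units) and the forward sound velocity in the exiting air for each Mach number tried.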
Change in speed of sound
The fact that the speed of sound changes across the shock wave is irrelevant to this analysis. It was accounted for in the creation of the graph, as can be seen by the green dashed lines diverging. However, even if they had not diverged at all, the shock wave would still be within the speed of sound in the exiting medium. Similarly, if the speed of sound of the exiting medium were applied to the entering medium, the shock would still fall outside that speed of sound. (Doing this doesn't make physical sense, but is just to demonstrate that the change of speed of sound is irrelevant to answering the question.)
Pseudo speeds of sound are dotted (sound velocities traveling in the opposite direction to the shock have been removed for clarity):
![incorrect speeds of sound](https://i.stack.imgur.com/bhlGH.png)
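The two "pseudo speed of sound" comparisons can be checked numerically. This is a sketch under the same ideal-gas, constant $\gamma = 1.4$ assumption; the helper names are hypothetical and velocities are again in units of the still air's sound speed.

```python
import math


def induced_gas_speed(m, gamma=1.4):
    """Gas velocity behind the shock, relative to the still air (units of a1)."""
    return 2.0 / (gamma + 1.0) * (m - 1.0 / m)


def sound_speed_ratio(m, gamma=1.4):
    """a2/a1 from the normal-shock temperature ratio."""
    t_ratio = ((2.0 * gamma * m**2 - (gamma - 1.0))
               * ((gamma - 1.0) * m**2 + 2.0)
               / ((gamma + 1.0) ** 2 * m**2))
    return math.sqrt(t_ratio)


def pseudo_checks(m, gamma=1.4):
    shock = m
    # Check 1: pretend the exiting air kept the entering sound speed (a2 := a1).
    inside_exiting = shock < induced_gas_speed(m, gamma) + 1.0
    # Check 2: pretend the still air had the exiting sound speed (a1 := a2).
    outside_entering = shock > sound_speed_ratio(m, gamma)
    return inside_exiting, outside_entering


for mach in (1.5, 2.0, 4.0, 6.0):
    print(mach, pseudo_checks(mach))
```

With these assumptions the second check passes at every Mach number tried, while the first passes for shock Mach numbers below roughly 5 and fails above that, so that particular statement appears to hold only over a limited range of shock strengths.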
Old Answer
Sound waves travel at the speed of sound relative to the average velocity of the medium. In the case of a shock wave, the time-averaged velocity of the medium is different on the two sides of the shock wave.
Shock Wave's Perspective
In the frame of reference where the shock wave is stationary, the entering medium travels towards the shock wave at supersonic speed, and the exiting medium travels away from the shock wave at subsonic speed.
This is the usual frame of reference used to analyze shock waves and is used in shock tables.
Exiting Medium's Perspective
In the frame of reference of the exiting medium, the shock wave travels outward at subsonic speed and the entering medium travels inward at supersonic speed.
![enter image description here](https://i.stack.imgur.com/DVAWz.png)
Entering Medium's Perspective
Finally, in the frame of reference of the entering medium, the shock wave travels inward at supersonic speed, and the exiting medium exits at a lesser supersonic speed.
This is the frame of reference used in the article as the entering fluid is the atmosphere that the plane is flying through and is thus the assumed rest frame.
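The three perspectives differ only by a constant velocity shift, which can be sketched as follows. This assumes an ideal gas with $\gamma = 1.4$ and uses the normal-shock density ratio to get the exiting speed in the shock's frame; the function name and sign convention (positive along the flow through the shock) are illustrative choices, and all values are in units of the entering medium's sound speed.

```python
def frame_velocities(mach_shock, gamma=1.4):
    """Velocities of shock, entering and exiting medium in each rest frame."""
    m = mach_shock
    # Density ratio rho2/rho1 across a normal shock.
    density_ratio = (gamma + 1.0) * m**2 / ((gamma - 1.0) * m**2 + 2.0)
    # Shock frame: mass conservation gives the exiting speed.
    shock_frame = {"shock": 0.0, "entering": m, "exiting": m / density_ratio}

    def shifted(rest_key):
        # Galilean shift so that `rest_key` is at rest.
        offset = shock_frame[rest_key]
        return {name: vel - offset for name, vel in shock_frame.items()}

    return {
        "shock": shock_frame,
        "exiting": shifted("exiting"),
        "entering": shifted("entering"),
    }


frames = frame_velocities(5.0)
for name, velocities in frames.items():
    print(name, {k: round(v, 3) for k, v in velocities.items()})
```

For a shock Mach number of 5 this gives, in the exiting medium's frame, a shock moving at about 1 and entering medium at about 4 (in units of the entering sound speed). Whether the shock is subsonic there should be judged against the hotter exiting medium's own sound speed, which is larger.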
Conclusion
The shock wave travels at the speed of sound relative to a weighted average of medium velocity, and is thus not an exception to the rule that waves travel at the speed of sound relative to the average velocity of the medium.
Note that the speed of sound does depend on temperature, $a=\sqrt{\gamma\,R\,T}$, and that the temperature changes across a shock wave. However, this effect is not as large as the velocity differences due to the change in reference frame. The figures provided above are to scale using Mach numbers for an entering speed of Mach 5, so those arrows ignore the change in speed of sound. However, if the changes in speed of sound were accounted for, my conclusion would still hold.
Additionally, for high Mach numbers the high temperature will cause the ratio of specific heats to deviate, resulting in a more complex formula for the speed of sound:
$$a = \sqrt{ R\,T \left(1 + \frac{\gamma - 1}{ 1 + (\gamma-1)\,\frac{(\theta/T)^2\, e^{\theta/T} }{\left(e^{\theta/T} -1\right)^2}} \right)}$$
This compensation will actually decrease the amount by which the speed of sound is affected by the change in temperature.
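To see the size of this correction, the two formulas can be compared directly. The sketch below assumes dry air with $R \approx 287.05$ J/(kg K), a perfect-gas $\gamma = 1.4$, and a vibrational temperature $\theta \approx 3056$ K; these constants and the function names are assumptions for illustration, not values given in the answer.

```python
import math

R_AIR = 287.05         # J/(kg K), assumed specific gas constant of dry air
GAMMA_PERFECT = 1.4    # assumed low-temperature ratio of specific heats
THETA = 3056.0         # K, assumed vibrational temperature for air


def gamma_effective(temp_k):
    """Temperature-dependent ratio of specific heats from the formula above."""
    x = THETA / temp_k
    # x^2 e^x / (e^x - 1)^2, rewritten with e^-x to avoid overflow at low T.
    vib = x**2 * math.exp(-x) / (1.0 - math.exp(-x)) ** 2
    return 1.0 + (GAMMA_PERFECT - 1.0) / (1.0 + (GAMMA_PERFECT - 1.0) * vib)


def sound_speed(temp_k, imperfect=True):
    """Speed of sound in m/s, with or without the vibrational correction."""
    g = gamma_effective(temp_k) if imperfect else GAMMA_PERFECT
    return math.sqrt(g * R_AIR * temp_k)


for temp in (300.0, 1000.0, 2000.0):
    print(f"T={temp:.0f} K: perfect={sound_speed(temp, False):.1f} m/s, "
          f"corrected={sound_speed(temp):.1f} m/s")
```

Near room temperature the two values are almost identical, while at high temperature the corrected speed of sound is lower than the simple $\sqrt{\gamma R T}$ value.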
A fake derivation
We can rather easily compute a horizontal velocity for the string if we assume that the total velocity vector is everywhere normal to the string (this assumption is not always valid, see below). The following picture then illustrates the computation:
![enter image description here](https://i.stack.imgur.com/ne5Zi.png)
Take two infinitesimally separated points $x$ and $x+\mathrm{d}x$ and let the wave motion be $\varphi(x,t)$. The vertical/transverse velocity is $v_\text{vert} = \partial_t \varphi(x,t)$, and the horizontal component is $v_\text{hor} = -v_\text{vert}\tan(\vartheta)$, where $\vartheta$ is the angle between the normal and the vertical, and the minus sign is because if we measure $\vartheta$ in the usual counterclockwise direction then the horizontal velocity points to $-x$ for small $\vartheta$. Now $\tan(\vartheta)$ is $\frac{\varphi(x+\mathrm{d}x) - \varphi(x)}{\mathrm{d}x} = \partial_x\varphi(x)$, so we get
$$ v_\text{hor} = -\partial_t\varphi\partial_x\varphi$$
and if you plug in the sinusoidal solution and take the time average you get exactly the same result as for longitudinal waves. However, you might protest: the transverse wave equation was derived assuming no longitudinal motion, and this computation just blatantly assumes something different.
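As a worked check of that time average, assume (purely for illustration) a traveling sinusoid $\varphi = A\sin(kx-\omega t)$; the symbols $A$, $k$ and $\omega$ are introduced here and do not appear in the text above. Then

$$ \partial_t\varphi = -A\omega\cos(kx-\omega t), \qquad \partial_x\varphi = Ak\cos(kx-\omega t) $$

so that

$$ v_\text{hor} = -\partial_t\varphi\,\partial_x\varphi = A^2 k\,\omega\cos^2(kx-\omega t) \quad\Longrightarrow\quad \langle v_\text{hor}\rangle = \tfrac{1}{2}A^2 k\,\omega $$

since $\cos^2$ averages to $\tfrac{1}{2}$ over a period. For $k\omega > 0$ the average is positive, i.e. it points along the direction of propagation.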
A Lagrangian derivation
Oddly enough, the result of the above computation is the correct momentum for a pure transverse wave. The Lagrangian of a transverse wave is
$$ L = \frac{1}{2}\rho (\partial_t\varphi)^2 - \frac{1}{2}\tau(\partial_x\varphi)^2$$
and translation invariance gives us a momentum density
$$ T_{xt} = \partial_x L \partial_t \varphi = - \rho\partial_x\varphi\partial_t\varphi$$
which is conserved by Noether's theorem.
The actual answer
In reality, there are no purely transverse waves on a string; secondary longitudinal waves will always be generated when trying to excite it purely transversely. The "true" momentum of a realistic "transverse" wave is rather half of the theoretical prediction, i.e. $-\frac{1}{2}\rho\partial_t\varphi\partial_x\varphi$. For more on this, see "The missing wave momentum mystery" by Rowland and Pask.
Best Answer
Your parameter $E$ is the bulk modulus, and this is a measure of how compressible the medium is. Easily compressible media like gases have a low value of $E$ while almost incompressible fluids like water have a very high value for $E$. Actually we should really use the symbol $K$ rather than $E$, because $E$ is normally used for the Young's modulus.
And there is the answer to your question. Water does indeed have a higher density than air (by a factor of about 800) but it is much, much less compressible than air so the value of $E$ is around 20,000 times higher. The end result is that the value of $E/\rho$ is higher in water than air so the speed of sound is greater.
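A quick numerical comparison makes the point. The values below are rough, assumed room-condition figures (adiabatic bulk modulus for air, about $\gamma p$), not numbers from the question, so the resulting modulus ratio only matches the figure quoted above to order of magnitude.

```python
import math


def sound_speed(bulk_modulus_pa, density_kg_m3):
    """Speed of sound in a fluid: v = sqrt(K / rho)."""
    return math.sqrt(bulk_modulus_pa / density_kg_m3)


# Rough, assumed values near room conditions.
AIR = {"K": 1.42e5, "rho": 1.2}        # Pa, kg/m^3
WATER = {"K": 2.2e9, "rho": 1000.0}    # Pa, kg/m^3

v_air = sound_speed(AIR["K"], AIR["rho"])
v_water = sound_speed(WATER["K"], WATER["rho"])

print(f"density ratio  water/air: {WATER['rho'] / AIR['rho']:.0f}")
print(f"modulus ratio  water/air: {WATER['K'] / AIR['K']:.0f}")
print(f"speed of sound in air:   {v_air:.0f} m/s")
print(f"speed of sound in water: {v_water:.0f} m/s")
```

The modulus ratio is far larger than the density ratio, so $K/\rho$, and with it the speed of sound, comes out higher in water.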
Strictly speaking your equation applies only to gases and liquids. In solids you also need to take account of the shear modulus, and the expression becomes:
$$ v = \sqrt{\frac{K + \tfrac{4}{3}G}{\rho}} $$
where $K$ and $G$ are the bulk modulus and shear modulus respectively. See "Why does sound travel faster in iron than mercury even though mercury has a higher density?" for more details.
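The effect of the shear term can be illustrated with the iron and mercury case mentioned above. The material constants below are approximate, assumed handbook-order values, and the function name is illustrative; a liquid such as mercury is treated as having $G = 0$.

```python
import math


def longitudinal_speed(bulk_pa, shear_pa, density_kg_m3):
    """Longitudinal sound speed: v = sqrt((K + 4/3 G) / rho)."""
    return math.sqrt((bulk_pa + 4.0 / 3.0 * shear_pa) / density_kg_m3)


# Approximate, assumed material constants.
IRON = {"K": 170e9, "G": 82e9, "rho": 7870.0}
MERCURY = {"K": 28.5e9, "G": 0.0, "rho": 13534.0}  # liquid: no shear stiffness

v_iron = longitudinal_speed(IRON["K"], IRON["G"], IRON["rho"])
v_mercury = longitudinal_speed(MERCURY["K"], MERCURY["G"], MERCURY["rho"])

print(f"iron:    {v_iron:.0f} m/s")
print(f"mercury: {v_mercury:.0f} m/s")
```

With these values iron comes out several times faster than mercury, because its larger stiffness outweighs mercury's higher density.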