Each player's strategy must be a threshold strategy: reroll any number below a certain threshold. To find the thresholds, we can start from Ross’ approximate solution and iteratively adjust the strategies until they are in equilibrium.
So assume that $B$ rerolls below $6$ and $A$ rerolls below $9$. Then every number of $B$ below $6$ has probability $\frac12\cdot\frac1{10}=\frac1{20}$, and every number $6$ or higher has probability $\frac1{10}+\frac12\cdot\frac1{10}=\frac3{20}$. Every number of $A$ below $9$ has probability $\frac8{20}\cdot\frac1{20}=\frac1{50}$ and every number $9$ or higher has probability $\frac1{20}+\frac8{20}\cdot\frac1{20}=\frac7{100}$.
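These per-number probabilities can be reproduced with exact fractions. This is only a sketch: the die sizes (10 sides for $B$, 20 for $A$) are inferred from the factors $\frac1{10}$ and $\frac1{20}$ above, and the helper name `final_distribution` is made up for illustration.

```python
from fractions import Fraction

def final_distribution(sides, threshold):
    """Distribution of the final number on a fair `sides`-sided die when
    every first roll below `threshold` is rerolled exactly once."""
    p_face = Fraction(1, sides)
    p_reroll = Fraction(threshold - 1, sides)  # first roll was 1..threshold-1
    dist = {}
    for k in range(1, sides + 1):
        kept = p_face if k >= threshold else Fraction(0)  # kept on the first roll
        dist[k] = kept + p_reroll * p_face                # or reached via the reroll
    return dist

dist_b = final_distribution(10, 6)  # B rerolls below 6 (assumed 10-sided die)
dist_a = final_distribution(20, 9)  # A rerolls below 9 (assumed 20-sided die)
print(dist_b[5], dist_b[6])  # 1/20 3/20
print(dist_a[8], dist_a[9])  # 1/50 7/100
```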
If $B$ were to adjust her strategy by keeping $5$, she would get a winning chance of $5\cdot\frac1{50}=\frac1{10}$ instead of the current winning chance of $\frac1{10}\sum_{k=1}^{10}\frac k{50}+3\cdot\frac1{10}\left(\frac7{100}-\frac1{50}\right)$, where the second term corrects for $A$'s $9$ and $10$, which the sum in the first term counts with probability $\frac1{50}$ instead of $\frac7{100}$. This is $\frac{10(10+1)}{2\cdot10\cdot50}+\frac3{200}=\frac{11}{100}+\frac3{200}=\frac{25}{200}=\frac18$. Since $\frac1{10}\lt\frac18$, there is no gain in this strategy change.
On the other hand, if she were to adjust her strategy by rerolling $6$, she would give up the current winning probability of $6\cdot\frac1{50}=\frac3{25}$ to get the above winning probability upon rerolling, $\frac18$. Since this is slightly larger than $\frac3{25}$, she should switch strategies and reroll $6$. (The reason this didn't show up in Ross’ argument is that given $A$'s strategy, $9$ and $10$ are far more likely than the other numbers, so it makes sense for $B$ to try harder to beat them even though it lowers $B$'s average roll.)
Keeping $7$, however, yields a winning probability of $7\cdot\frac1{50}=\frac7{50}\gt\frac18$, so $B$ shouldn't reroll $7$. Thus $B$’s best response to $A$’s current strategy is to reroll $6$ or lower. The probability for $B$ to end up with any particular number below $7$ will then be $\frac6{10}\cdot\frac1{10}=\frac3{50}$, and the probability for any particular number $7$ or higher will be $\frac1{10}+\frac6{10}\cdot\frac1{10}=\frac4{25}$.
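$B$'s best response can be checked by comparing, for each number she might hold, the chance of winning by keeping it with the chance of winning after a fresh roll. A rough sketch, assuming a 20-sided die for $A$, a 10-sided die for $B$, and ties going to $B$ (all inferred from the computations above); the function names are illustrative only.

```python
from fractions import Fraction

A_SIDES, B_SIDES = 20, 10  # assumed die sizes

def final_distribution(sides, threshold):
    """Final-number distribution when rolls below `threshold` are rerolled once."""
    p_face = Fraction(1, sides)
    p_reroll = Fraction(threshold - 1, sides)
    return {k: (p_face if k >= threshold else Fraction(0)) + p_reroll * p_face
            for k in range(1, sides + 1)}

def b_win_if_holding(b, dist_a):
    """B wins when A's final number is at most b (ties assumed to go to B)."""
    return sum(p for a, p in dist_a.items() if a <= b)

dist_a = final_distribution(A_SIDES, 9)  # A rerolls below 9
keep = {b: b_win_if_holding(b, dist_a) for b in range(1, B_SIDES + 1)}
reroll = sum(keep.values()) / B_SIDES    # value of a fresh uniform roll

# Smallest number that is strictly better kept than rerolled.
best_threshold = min(b for b in keep if keep[b] > reroll)
print(reroll, keep[5], keep[6], keep[7])  # 1/8 1/10 3/25 7/50
print(best_threshold)                     # 7, i.e. reroll 6 or lower
```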
Now we should check whether $A$’s strategy is the best response to $B$’s new strategy. If $A$ were to adjust his strategy by keeping $8$, he’d get a winning chance of $1-3\cdot\frac4{25}=\frac{13}{25}$ instead of the current winning chance of $\frac12+\frac1{20}\sum_{k=1}^{10}(k-1)\cdot\frac3{50}+6\cdot\frac1{20}\left(\frac4{25}-\frac3{50}\right)=\frac12+\frac{3\cdot9\cdot10}{2\cdot20\cdot50}+\frac3{100}=\frac12+\frac{27}{200}+\frac3{100}=\frac{133}{200}$. Since $\frac{13}{25}=\frac{104}{200}$ is lower, there is no gain in this strategy change.
Keeping $9$, in turn, yields a winning probability of $1-2\cdot\frac4{25}=\frac{17}{25}=\frac{136}{200}\gt\frac{133}{200}$, so there is no gain in rerolling $9$ either. Thus, $A$’s current strategy of rerolling below $9$ is still the best response to $B$’s new strategy. The overall winning probability of $A$ for this equilibrium strategy pair is
$$
\frac12+\frac1{20}\left(1-\frac4{25}+1-\frac8{25}+8\left(\frac12+\frac1{20}\left(1-\frac4{25}+1-\frac8{25}+1-\frac{12}{25}+\sum_{k=1}^7(k-1)\cdot\frac3{50}\right)\right)\right)
\\
=
\frac12+\frac1{20}\left(\frac{38}{25}+4+\frac25\left(\frac{51}{25}+\frac{6\cdot7\cdot3}{2\cdot50}\right)\right)=\frac{421}{500}=0.842\;,
$$
a slight drop from Ross’ approximation due to the slight improvement in $B$’s strategy.
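The check of $A$'s best response and the overall winning probability can also be done by direct enumeration with exact fractions. This is a sketch under the same inferred assumptions (20-sided die for $A$, 10-sided die for $B$, ties go to $B$, $B$ rerolls $6$ or lower); none of the names below come from the original answer.

```python
from fractions import Fraction

A_SIDES, B_SIDES = 20, 10  # assumed die sizes

def final_distribution(sides, threshold):
    """Final-number distribution when rolls below `threshold` are rerolled once."""
    p_face = Fraction(1, sides)
    p_reroll = Fraction(threshold - 1, sides)
    return {k: (p_face if k >= threshold else Fraction(0)) + p_reroll * p_face
            for k in range(1, sides + 1)}

dist_b = final_distribution(B_SIDES, 7)  # B rerolls 6 or lower

# A wins with final number a only if B's final number is strictly smaller.
keep = {a: sum(p for b, p in dist_b.items() if b < a)
        for a in range(1, A_SIDES + 1)}
reroll = sum(keep.values()) / A_SIDES    # value of a fresh uniform roll

# Smallest number that is strictly better kept than rerolled.
a_threshold = min(a for a in keep if keep[a] > reroll)

# Overall win probability when A follows that threshold on the first roll.
overall = sum(keep[a] if a >= a_threshold else reroll
              for a in range(1, A_SIDES + 1)) / A_SIDES

print(reroll, keep[8], keep[9])  # 133/200 13/25 17/25
print(a_threshold)               # 9, i.e. reroll below 9
print(overall, float(overall))
```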
For your first question (expected value for max value of 5 rolls of a 12-sided die):
Let $X$ be the maximum value.
Then $P(X=k)=\left(\frac{k}{12}\right)^5-\left(\frac{k-1}{12}\right)^5$ for $k=1,2,\dots,12$.
The reasoning is that for the maximum to be $k$, you need all the rolls to be less than or equal to $k$, but not all less than $k$.
So $$E[X]=\sum_{k=1}^{12}k \left( \left(\frac{k}{12}\right)^5-\left(\frac{k-1}{12}\right)^5\right)=\frac{2604108}{12^5}\approx10.4653$$
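The sum is easy to evaluate exactly. Here is a small sketch using Python's `fractions` module; the function name `expected_max` is just for illustration.

```python
from fractions import Fraction

def expected_max(sides, rolls):
    """Expected maximum of `rolls` independent fair `sides`-sided dice,
    using P(X = k) = (k/sides)^rolls - ((k-1)/sides)^rolls."""
    total = Fraction(0)
    for k in range(1, sides + 1):
        p_k = Fraction(k, sides) ** rolls - Fraction(k - 1, sides) ** rolls
        total += k * p_k
    return total

e = expected_max(12, 5)
print(e == Fraction(2604108, 12**5))  # True
print(float(e))                       # about 10.4653
```

The same function covers other die sizes and roll counts; for a single roll, `expected_max(12, 1)` gives the plain mean $\frac{13}2$.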
Best Answer
If you are trying to maximise the expected score, then since the expected value of the 12-sided die is $6.5$, it makes sense to stop when the 8-sided die shows greater than $6.5$, i.e. when it shows $7$ or $8$, each with probability $\frac18$. So with probability $\frac34$ you throw the 12-sided die.
The expected score is then $$7 \times \frac18 + 8 \times \frac18 + 6.5 \times \frac34 = 6.75.$$
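To confirm that stopping on $7$ or $8$ is the best cutoff, one can compare every possible stopping threshold. The sketch below assumes the game is: roll the 8-sided die once, then either keep it or discard it and take a single roll of the 12-sided die; the function name and parameters are illustrative.

```python
from fractions import Fraction

def expected_score(stop_at, first_sides=8, second_sides=12):
    """Expected score when the first die is kept if it shows at least
    `stop_at`, and otherwise replaced by one roll of the second die."""
    second_mean = Fraction(second_sides + 1, 2)  # 6.5 for a 12-sided die
    total = Fraction(0)
    for face in range(1, first_sides + 1):
        value = face if face >= stop_at else second_mean
        total += value * Fraction(1, first_sides)
    return total

# stop_at = 9 means "never keep the 8-sided die".
scores = {t: expected_score(t) for t in range(1, 10)}
best = max(scores, key=scores.get)
print(best, scores[best])  # 7 27/4
```

Here `27/4` is the $6.75$ from above; the neighbouring thresholds $6$ and $8$ both give $6.6875$.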