I don't think median $\pm$ MAD is appropriate in general.
You can easily build distributions where 50% of the data are fractionally lower than the median and 50% are spread out far above it, e.g. (4.9, 4.9, 4.9, 4.9, 5, 1000000, 1000000, 100000, 1000000). The 5 $\pm$ 0.10 notation seems to suggest that there's some mass around median + MAD $\approx$ 5.10, which just isn't always the case, and it gives you no idea that there's a big mass over near 1000000.
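To make the example concrete, here is a minimal Python sketch (standard library only) that computes the median and MAD for the nine values above. It assumes MAD means the median absolute deviation from the median, which is what the 0.10 figure corresponds to; the variable names are mine.

```python
from statistics import median

# The nine values from the example above.
data = [4.9, 4.9, 4.9, 4.9, 5, 1000000, 1000000, 100000, 1000000]

med = median(data)
# Assumption: MAD = median of the absolute deviations from the median.
mad = median(abs(x - med) for x in data)

# How many points actually lie in the interval median +/- MAD?
tol = 1e-9  # small tolerance for floating-point rounding
inside = sum(1 for x in data if abs(x - med) <= mad + tol)

print(f"median = {med}, MAD = {mad:.2f}")
print(f"{inside} of {len(data)} points lie within median +/- MAD")
```

Nothing in the two summary numbers hints at the four values sitting far out to the right.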
Quartiles/quantiles give a much better idea of the distribution at the cost of an extra number: (4.9, 5.0, 1000000.0). I doubt it's entirely a coincidence that skewness is based on the third moment and that I seem to need three numbers/dimensions to intuitively visualize a skewed distribution.
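As a rough check of those three numbers, the quartiles can be computed with Python's `statistics.quantiles`. The choice of the "inclusive" method is my assumption; quantile conventions differ between packages, though for this data set they should agree closely.

```python
from statistics import quantiles

# The nine values from the example above.
data = [4.9, 4.9, 4.9, 4.9, 5, 1000000, 1000000, 100000, 1000000]

# Assumption: the "inclusive" method, which treats the sample as containing
# the smallest and largest possible values; other conventions exist.
q1, q2, q3 = quantiles(data, n=4, method="inclusive")

print(q1, q2, q3)
```

The third number is what reveals the mass near 1000000 that median $\pm$ MAD hides.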
That said, there's nothing wrong with it per se - I'm just arguing intuitions and readability here. If you're using it for yourself or your team, go crazy. But I think it would confuse a broad audience.
I disagree with the advice as a flat-out rule. (It's not common to all books.)
The issues are more subtle.
If you're actually interested in making inference about the population mean, the sample mean is at least an unbiased estimator of it, and it has a number of other advantages. In fact, see the Gauss-Markov theorem: it is the best linear unbiased estimator.
If your variables are heavily skewed, the problem comes with 'linear': in some situations all linear estimators may be bad, so the best of them may still be unattractive. A nonlinear estimator of the mean may then be better, but it would require knowing something (or even quite a lot) about the distribution. We don't always have that luxury.
If you're not necessarily interested in inference about a population mean (say, "what's a typical age?", or whether there's a more general location shift from one population to another, which might be phrased in terms of any measure of location, or even a test of whether one variable is stochastically larger than another), then casting the question in terms of the population mean is either unnecessary or, in the last case, likely counterproductive.
So I think it comes down to thinking about:
What are your actual questions? Is the population mean even a good thing to be asking about in this situation?
What is the best way to answer the question given the situation (skewness in this case)? Is using sample means the best approach to answering our questions of interest?
It may be that your questions are not directly about population means, yet sample means are still a good way to address them. For example, the population median of a waiting time that you assume to be exponentially distributed is better estimated as a particular fraction ($\ln 2$) of the sample mean than by the sample median. Or vice versa: the question might be about population means, but sample means might not be the best way to answer it.
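The exponential example can be illustrated with a small simulation. This is only a sketch under assumptions I've picked for illustration (a true mean of 10, samples of size 25, 2000 replications, a fixed seed). It uses the fact that an exponential distribution with mean $\theta$ has median $\theta\ln 2$, and compares the two estimators of the median by mean squared error.

```python
import math
import random
from statistics import mean, median

random.seed(0)                          # fixed seed so the run is reproducible
true_mean = 10.0                        # hypothetical mean waiting time
true_median = true_mean * math.log(2)   # median of an exponential distribution
n, reps = 25, 2000                      # sample size and number of replications

sq_err_from_mean = 0.0
sq_err_from_median = 0.0
for _ in range(reps):
    sample = [random.expovariate(1 / true_mean) for _ in range(n)]
    # Estimator 1: ln(2) times the sample mean (relies on the exponential assumption).
    sq_err_from_mean += (math.log(2) * mean(sample) - true_median) ** 2
    # Estimator 2: the plain sample median.
    sq_err_from_median += (median(sample) - true_median) ** 2

mse_from_mean = sq_err_from_mean / reps
mse_from_median = sq_err_from_median / reps
print(f"MSE using ln(2) * sample mean: {mse_from_mean:.3f}")
print(f"MSE using sample median:       {mse_from_median:.3f}")
```

The advantage of the mean-based estimator depends entirely on the exponential assumption being right; if it is wrong, $\ln 2$ times the sample mean need not estimate the median at all.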
Best Answer
For one, distributions more readily have a finite mean than a median or mode. Primary school analyses of these concepts for samples can obscure this issue with random variables. For the median to exist, you either require the CDF to equal exactly $\frac12$ at some value (which is far from guaranteed for discrete distributions), or you have to define how to fudge a median from the values that get as close as possible. For exactly one mode to exist, you need a unimodal distribution. For the mean to exist, you only need $\int_{\Bbb R}x\,dF(x)$ to be finite, with $F$ the CDF. What's more, if you want to define something analogous to $\Bbb E(X-\mu)^2$ with something replacing $\mu:=\Bbb EX$, why stop there? Do we want to define the "median-variance", for example, as the median of $(X-m)^2$, with $m$ the median of $X$? That adds further constraints, hard to fulfill and compute with, on the distributions we can work with. Of course, finite-mean distributions don't have to have finite variance, but you only need slightly lighter tails in the distribution to fix that.
Another issue is that we like to analyze the relation between two variables with quantities such as the covariance, which has a wonderful geometric interpretation I've discussed on math.se. How do you modify that concept to use something other than means? Say we keep using $\Bbb E$ as a wrapper, just to avoid another difficult question. If $m_X$ is the median of $X$, do you want to define the covariance as $\Bbb EXY-m_Xm_Y$, or as $\Bbb E(X-m_X)(Y-m_Y)=\Bbb EXY-m_X\mu_Y-\mu_Xm_Y+m_Xm_Y$? These aren't equivalent! In light of the above link, my guess is you'd prefer the second definition. But it's a really strange one, since it uses the means anyway. It gets even worse if you replace $\Bbb E$ with a median operator or whatever you prefer, since again you won't be able to see it as an inner product any more.
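A tiny numeric check shows that the two candidate definitions really do differ. The three-point data set here is hypothetical, chosen only so that the medians and means are far apart, and it is treated as an equally weighted distribution.

```python
from statistics import mean, median

# Hypothetical skewed toy data, treated as an equally weighted distribution.
xs = [1, 2, 10]
ys = [1, 3, 20]

m_x, m_y = median(xs), median(ys)
e_xy = mean(x * y for x, y in zip(xs, ys))

# First candidate: E[XY] - m_X m_Y
def1 = e_xy - m_x * m_y
# Second candidate: E[(X - m_X)(Y - m_Y)]
def2 = mean((x - m_x) * (y - m_y) for x, y in zip(xs, ys))
# Expanded form of the second candidate, which brings the means back in.
expanded = e_xy - m_x * mean(ys) - mean(xs) * m_y + m_x * m_y

print(def1, def2, expanded)
```

For this data the first definition gives 63 and the second gives 46, and the expanded form agrees with the second up to rounding.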
Finally, means & variances just arise more naturally when you do statistical theory:
Having said that, other measures of central tendency can have some nice properties: $\Bbb E|X-a|^p$ is minimized over $a$ at $a=\mu$ when $p=2$, but at $a=m$ when $p=1$.
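This minimization property is easy to check numerically. The sketch below does a brute-force grid search over $a$ for a hypothetical skewed sample, treated as an equally weighted distribution; the data and grid spacing are my own choices for illustration.

```python
from statistics import mean, median

# Hypothetical skewed sample, treated as an equally weighted distribution.
data = [1, 2, 3, 4, 100]

def expected_loss(a, p):
    """E|X - a|^p under the empirical distribution of data."""
    return sum(abs(x - a) ** p for x in data) / len(data)

# Candidate values of a on a grid from 0.0 to 100.0 in steps of 0.1.
grid = [i / 10 for i in range(1001)]

best_p2 = min(grid, key=lambda a: expected_loss(a, 2))
best_p1 = min(grid, key=lambda a: expected_loss(a, 1))

print(f"p=2 minimizer: {best_p2}, mean:   {mean(data)}")
print(f"p=1 minimizer: {best_p1}, median: {median(data)}")
```

With an odd number of distinct points the $p=1$ minimizer is unique; with an even number, any value between the two middle points would minimize the loss equally well.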