To be somewhat nitpicky, I would not quite say that outliers, heteroscedasticity, and non-normality don't matter with robust regression methods. Rather, I would say that robust methods are less likely to be impaired by those conditions; they can still have a negative effect.
The issue of whether the significance of the coefficients or the accuracy of their estimation is more important is really unrelated to robust regression. Which of those matters more to you depends on the questions you are trying to answer, not on the tools you use to answer them. For example, consider a case where you want to test the hypothesis that a given variable is unrelated to the response variable. You wouldn't want the answer you get to that question (either yes or no) to be driven by an outlier, so you would use robust methods to help ensure that your answer is representative of the bulk of your data. Likewise, consider a case where you want to estimate the slope of the relationship between a predictor variable and the response variable as accurately as possible. You wouldn't want the estimated slope to have been driven by an outlier, so again you would use robust regression to protect against that possibility. In short, robust methods diminish the extent to which your results can be influenced by violations of the classical statistical assumptions.
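To make that concrete, here is a minimal sketch (my own illustration, not part of the original answer) of how a single outlier can drag the OLS slope while a robust fit stays near the truth. The data, variable names, and the choice of statsmodels' `RLM` with a Huber norm are all assumptions made for the demonstration:

```python
# One gross outlier pulls OLS; a Huber M-estimator downweights it.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2.0 + 0.5 * x + rng.normal(scale=1.0, size=x.size)
y[-1] += 30.0                      # a single gross outlier in the response

X = sm.add_constant(x)
ols = sm.OLS(y, X).fit()
rob = sm.RLM(y, X, M=sm.robust.norms.HuberT()).fit()

print("OLS slope:  ", ols.params[1])   # pulled toward the outlier
print("Huber slope:", rob.params[1])   # much closer to the true 0.5
```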
I recognize your frustration that you did not get any significant results when you used these methods. There are a couple of possibilities here. It may be that what appeared to be the case before you used robust regression (perhaps the results of a prior OLS analysis) was driven by violations of the OLS assumptions, and the null hypothesis is actually true. The other possibility is that the null hypothesis is false, but when the OLS assumptions do hold, standard methods have more power than robust methods, so a robust fit can narrowly miss a real effect that OLS would detect.
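You can see that power trade-off in a rough simulation (again my own sketch; the sample size, effect size, and use of a Huber M-estimator are arbitrary choices). Under exactly normal errors the Huber fit gives up only a little efficiency, so the gap is typically small:

```python
# Rejection rates for a true nonzero slope under clean normal errors.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
reject_ols = reject_rob = 0
for _ in range(500):
    x = rng.normal(size=40)
    y = 0.4 * x + rng.normal(size=40)          # true slope 0.4, normal noise
    X = sm.add_constant(x)
    reject_ols += abs(sm.OLS(y, X).fit().tvalues[1]) > 1.96
    rob = sm.RLM(y, X, M=sm.robust.norms.HuberT()).fit()
    reject_rob += abs(rob.tvalues[1]) > 1.96

print("power, OLS:  ", reject_ols / 500)
print("power, Huber:", reject_rob / 500)       # usually slightly lower
```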
Because assuming normal errors is effectively the same as assuming that large errors do not occur! The normal distribution has such light tails that errors outside $\pm 3$ standard deviations have very low probability, and errors outside $\pm 6$ standard deviations are effectively impossible. In practice, that assumption is seldom true. When analyzing small, tidy datasets from well-designed experiments, this might not matter much, provided we do a good analysis of residuals. With data of lesser quality, it can matter much more.
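Those tail probabilities are easy to check directly; a quick scipy computation (my own illustration) gives:

```python
# Two-sided normal tail probabilities for the "no large errors" reading.
from scipy.stats import norm

print(2 * norm.sf(3))   # ~2.7e-3: beyond 3 sd is already rare
print(2 * norm.sf(6))   # ~2.0e-9: beyond 6 sd is effectively impossible
```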
When using likelihood-based (or Bayesian) methods, the effect of this normality assumption (as said above, effectively a "no large errors" assumption!) is to make the inference non-robust: the results of the analysis are too heavily influenced by the large errors. This must be so, since assuming "no large errors" forces our methods to interpret the large errors as small errors, and that can only happen by moving the mean value parameter to make all the errors smaller. One way to avoid that is to use so-called "robust methods", see http://web.archive.org/web/20160611192739/http://www.stats.ox.ac.uk/pub/StatMeth/Robust.pdf
But Andrew Gelman will not go for this, since robust methods are usually presented in a highly non-Bayesian way. Using $t$-distributed errors in likelihood/Bayesian models is a different way to obtain robust methods, as the $t$-distribution has heavier tails than the normal and so allows for a larger proportion of large errors. The degrees-of-freedom parameter $\nu$ should be fixed in advance, not estimated from the data, since such estimation will destroy the robustness properties of the method (*) (it is also a very difficult problem: the likelihood function for $\nu$ can be unbounded, leading to very inefficient, even inconsistent, estimators).
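Here is a minimal sketch of what this looks like in practice (my own code, not Gelman's): a simple location-scale model with independent $t$ errors, where $\nu$ is held fixed at a chosen value rather than estimated. The function names and optimizer choice are my own assumptions:

```python
# Maximum likelihood for a location-scale model with t errors, nu FIXED.
import numpy as np
from scipy import optimize, stats

def fit_location_t(y, nu):
    """Maximize the t likelihood over (mu, log_sigma) with nu held fixed."""
    def negloglik(theta):
        mu, log_sigma = theta
        # log-parametrize the scale so it stays positive
        return -np.sum(stats.t.logpdf(y, df=nu, loc=mu, scale=np.exp(log_sigma)))
    res = optimize.minimize(negloglik, x0=[np.median(y), 0.0],
                            method="Nelder-Mead")
    return res.x[0]

rng = np.random.default_rng(1)
y = np.concatenate([rng.normal(0.0, 1.0, 20), [20.0]])  # 20 clean points + outlier
print("normal MLE (the mean):", y.mean())               # dragged by the outlier
print("t(2) MLE of mu:       ", fit_location_t(y, nu=2))  # stays near 0
```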
If, for instance, you think (or are afraid) that as many as 1 in 10 observations might be "large errors" (beyond 3 standard deviations), then you could use a $t$-distribution with 2 degrees of freedom, increasing that number if the believed proportion of large errors is smaller.
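That 1-in-10 figure can be checked numerically (my own illustration): under a $t_\nu$ distribution, the two-sided probability of an error beyond 3 scale units is about 0.10 for $\nu = 2$ and shrinks as $\nu$ grows:

```python
# Two-sided tail probability beyond 3 (in scale units) under t_nu.
from scipy.stats import t

for nu in [2, 3, 5, 10, 30]:
    print(f"nu = {nu:2d}: P(|error| > 3) = {2 * t.sf(3, df=nu):.3f}")
# nu = 2 gives about 0.095, matching the 1-in-10 rule of thumb above.
```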
I should note that what I have said above is for models with independent $t$-distributed errors. There have also been proposals to use the multivariate $t$-distribution (which is not independent) as the error distribution. That proposal is heavily criticized in the paper "The emperor's new clothes: a critique of the multivariate $t$ regression model" by T. S. Breusch, J. C. Robertson and A. H. Welsh, Statistica Neerlandica (1997), Vol. 51, No. 3, pp. 269-286, where they show that the multivariate $t$ error distribution is empirically indistinguishable from the normal. But that criticism does not affect the independent $t$ model.
(*) One reference stating this is Venables & Ripley's Modern Applied Statistics with S (page 110 of the 4th edition).
Best Answer
Consider a set of data with no outlying observations, fitted at a suitable value of the parameters. Now consider moving one observation far into the tail, keeping the parameter values and the remaining data constant.
If the density has thin tails, an observation far away is very unlikely (it has low relative probability given the parameters), so the chance of seeing it there is small, and hence the likelihood would be higher if the parameters were moved substantially to accommodate it (there's a limit to how far, of course, as the more you move the parameters, the less probable the remainder of the data become).
By contrast, a distribution with fat tails doesn't see that observation as unusual at all, and the fitted parameters may hardly need to change in response to it.
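A small numerical version of this thought experiment (my own sketch, with the scale held fixed at 1 for simplicity): move one observation further and further into the tail and watch where the maximum-likelihood location estimate goes under a normal versus a $t_2$ error model.

```python
# Track the ML location estimate as one observation moves into the tail.
import numpy as np
from scipy import optimize, stats

rng = np.random.default_rng(2)
base = rng.normal(0.0, 1.0, 30)          # data with no outlying observations

def mle_location(y, logpdf):
    """ML estimate of the location mu, with the scale fixed at 1."""
    nll = lambda mu: -np.sum(logpdf(y - mu))
    return optimize.minimize_scalar(nll, bounds=(-50, 50), method="bounded").x

for far in [3, 10, 30, 100]:
    y = np.append(base, far)             # one point moved into the tail
    mu_norm = mle_location(y, stats.norm.logpdf)
    mu_t = mle_location(y, lambda e: stats.t.logpdf(e, df=2))
    print(f"outlier at {far:4d}: normal mu = {mu_norm:6.2f}, t(2) mu = {mu_t:5.2f}")
```

The normal fit chases the wandering observation (its estimate is just the sample mean, which grows with it), while the $t_2$ fit barely moves.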