How Michaud Resampling Enhances Mean-Variance Optimization Techniques

Tags: error-estimation, mean, resampling, variance

Michaud Resampling claims to reduce estimation error through the following process:

Step 1. Sample a mean vector and covariance matrix of returns from a distribution of both, centered at the original (point-estimate) values normally used in MV optimization.

Step 2. Calculate an MV efficient frontier based on these sampled risk and return estimates.

Step 3. Repeat steps 1 and 2 (until enough observations are available for convergence in step 4).

Step 4. Average the portfolio weights from step 2 to form the RE optimal portfolio.
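The four steps above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the asset count, means, covariance, sample size, and the simple unconstrained mean-variance rule `mv_weights` are all hypothetical (the paper itself uses constrained frontier optimization, not a closed form).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical point estimates for 3 assets (annualized).
mu = np.array([0.08, 0.10, 0.12])            # mean returns
sigma = np.array([[0.04, 0.01, 0.00],
                  [0.01, 0.09, 0.02],
                  [0.00, 0.02, 0.16]])        # covariance matrix
n_obs = 120                                   # assumed sample size behind the estimates
n_resamples = 1000

def mv_weights(mu, sigma, risk_aversion=4.0):
    """Unconstrained mean-variance weights w ∝ Σ⁻¹μ, normalized to sum to 1."""
    w = np.linalg.solve(sigma, mu) / risk_aversion
    return w / w.sum()

weights = []
for _ in range(n_resamples):
    # Step 1: simulate a return history from the point estimates, then re-estimate.
    returns = rng.multivariate_normal(mu, sigma, size=n_obs)
    mu_s = returns.mean(axis=0)
    sigma_s = np.cov(returns, rowvar=False)
    # Step 2: optimize on the sampled estimates.
    weights.append(mv_weights(mu_s, sigma_s))
    # Step 3: repeat.

# Step 4: average the resampled weights to form the RE portfolio.
w_resampled = np.mean(weights, axis=0)
w_point = mv_weights(mu, sigma)
print("point-estimate weights:", np.round(w_point, 3))
print("resampled weights:     ", np.round(w_resampled, 3))
```

The averaged weights generally do not coincide with the weights computed from the point estimates, which is the puzzle discussed below.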

Taken directly from the following paper: Michaud, R. O., & Michaud, R. (2007). Estimation error and portfolio optimization: A resampling solution. SSRN Electronic Journal.

I understand the process, but I have trouble understanding why this would lead to a different result than simply using the point estimates for mean and variance under regular mean-variance optimization. Let me elaborate as to why.

I would assume that the more observations that are taken using this method, the more accurate the result should be. I know that the convention is to take at least thousands of observations (steps 1 & 2) before proceeding to average all of these observations to form the optimal portfolio (step 4). What I cannot seem to understand is this: if we are taking these observations from a probability distribution that uses our point estimates as its mean, wouldn't the average of all those observations simply be the point estimate itself?

As an example, traditional MVO would use a point estimate for the mean return. If we were to build a distribution centered at that point estimate and then take a large number of random observations from it, wouldn't averaging them together simply yield the original point estimate, since it was the mean in the first place? The same logic would apply to the point estimate of variance.
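The intuition in the paragraph above is in fact correct for the parameters themselves: averaging draws from a distribution centered at the point estimate does recover the point estimate, up to simulation noise. A minimal check with made-up numbers:

```python
import numpy as np

rng = np.random.default_rng(2)

mu = 0.08  # hypothetical point estimate of the mean return
samples = rng.normal(loc=mu, scale=0.02, size=100_000)

# Averaging the sampled *parameter* values recovers the point estimate...
print(round(samples.mean(), 3))  # ≈ 0.08
# ...but resampled optimization averages the *optimized weights*, not the parameters.
```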

I am sure that I am missing something here and my guess is that the answer lies somewhere in the mathematics of probability distributions and how the two separate distributions (for mean and variance) interact with each other. But my math skills aren't good enough to dig for the answer myself so I was hoping one of the fine members of this community could explain it to me.

Best Answer

You are not using the sampled values at each iteration to estimate the mean of the distribution; you are using them to calculate a sample of the optimal portfolio weights. At the end of, say, $N$ iterations, you have $N$ samples of optimal portfolio weights, and you form your optimal portfolio by averaging them. Each sample optimal portfolio weight vector is a nonlinear transformation of the sampled parameters, and as a consequence, the averaged sample optimal portfolio weights will (almost certainly) be different from the ones you would have calculated using the original estimated parameter values.
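The nonlinearity can be seen even in one dimension. With a single risky asset, the unconstrained optimal weight is $w = \mu / (\lambda \sigma^2)$, which is nonlinear in $\sigma^2$, so by Jensen's inequality the average of $w$ over sampled variances differs from $w$ at the point estimate. A small demonstration (the numbers are made up; the chi-square model is just the standard sampling distribution of a sample variance under normality):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical point estimates for a single risky asset.
mu, var = 0.08, 0.04
risk_aversion = 4.0

# Optimal weight is a *nonlinear* function of the parameters: w = mu / (lambda * var).
w_point = mu / (risk_aversion * var)

# Sample the variance estimate around its point value (sample variance of
# n normal observations is var * chi2(n-1) / (n-1)) and average the weights.
n = 60
var_samples = var * rng.chisquare(n - 1, size=100_000) / (n - 1)
w_avg = np.mean(mu / (risk_aversion * var_samples))

# E[1/v] > 1/E[v], so the averaged weight is biased upward relative to w_point.
print(w_point, w_avg)
```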
