Solved – Halton sequence vs Sobol’ sequence

quasi-monte-carlosamplingsmall-sample

From an answer in a previous question, I was directed toward the Halton sequence, for creating a set of vectors that covered a uniform sample space fairly evenly. But the wikipedia page mentions that higher primes especially are often highly correlated early in the series. This seems to be the case for any pair of high primes with a relatively short sample size – and even when the variables aren't correlated, the sample space isn't evenly sampled, rather there are diagonal bands of high sample density across the space.

Because I'm using vectors of length 6 or more, I will inevitably have to use some primes for which this is a problem (although not as bad as in the example aboved), and some pairs of variables will be non-uniformly sampled in their sample plane. Using the Sobol' sequence to generate a similar set seems to me (just from looking at graphs) to generate samples between pairs of variables that are much more evenly distributed, even for relatively small numbers of samples. This seems much more useful, and so I'm wondering when a Halton sequence would be more beneficial? Or is it just the Halton sequence is easier to compute?

Note: discussion of other multi-dimensional low-discrepancy sequences is also welcome.

Best Answer

Yes, Halton is easier to calculate, but it has the problems you mentioned. Halton can be improved by the leaped Halton method, but it will be not really better than Sobol. For high dimensions (like $d>10$) and moderate count (like $N$ around 500) all methods will run into problems, e.g. some 2D projections in Sobol will look strange, showing strong patterns, not diagonal but more like a chessboard! One way to improve is randomization and e.g. the so-called tent transformation.

Related Question