[Math] Why does randomness work in numerical algorithms

na.numerical-analysis

There are successful numerical algorithms that involve a sequence of random numbers, like Monte Carlo methods or simulated annealing. I can follow the proofs of their convergence, and I actually see them work. However, I sometimes find it paradoxical that it is useful to deliberately add an unpredictable factor into a process when we have the opportunity to keep full control over it. Could you give an intuitive explanation for this?

Best Answer

If one had infinite computing capacity, then you would be right that randomness would not be useful (unless an adversary also had access to randomness; see below), and it would always be better to maintain full control of one's parameters. ("God does not need to play dice.") However, our computing capacity is limited, and randomness offers a way to stretch that capacity further (though it is an important open problem to quantify exactly how much randomness actually stretches our capacity; see the discussion of P=BPP in other comments).

Let's give a simple example. Suppose one is playing the following game with a computer. This computer uses some (deterministic) algorithm to generate 1000 "bad" numbers from 1 to 1 million. Meanwhile, you pick your own number from 1 to 1 million. If you pick one of the bad numbers, you lose; otherwise, you win. You get to see the computer's algorithm in advance.

With infinite computing capacity, you have a strategy which is 100% guaranteed to succeed: you run the computer's algorithm, find all the bad numbers, and pick a number not chosen by the computer.

But there is a much lazier strategy that requires almost no computing power: just pick a number randomly, and you have a 99.9% chance of winning.
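The game above is easy to simulate. Here is a minimal sketch; the `bad_numbers` generator is a made-up stand-in for the computer's deterministic algorithm (any fixed-seed procedure would do), and the point is only that blind random guessing wins about 99.9% of the time:

```python
import random

def bad_numbers(seed=12345):
    """A stand-in for the computer's deterministic algorithm:
    with a fixed seed it always produces the same 1000 'bad'
    numbers from 1 to 1,000,000."""
    rng = random.Random(seed)
    return set(rng.sample(range(1, 1_000_001), 1000))

bad = bad_numbers()

# Lazy randomized strategy: just guess uniformly at random.
# A guess loses only if it lands on one of the 1000 bad numbers,
# so P(win) = 1 - 1000/1_000_000 = 99.9%.
trials = 100_000
wins = sum(random.randint(1, 1_000_000) not in bad
           for _ in range(trials))
print(f"empirical win rate: {wins / trials:.4f}")
```

Note that the random player never inspects `bad` to choose a guess; the membership test is only used to score the game afterwards. The "infinite computing capacity" strategy would instead enumerate the whole set up front.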

[Note also that if the computer used a randomised algorithm instead of a deterministic one, then there is no longer any 100% winning strategy for you, even with infinite computing power, though the random strategy works just fine. This is another indication of the power of randomness.]

Many randomised algorithms work, roughly speaking, by reducing the given problem to some version of the above game. For instance, Monte Carlo integration using a set of points to sample the function might give satisfactory levels of accuracy for all but a tiny fraction of the possible choices of sample points. With infinite computing capacity, one could locate all the bad configurations of sample points and then find a configuration which is guaranteed to give a good level of accuracy, but this is a lot of work - more work, in fact, than just doing classical numerical integration. Instead, one takes the lazy way out and picks the sample points randomly. (Well, in practice, one picks the points pseudorandomly instead of randomly, because true randomness is very hard to capture by a computer; this is a subtle issue - again related to P=BPP - that I don't want to get into here.)
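As a concrete illustration of that reduction, here is a minimal Monte Carlo integration sketch (my own example, not from the answer): the integral of f over [a, b] is estimated by averaging f at n random sample points. Almost every random configuration of points gives roughly the right answer; only a tiny fraction of "bad" configurations would not.

```python
import random

def mc_integrate(f, a, b, n=100_000, rng=None):
    """Monte Carlo estimate of the integral of f over [a, b]:
    average f at n uniformly random sample points in [a, b],
    then scale by the interval length (b - a)."""
    rng = rng or random.Random()
    total = sum(f(rng.uniform(a, b)) for _ in range(n))
    return (b - a) * total / n

# Example: integrate x^2 over [0, 1]; the exact value is 1/3.
estimate = mc_integrate(lambda x: x * x, 0.0, 1.0,
                        rng=random.Random(0))
print(f"estimate: {estimate:.5f}  (exact: {1/3:.5f})")
```

The error typically shrinks like 1/sqrt(n), independently of the dimension of the domain, which is why this lazy strategy becomes competitive with classical quadrature precisely when the problem is high-dimensional.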
