[Math] Mertens’ function in time $O(\sqrt x)$

Tags: algorithms, computational-number-theory, nt.number-theory

This MathOverflow question seems to indicate that the state of the art in computing
$$
M(x)=\sum_{n\le x}\mu(n)
$$
takes time $\Theta(n^{2/3}(\log\log n)^{1/3}),$ which matches my understanding. Recently I came across a paper [1] which gives, almost as an afterthought, an algorithm for computing $M(x)$ in $O(\sqrt x)$ time (in section 3.2).

The algorithm itself seems to be correct, being derived in a straightforward way from the identity
$$
M(x)=1-\sum_{d\ge2}M(x/d).
$$
But the claim that it runs in time $O(\sqrt x),$ or even $O(x^{1/2+\varepsilon}),$ seems unusual enough for me to ask for verification.
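For concreteness, here is a minimal memoized Python sketch of that recursion (my own illustration, not the paper's implementation). Grouping equal values of $\lfloor x/d\rfloor$ into blocks, this evaluation takes roughly $O(x^{3/4})$ operations, not $O(\sqrt x)$:

```python
def mertens(x):
    """Mertens function M(x) via the identity M(n) = 1 - sum_{d>=2} M(n // d).

    Memoized over the distinct values of n // d; consecutive d with the
    same quotient are grouped into one block.
    """
    cache = {}

    def M(n):
        if n <= 0:
            return 0
        if n == 1:
            return 1
        if n in cache:
            return cache[n]
        res = 1
        d = 2
        while d <= n:
            q = n // d
            d2 = n // q          # largest d' with n // d' == q
            res -= (d2 - d + 1) * M(q)
            d = d2 + 1
        cache[n] = res
        return res

    return M(x)
```

As a sanity check, `mertens(10)` returns $-1$ and `mertens(100)` returns $1$, matching the tabulated values of $M(x)$.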

First, this would be a major breakthrough in computing $M(x),$ enough that I would think it would merit more mention than a substep of another algorithm. At the least, I would expect a mention that previous algorithms were slower.

Second, the algorithm is very simple, so that if the time is as indicated the implied constant should be low and the algorithm should be practical. In any case it is not hard to program.

Third, on coding the algorithm I found it to be very slow. In fact, it was slower for the values tested than `sum(n=1, 10^7, moebius(n))` in GP, which involves factoring each number up to $10^7.$ The time needed to factor a number of that size is about $\sqrt x/\log x$ on average, so that's a $\Theta(x^{3/2}/\log x)$ algorithm (admittedly, well-optimized) beating a $\Theta(x^{1/2})$ algorithm. The constants would have to differ by a factor of $5\cdot10^6$ for that to happen, and there's nothing in the algorithm to suggest anything that bad.

Of course I may have miscoded it (though I obtained the correct answer), or there may be reasons why the constant factors for this algorithm are unusually high. Either way, it seemed suspicious enough to bring up here in the hope of being enlightened.

Of course even if the result claimed is not correct this is no mark against the author, as the paper is only a preprint and the claim is peripheral in any case.

[1]: Jakub Pawlewicz, Counting square-free numbers (2011), arXiv:1107.4890.

Best Answer

This is really a response to Dror Speiser's comment (it is too long to give as a comment), and gives an analytic $O(x^{1/3+\epsilon})$-time and $O(x^{1/6+\epsilon})$-space algorithm to count the squarefree numbers up to $x$ (thus improving on the complexity of the algorithm given in the Jakub Pawlewicz paper cited in the question).

If we proceed exactly as in my other answer, but for $$\sum_{n\leq x} |\mu(n)|$$ instead of $$\sum_{n\leq x} \mu(n),$$ using the fact that $$\frac{\zeta(s)}{\zeta(2s)} = \sum_{n=1}^\infty |\mu(n)| n^{-s},$$ the time complexity will be $O(x^{1/2+\epsilon})$. There is, however, a way to combine the analytic method (essentially the same as in Lagarias and Odlyzko's 1987 paper "Computing $\pi(x)$: an analytic method", http://www.dtc.umn.edu/~odlyzko/doc/arch/analytic.pi.of.x.pdf) with the elementary identity $$ S(x)=\sum_{d=1}^{\lfloor \sqrt x\rfloor} \mu(d) \left\lfloor \frac x {d^2} \right\rfloor $$ that is Pawlewicz's starting point.

Let $\Phi$ be defined as in my other answer, i.e. let $\Phi(x)$ be a smooth test function such that $\Phi(x)=1$ for $x<0$ and $\Phi(x)=0$ for $x>1$. Consider the identity $$ \sum_{n\leq x} |\mu(n)| =\sum_{n=1}^\infty |\mu(n)| \Phi \left( \frac {n-x} {x^{2/3}} \right ) - \sum_{x< n < x + x^{2/3} } |\mu(n)| \Phi \left(\frac {n-x} {x^{2/3}} \right).$$

The first sum can be estimated by a complex integral of length $x^{1/3+\epsilon}$, and by the Odlyzko-Schönhage algorithm it can be calculated in $O(x^{1/3+\epsilon})$ time (and $O(x^{1/6+\epsilon})$ space). The remaining sum has length $O(x^{2/3})$, so at first glance it might not seem so simple to treat. I claim, however, that it too can be calculated in $O(x^{1/3+\epsilon})$ time. This is where the elementary identity is handy. Let $\Psi(x)=0$ for $x<0$ and $\Psi(x)=\Phi(x)$ for $x \geq 0$ (this will have a discontinuity at $x=0$). By a similar elementary identity we obtain

$$\sum_{x< n < x + x^{2/3} } |\mu(n)| \Phi \left(\frac {n-x} {x^{2/3}} \right) = \sum_{d=1}^{\lfloor \sqrt {x+x^{2/3}}\rfloor} \mu(d) \sum_{k=1}^\infty \Psi \left(\frac {d^2k-x} {x^{2/3}} \right). $$

For any $d$ the inner sum can be calculated fast (certainly in $O(x^\epsilon)$ time), either by noticing that it is empty or contains just one element, or by the Poisson summation formula. The trick now is to see that there are only $O(x^{1/3})$ integers $d$ for which the inner sum (in $k$) is non-empty: $d$ must either be $O(x^{1/3})$ or belong to an interval $(\sqrt{ x/k},\sqrt{(x+x^{2/3})/k})$ for some $k \leq x^{1/3}$, and there are only $O(x^{1/3})$ such $d$.

Now if we can calculate the values of $\mu(d)$ fast (simple sieving seems more difficult in this case) we obtain the desired algorithm. Determining $\mu(d)$ basically comes down to factoring the number. However, it is sufficient to determine the prime factors less than $d^{1/3}$: if we know that all prime factors are greater than $d^{1/3}$, then $d$ has either one or two prime factors. We can determine quickly whether $d$ is prime by the Agrawal-Kayal-Saxena algorithm; if so, $\mu(d)=-1$. If $d$ has two prime factors, then either $\mu(d)=0$, in which case $d$ must be a square (easily checked), or $\mu(d)=1$. Factoring the $O(x^{1/3})$ numbers of size $O(x^{1/2})$ into all prime factors less than $x^{1/6}$ takes $O(x^{1/3+\epsilon})$ time and $O(x^{1/6+\epsilon})$ space by Bernstein's algorithm, http://cr.yp.to/papers/sf.pdf.

I might possibly write up this answer as a proper paper. If anyone has seen these arguments anywhere else, or has references, I would be interested.
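As a sanity check on the elementary identity $S(x)=\sum_{d\le\sqrt x}\mu(d)\lfloor x/d^2\rfloor$ used above, here is a short Python sketch (my own illustration, not the analytic algorithm itself) that counts squarefree numbers up to $x$ by sieving $\mu(d)$ for $d \le \sqrt x$:

```python
import math

def squarefree_count(x):
    """Count squarefree integers <= x via S(x) = sum_{d<=sqrt x} mu(d) * floor(x/d^2).

    mu(d) for d <= sqrt(x) is computed with a linear sieve, so the whole
    routine runs in O(sqrt x) time and space.
    """
    r = math.isqrt(x)
    mu = [1] * (r + 1)
    is_comp = [False] * (r + 1)
    primes = []
    for i in range(2, r + 1):
        if not is_comp[i]:
            primes.append(i)
            mu[i] = -1
        for p in primes:
            if i * p > r:
                break
            is_comp[i * p] = True
            if i % p == 0:
                mu[i * p] = 0      # p^2 divides i*p
                break
            mu[i * p] = -mu[i]
    return sum(mu[d] * (x // (d * d)) for d in range(1, r + 1))
```

For instance, `squarefree_count(100)` returns $61$, matching the known count of squarefree numbers up to $100$.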
