$\lim_{n \to \infty}\frac{1}{n}\sum_{k = 1}^nf(k)$ where $f(n)$ is the largest prime factor exponent

analytic-number-theoryelementary-number-theorylimitsnumber theoryprime factorization

Let $f(n)$ the be largest exponent among exponents of the prime factor of $n$. E.g. $f(80) = 4$ since $80 = 2^4.5$ and the prime factor of $80$ which has the largest exponent is $2$ which occurs with the exponent $4$. Trivially for all square-free numbers $n, f(n) = 1$. Experimental data for $n \le 3.5 \times 10^9$ shows that.

$$
\mu = \lim_{n \to \infty}\frac{1}{n}\sum_{k = 1}^nf(n) \approx 1.784744
$$

Q 1: Does this limit exist and does it have a closed form?

If all positive integers were square-free, the above mean would have been exactly $1$. But this is not the case because there are non-squarefree numbers. Thus, we can say that the square-free numbers contribute exactly $1$ to the above mean value while all numbers which have prime factors with an exponent $> 1$ contribute the remaining $0.784744$.

Using the natural density of $k$-th power free numbers, I can show that

$$
\mu \ge \sum_{k = 1}^{\infty}k \Big(\frac{1}{\zeta(k+1)} – \frac{1}{\zeta(k)}\Big) \approx 1.70521
$$

Q 2: Is it possible to show a weaker result like $\mu < 2$?

Best Answer

The limit exists, and equals the constant $1.70521...$ by which you've lower bounded it. To show this, we first need a claim which lets us evaluate the sum of $f(k)$.

Claim. Let $s_b(n)$ be the sum of the digits of $n$ when written in base $b$, and let $$g_b(n)=\frac{n-s_b(n)}{b-1}=\sum_{k=1}^\infty \left\lfloor\frac{n}{b^k}\right\rfloor.$$ Then, if $\mu$ is the Möbius function, $$\sum_{k=1}^nf(k)=\sum_{q=2}^n(-\mu(q))g_q(n).$$

Proof. For squarefree $q$, define $\nu_q(n)$ to be the largest power of $q$ that divides $n$. Then, since $$\min(\nu_{p_1}(k),\dots,\nu_{p_s}(k))=\nu_{p_1\cdots p_s}(k),$$ the inclusion-exclusion principle gives $$f(k)=\max_{p\text{ prime}}(\nu_p(k))=\sum_{\substack{q\text{ squarefree}\\q\neq 1}}(-\mu(q))\nu_q(k).$$ The sum needs only run up through $k$, since all other terms are $0$ (that is, only finitely many primes need to be considered). Now, the observation that $$g_q(n)=\sum_{k=1}^\infty \left\lfloor\frac{n}{q^k}\right\rfloor=\sum_{k=1}^\infty \sum_{\substack{1\leq m\leq n\\q^k\mid m}}1=\sum_{m=1}^n\nu_q(m)$$ gives the desired result.


We will also need the following technical lemma.

Lemma. There exists some absolute constant $c$ for which, if $n$ and $k$ are positive integers with $n$ sufficiently large and $k\leq n^{0.1}-1$, then $$\left|\sum_{\frac{n}{k+1}<q\leq\frac nk}\frac{\mu(q)s_q(n)}{q-1}\right|\leq \frac{cn}{k(k+1)(\log n)^{1/4}}.$$

Proof. The main idea is that $s_q(n)/(q-1)$ changes relatively slowly in $q$ for $q$ somewhat close to $n$. With this, we'll use the following result of Matomäki and Terävainen (the strength of their result is the fact that $0.55$ is small, while we'll only need that it is some fixed constant less than $1$, but this is the first reference I could find):

Fix $\theta>0.55$. For any $\epsilon>0$, and for $x$ large and $H>x^\theta$, $$\left|\sum_{x\leq n\leq x+H}\mu(x)\right|=O\left(\frac H{(\log x)^{1/3-\epsilon}}\right).$$

This allows us to save a log factor on sums of the Möbius function multiplied by some constant. If we split up our sum carefully, we can save this log factor overall.

Note that, since $k\leq \sqrt n-1$, all $q$ in the sum are between $\sqrt n$ and $n$, so $n$ has length $2$ when written in base $q$. The first "digit" is $k$, and the second is $n-kq$. Let $L=\frac{n}{k(k+1)}$, and fix $N=\lfloor n^{0.2}\rfloor $. Now, write $$\sum_{\frac{n}{k+1}<q\leq\frac nk}\frac{\mu(q)s_q(n)}{q-1}=\sum_{t=1}^N\sum_{\frac{n}{k+1}+\frac{(t-1)L}N<q\leq \frac n{k+1}+\frac{tL}N}\frac{\mu(q)(k+n-kq)}{q-1}.$$ We'll bound the inside sum for fixed $t$. Write $q_1=q_1(t)=\frac{n}{k+1}+\frac{(t-1)L}{N}$ and $q_2=q_2(t)=\frac n{k+1}+\frac{tL}{N}$. Note that, for $q_1<q\leq q_2$, \begin{align*} \left|\frac{k+n-kq_2}{q_2-1}-\frac{k+n-kq}{q-1}\right|&\leq \left|\frac{n}{q_2-1}-\frac n{q-1}\right|\\ &\leq \frac{n(q_2-q_1)}{(q_2-1)(q_1-1)}\leq \frac{n(L/N)}{\left(\frac n{k+1}-1\right)^2}\leq \frac{4(k+1)^2L}{Nn} \end{align*} (the constant $4$ is just for safety; the bound nearly holds if the $4$ is removed, but the $-1$ terms in the denominator mess it up slightly). This means \begin{align*} \left|\left(\sum_{q_1<q\leq q_2}\frac{\mu(q)s_q(n)}{q-1}\right)-\left(\left(\frac{n}{q_2-1}-k\right)\sum_{q_1<q\leq q_2}\mu(q)\right)\right| &\leq \sum_{q_1<q\leq q_2}\frac{|\mu(q)|4(k+1)^2L}{Nn}\\ &\leq \frac{4(k+1)^2L^2}{N^2n}=\frac{4n}{N^2k^2}.\tag{1} \end{align*} Now, since $q_2-q_1=L/N\geq n^{0.6}$ and $n^{0.9}\leq q_1<n$, we can apply the result of Matomäki and Terävainen with $\theta=0.6$ and $\epsilon=1/12$ to get that there exists some constant $c_0$ for which $$\left|\sum_{q_1<q\leq q_2}\mu(q)\right|\leq \frac{c_0(L/N)}{(\log q_1)^{1/4}}.$$ From this, since $q_1\geq n^{0.9}$, there is some constant $c_1$ for which (1) gives $$\left|\sum_{q_1<q\leq q_2}\frac{\mu(q)s_q(n)}{q-1}\right|\leq \frac{4n}{N^2k^2}+\frac{c_1n}{k(k+1)N(\log n)^{1/4}}.$$ Summing over all $N$ values of $t$ gives $$\left|\sum_{\frac{n}{k+1}<q\leq \frac nk}\frac{\mu(q)s_q(n)}{q-1}\right|\leq \frac{4n}{Nk^2}+\frac{c_1n}{k(k+1)(\log n)^{1/4}}.$$ The fact that $4/N\ll (\log n)^{-1/4}$ finishes the proof. $\square$


We now use our claim and lemma to isolate the "essential part" of the sum of $f(k)$. By the definition of $g_q$, $$\left|\left(\sum_{q=2}^n(-\mu(q))g_q(n)\right)-n\left(\sum_{q=2}^n\frac{-\mu(q)}{q-1}\right)\right|\leq \left|\sum_{q=2}^n\frac{\mu(q)s_q(n)}{q-1}\right|.$$ We bound the right side using our lemma. First, since $s_q(n)$ is the sum of at most $\log_qn+1$ digits in base $q$, we have $s_q(n)/(q-1)\leq \log_qn+1$, and so, if $M=\lceil\log^2 n\rceil$, $$\left|\sum_{2\leq q\leq \frac nM}\frac{\mu(q)s_q(n)}{q-1}\right|\leq \sum_{2\leq q\leq \frac nM}\left(\log_qn+1\right)\leq \frac{n(\log_2 n+1)}{M}=O\left(\frac n{\log n}\right).$$ The remaining sum can be written as $$\sum_{k=1}^{M-1}\sum_{\frac n{k+1}<q\leq \frac nk}\frac{\mu(q)s_q(n)}{q-1}\leq \sum_{k=1}^{M-1}\left|\sum_{\frac n{k+1}<q\leq \frac nk}\frac{\mu(q)s_q(n)}{q-1}\right|.$$ Each $1\leq k\leq M-1$ satisfies the conditions of the lemma, so there exists some constant $c$ for which $$\left|\sum_{\frac nM<q\leq n}\frac{\mu(q)s_q(n)}{q-1}\right|\leq \sum_{k=1}^{M-1}\frac{cn}{k(k+1)(\log n)^{1/4}}<\frac{cn}{(\log n)^{1/4}}=O\left(\frac n{(\log n)^{1/4}}\right).$$ So, in total, we have $$\boxed{\frac1n\sum_{k=1}^nf(k)=\left(\sum_{q=2}^n\frac{-\mu(q)}{q-1}\right)+O\left(\log^{-1/4}n\right)}.$$


We now finish by investigating the sum $$r_n=\sum_{q=2}^n \frac{(-\mu(q))}{q-1}.$$ Define for $k\geq 1$ the sum $t_{n,k}=\sum_{q=2}^n\frac{(-\mu(q))}{q^k}$; then $$r_n=\sum_{k=1}^\infty t_{n,k}.$$ It is known that $$1-t_{n,k}=\sum_{q=1}^n\frac{\mu(q)}{q^k}$$ converges as $n\to\infty$ (even for $k=1$) to $1/\zeta(k)$ (or $0$ if $k=1$), with $$\left|t_{n,k}-\left(1-\frac1{\zeta(k)}\right)\right|\leq \sum_{q=n+1}^\infty \frac{1}{q^k}\leq \sum_{q=n+1}^\infty \frac{1}{(q-1)^{k-1}}-\frac1{q^{k-1}}=\frac1{n^{k-1}}$$ for $k\geq 2$, and so $$\left|r_n-\left(t_{n,1}+\sum_{k=2}^\infty\left(1-\frac1{\zeta(k)}\right)\right)\right|\leq \sum_{k=2}^\infty \frac1{n^{k-1}}=\frac1{n-1}.$$ So, since $t_{n,1}\to 1$ as $n\to\infty$, this shows that $$r_n\to 1+\sum_{k=2}^\infty \left(1-\frac1{\zeta(k)}\right)\approx 1.70521.$$


Letting $\mu_0$ be this sum, $\mu_0$ is the limit of $r_n$, so we conclude that $$\frac1n\sum_{k=1}^n f(k)\to \mu_0.$$

Related Question