Sequences and Series – Convergence of np(n) Where p(n) = Sum(p(j)/j)

probability, sequences-and-series, stochastic-processes

Some years ago I was interested in the following Markov chain
whose state space is the positive integers. The chain begins at state "1",
and from state "n" the chain next jumps to a state uniformly
selected from {n+1,n+2,…,2n}.
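
For readers who want to experiment, here is a minimal Python sketch of one trajectory (the helper `simulate_chain` and its parameters are illustrative labels of mine, not part of the original problem):

```python
import random

def simulate_chain(max_state, seed=None):
    """One trajectory of the chain: start at 1, and from state n jump to a
    state chosen uniformly from {n+1, n+2, ..., 2n}, until max_state is passed."""
    rng = random.Random(seed)
    path = [1]
    while path[-1] <= max_state:
        n = path[-1]
        path.append(rng.randint(n + 1, 2 * n))  # randint is inclusive on both ends
    return path

print(simulate_chain(10**6, seed=0))
```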

As time goes on, this chain goes to infinity, with occasional
large jumps. In any case, the chain is quite unlikely to hit any
particular large $n$.

If you define $p(n)$ to be the probability that this chain
visits state "n", then $p(n)$ goes to zero like $c/n$ for some
constant $c$. In fact,

$$ np(n) \to c = {1\over 2\log(2)-1} = 2.588699\dots \tag1$$

In order to prove this convergence, I recast it as an analytic
problem. Using the Markov property, you can see that the sequence
satisfies:

$$ p(1)=1\quad\mbox{ and }\quad p(n)=\sum_{j=\lceil n/2\rceil}^{n-1} {p(j)\over j}\mbox{ for }n>1. \tag2$$
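
For experimentation, here is a quick numerical sanity check of (1) against the recurrence (2); the prefix-sum table $S(k)=\sum_{j\le k} p(j)/j$ is merely an implementation convenience of mine, not part of the argument:

```python
import math

def p_values(N):
    """p(1) = 1 and p(n) = sum_{j=ceil(n/2)}^{n-1} p(j)/j for n > 1,
    computed in O(N) time via the prefix sums S[k] = sum_{j<=k} p(j)/j."""
    p = [0.0] * (N + 1)   # 1-indexed; p[0] is unused
    S = [0.0] * (N + 1)
    p[1] = S[1] = 1.0
    for n in range(2, N + 1):
        p[n] = S[n - 1] - S[(n + 1) // 2 - 1]   # ceil(n/2) == (n+1)//2
        S[n] = S[n - 1] + p[n] / n
    return p

c = 1 / (2 * math.log(2) - 1)          # = 2.588699...
p = p_values(10**6)
for n in (10**2, 10**4, 10**6):
    print(n, n * p[n], n * p[n] - c)   # n*p(n) stays near c
```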

For some weeks, using generating functions etc. I tried and failed to find
an analytic proof of the convergence in (1). Finally, at a conference
in 2003 Tom Mountford showed me a (non-trivial) probabilistic proof.

So the result is true, but since then I've
continued to wonder if I missed something obvious. Perhaps there is
a standard technique for showing that (2) implies (1).

Question: Is there a direct (short?, analytic?) proof of (1)?

Perhaps someone who understands sequences better than I do could take a shot at this.

Update: I'm digging through my old notes on this. I now remember that I had a proof (using generating functions) that if $np(n)$ converges, then the limit is ${1\over 2\log(2)-1}$. It was the convergence that eluded me.

I also found some curiosities like: $\sum_{n=1}^\infty {p(n)\over n(2n+1)}={1\over 2}.$
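
(A numerical check of this identity, reusing the table `p` computed after (2); since $p(n)=O(1/n)$, the truncation tail is only $O(1/N^2)$:)

```python
# Partial sum of p(n) / (n(2n+1)); should be 0.5 up to the truncation tail.
print(sum(p[n] / (n * (2 * n + 1)) for n in range(1, len(p))))
```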

Another update: Here is the conditional result mentioned above.

As in Qiaochu's answer, define $Q$ to be the generating function of $p(n)/n$, that is,
$Q(t)=\sum_{n=1}^\infty {p(n)\over n} t^n$ for $0\leq t<1$.
Differentiating gives
$$Q^\prime(t)=1+{Q(t)-Q(t^2)\over 1-t}.$$
This is slightly different from Qiaochu's expression because
$p(n)\neq \sum_{j=\lceil n/2\rceil}^{n-1} {p(j)\over j}$ when $n=1$,
so that $p(1)$ has to be treated separately.
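
One can sanity-check this differential relation on truncated series (again reusing `p`; the helpers `Q_trunc` and `dQ_trunc` are ad hoc, and truncation is harmless for $t$ away from 1):

```python
ptab = p[:400]   # 400 terms are plenty at t = 0.7

def Q_trunc(t):
    """Truncated Q(t) = sum_{n>=1} p(n)/n * t^n."""
    return sum(ptab[n] / n * t**n for n in range(1, len(ptab)))

def dQ_trunc(t):
    """Truncated termwise derivative Q'(t) = sum_{n>=1} p(n) * t^(n-1)."""
    return sum(ptab[n] * t**(n - 1) for n in range(1, len(ptab)))

t = 0.7
print(dQ_trunc(t), 1 + (Q_trunc(t) - Q_trunc(t * t)) / (1 - t))  # should agree closely
```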

Differentiating again and multiplying by $1-t$, we get
$$(1-t)Q^{\prime\prime}(t)=-1+2\left[Q^\prime(t)-t Q^\prime(t^2)\right],$$
that is,
$$(1-t)\sum_{j=0}^\infty (j+1) p(j+2) t^j = -1+2\left[\sum_{j=1}^\infty (jp(j)) {t^{j-1}-t^{2j-1}\over j}\right].$$

Assume that $\lim_n np(n)=c$ exists. Letting $t\to 1$ above, the left-hand side tends to $c$ (an Abelian argument: if $a_j\to c$ then $(1-t)\sum_j a_j t^j\to c$),
while the right-hand side tends to $-1+2c\log(2)$, and hence $c={1\over 2\log(2)-1}$.

Note: $\sum_{j=1}^\infty {t^j-t^{2j}\over j}=\log(1+t)$, so the bracketed sum above tends to $c\log(2)$ as $t\to1$.
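
(Numerically, with the table `p` from above, both sides of the displayed identity do approach $c\approx 2.5887$ as $t\to 1$:)

```python
import math

c = 1 / (2 * math.log(2) - 1)
for t in (0.9, 0.99, 0.999):
    lhs = (1 - t) * sum((j + 1) * p[j + 2] * t**j for j in range(len(p) - 2))
    rhs = -1 + 2 * sum(p[j] * (t**(j - 1) - t**(2 * j - 1)) for j in range(1, len(p)))
    print(t, lhs, rhs, c)   # lhs equals rhs; both drift toward c
```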

New update: (Sept. 2)

Here's an alternative proof of the conditional result that my colleague Terry Gannon
showed me in 2003.

Start with the sum $\sum_{n=2}^{2N}\ p(n)$, substitute the recurrence (2) for each term,
exchange the order of summation over $j$ and $n$, and rearrange to establish the identity:

$${1\over 2}=\sum_{j=N+1}^{2N} {j-N\over j}\ {p(j)}.$$

If $jp(j)\to c$, then
$1/2=\lim_{N\to\infty} \sum_{j=N+1}^{2N} {j-N\over j^2}\ c=(\log(2)-1/2)\ c$ (the sum is a Riemann sum for $\int_1^2 {x-1\over x^2}\,dx=\log(2)-1/2$), so that
$c={1\over 2\log(2)-1}$.
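
(Again numerically, reusing `p`: the identity holds for every $N$, and the limiting value matches:)

```python
import math

def gannon_sum(N):
    """sum_{j=N+1}^{2N} (j-N)/j * p(j); equals 1/2 exactly for every N."""
    return sum((j - N) / j * p[j] for j in range(N + 1, 2 * N + 1))

c = 1 / (2 * math.log(2) - 1)
print(gannon_sum(10), gannon_sum(10**5))   # both 0.5 up to rounding
print((math.log(2) - 0.5) * c)             # the limit if jp(j) -> c, also 0.5
```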

New update: (Sept. 8) Despite the nice answers and interesting discussion below, I am still holding out for a (nice? short?) analytic proof of convergence. Basic Tauberian theory is allowed 🙂

New update: (Sept 13) I have posted a sketch of the probabilistic proof of convergence under "A fun and frustrating recurrence sequence" in the "Publications" section of my homepage.

Final Update: (Sept 15th) The deadline is approaching, so I have decided to award the bounty to T..
Modulo the details(!), it seems that the probabilistic approach is the most likely to lead to a proof.

My sincere thanks to everyone who worked on the problem, including those who tried it but
didn't post anything.

In a sense, I did get an answer to my question: there doesn't seem to be an easy or standard
proof that handles this particular sequence.

Best Answer

Update: the following probabilistic argument I had posted earlier shows only that $p(1) + p(2) + \dots + p(n) = (c + o(1)) \log(n)$ and not, as originally claimed, the convergence $np(n) \to c$. Until a complete proof is available [edit: George has provided one in another answer] it is not clear whether $np(n)$ converges or has some oscillation, and at the moment there is evidence in both directions. Log-periodic or other slow oscillation is known to occur in some problems where the recursion accesses many previous terms. Actually, everything I can calculate about $np(n)$ is consistent with, and in some ways suggestive of, log-periodic fluctuations, with convergence being the special case where the bounds could somehow be strengthened and the fluctuation scale thus squeezed down to zero.


$p(n) \sim c/n$ is [edit: only on average] equivalent to $p(1) + p(2) + \dots + p(n)$ being asymptotic to $c \log(n)$. The sum up to $p(n)$ is the expected time the walk spends in the interval $[1,n]$. For this quantity there is a simple probabilistic argument that explains (and can rigorously demonstrate) the asymptotics.

This Markov chain is a discrete approximation to a log-normal random walk. If $X$ is the position of the particle, $\log X$ behaves like a simple random walk with steps $\mu \pm \sigma$ where $\mu = 2 \log 2 - 1 = 1/c$ and $\sigma^2 = (1- \mu)^2/2$. This is true because the Markov chain is bounded between two easily analyzed random walks with continuous steps.

(Let $X$ be the position of the particle and $n$ the number of steps; the walk starts at $X=1$, $n=1$.)

Lower bound walk $L$: at each step, multiply $X$ by a uniform random number in $[1,2]$ and replace $n$ by $n+1$. $\log L$ increases, on average, by $\int_1^2 \log(t)\,dt = 2 \log(2) - 1$ at each step.

Upper bound walk $U$: at each step, jump from $X$ to a uniform random number in $[X+1,2X+1]$ and replace $n$ by $n+1$.

$L$ and $U$ have means and variances that are the same within $O(1/n)$, where the $O()$ constants can be made explicit. Steps of $L$ are i.i.d., and steps of $U$ are independent and asymptotically identically distributed. Thus, the central limit theorem shows that $\log X$ after $n$ steps is approximately Gaussian with mean $n\mu + O(\log n)$ and variance $n\sigma^2 + O(\log n)$.

The number of steps for the particle to escape the interval $[1,t]$ is therefore $(\log t)/\mu$ with fluctuations of size $A \sqrt{\log t}$ having probability that decays rapidly in $A$ (bounded by $|A|^p \exp(-qA^2)$ for suitable constants). Thus, the sum $p(1) + p(2) + \cdots + p(n)$ is asymptotically equivalent to $(\log n)/(2\log(2)-1)$.
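
As a crude illustration of this last claim, one can estimate the expected time in $[1,n]$ by direct simulation and compare with $(\log n)/(2\log(2)-1)$; the helper below is a sketch of mine, not part of the original argument:

```python
import math
import random

def mean_time_in_interval(n, trials=2000, seed=1):
    """Monte Carlo estimate of the expected number of states <= n the chain
    visits, i.e. p(1) + ... + p(n); predicted to be ~ log(n)/(2 log 2 - 1)."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        x = 1
        while x <= n:
            total += 1                     # x is a visited state in [1, n]
            x = rng.randint(x + 1, 2 * x)  # jump uniformly into {x+1, ..., 2x}
    return total / trials

n = 10**6
print(mean_time_in_interval(n), math.log(n) / (2 * math.log(2) - 1))
```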

Maybe this is equivalent to the 2003 argument from the conference. If the goal is to get a proof from the generating function, it suggests that dividing by $(1-x)$ may be useful for smoothing the $p(n)$'s.
