Convergence of neighboring points implies convergence of the sequence

Tags: convergence-divergence, limits, real-analysis

I am reading the paper "Computing a Trust Region Step", where Lemma 4.10 states the following:

Let $x^*$ be an isolated limit point of a sequence $(x_k)$ in $\mathbb{R}^n$. If $(x_k)$ does not converge, then there is a subsequence $(x_{l_j})$ which converges to $x^*$ and an $\epsilon > 0$ such that $\|x_{l_j+1}-x_{l_j}\| \geq \epsilon$ for all $j$. (The contrapositive of this statement says that convergence of neighboring points implies convergence of the sequence.)
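
(For intuition, here is a toy example I came up with, not from the paper: in $\mathbb{R}$, let $x_k = 1/k$ for even $k$ and $x_k = 1$ for odd $k$. Then $x^* = 0$ is an isolated limit point, the sequence does not converge, and the even-indexed subsequence $l_j = 2j$ satisfies $x_{l_j} = \tfrac{1}{2j} \to 0$ while $\|x_{l_j+1}-x_{l_j}\| = 1 - \tfrac{1}{2j} \geq \tfrac{1}{2}$, so the lemma's conclusion holds with $\epsilon = \tfrac{1}{2}$.)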

My question:
I cannot follow the proof, so I will write out my understanding of the proof given in the paper. Can you please clarify the points where I am confused?

1- $x^*$ is an isolated accumulation point, therefore, there is a ball $B(x^*, \epsilon)$ such that $x^*$ is the unique accumulation point in the ball.

2- $x^*$ is an accumulation point, then there exists a subsequence $(x_{k_j}) \to x^*$.

3- Since $(x_{k_j}) \to x^*$, then there exists $k^*$ such that for all $k_j \geq k^*$ one has $\|x_{k_j}-x^*\| < \epsilon$.

4- For all $k_j \geq k^*$, there exists $r > k_j$ such that $\|x_r -x^*\| \geq \epsilon$. Otherwise $\|x_r -x^*\| < \epsilon$ for all $r > k_j$, which would mean $(x_{k}) \to x^*$, contradicting our assumption.

5- Let $l_j=\max\{l \geq k_j : \|x_i-x^*\| < \epsilon \text{ for } i=k_j, \dots, l \}$ (step 4 guarantees this maximum exists).

6- Therefore, $\|x_{l_j}-x^*\| < \epsilon$ for all $j$, which implies $(x_{l_j}) \to x^*$.

Confusions:

1- In step 1, why do the authors use a closed ball?

2- In step 3, why do the authors use a non-strict inequality ($\leq$) instead of a strict one ($<$)?

3- After step 6, I cannot see why $\|x_{l_j+1}-x^*\| > \epsilon$, since its definition looks the same as that of $x_{l_j}$.

4- I cannot complete the proof from this point on.

Note:

Please refer to my steps and clarify which part I am missing or misunderstanding.

Best Answer

I will refer to your understanding and your questions, but let me share my understanding of the result. It may help to compare our understandings! But, if you want to skip this bit, I understand. I'll deal with your understanding below the horizontal line.

Of course (and you seem to have no problem with this, as you do not question step 2), a limit point of a sequence always has a subsequence converging to it. In this case, however, we have an isolated limit point and a non-convergent sequence, so we can say a little bit more.

It means that there's a ball, centred at our limit point $x^*$, with some radius $\varepsilon > 0$, such that no limit point of the sequence exists in this ball except $x^*$. There can (and will) be infinitely many sequence points inside this ball, and arranging all these points into a subsequence will produce something converging to $x^*$.

Why? Because if infinitely many of them stayed at least some distance $\delta > 0$ away from $x^*$ (where $\delta < \varepsilon$), then they would be contained in the compact set $B[x^*; \varepsilon] \setminus B(x^*; \delta)$, i.e. a closed ball with a concentric open ball punched out (note: I use round brackets for open balls, and square brackets for closed ones). As such, there would be a subsequence converging in this compact no-man's-land to some point other than $x^*$, giving a second limit point inside $B[x^*; \varepsilon]$. That would be a contradiction.
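
Spelled out in symbols (my notation, not the paper's): if we had $\delta \le \|x_{n_i} - x^*\| \le \varepsilon$ for infinitely many indices $n_1 < n_2 < \cdots$, then Bolzano–Weierstrass in the compact annulus would give a further subsequence $x_{n_{i_m}} \to y$ with $$\delta \le \|y - x^*\| \le \varepsilon,$$ so $y \neq x^*$ would be a second limit point of the sequence inside $B[x^*; \varepsilon]$.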

So, we can take the subsequence of points in this ball $B[x^*; \varepsilon]$, and it must converge to $x^*$. But these can't be all, or even all but finitely many, of the points in the sequence $(x_n)$, because otherwise $x_n \to x^*$ (and $x_n$ is assumed not to be convergent). So there must be infinitely many points of the sequence outside the ball $B[x^*; \varepsilon]$ too!

Now, because there are infinitely many points inside the ball and infinitely many outside, the points have to "interlace" in the sequence in some way. Maybe you'll get two points inside the ball, followed by four outside, then one inside, then ten outside, and so on. That is to say, there will be alternating "blocks" of points inside and outside the ball. Note: you'll never "run out" of points inside or outside the ball! If you did, there would only be finitely many on that side, which would either show that $x^*$ is not a limit point (finitely many inside) or that $x_n \to x^*$ (finitely many outside), both of which would be contradictions.

We build our subsequence $x_{l_j}$ by taking the last sequence point in each of our (infinitely many) blocks of points inside $B[x^*; \varepsilon]$. That way, we form a subsequence of the subsequence of points inside the ball which, as we discussed earlier, converges to $x^*$. The very next point after each $x_{l_j}$ will be outside the ball, at distance more than $\varepsilon$ from the limit point $x^*$. So, as $j \to \infty$, our choice of $x_{l_j}$ will approach $x^*$. A simple limit argument then shows that $x_{l_j}$ must eventually stay at distance at least $\varepsilon / 2$ from every sequence point outside the ball, and since we chose it to be the last one of its "block", the very next sequence point $x_{l_j + 1}$ is exactly such a point.
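
If it helps to see this construction concretely, here is a small numerical sketch (my own toy sequence and variable names, nothing from the paper): it extracts the last point of each "inside" block and checks the gap to the next term.

```python
import numpy as np

# Toy 1-D sequence (my own example, not from the paper): even-indexed
# terms approach x* = 0 while odd-indexed terms sit at 1, so the
# sequence has two isolated limit points (0 and 1) and does not converge.
N = 40
x = np.array([1.0 if k % 2 else 1.0 / (k + 2) for k in range(N)])

x_star, eps = 0.0, 0.5           # B[x*; eps] contains no limit point but 0
inside = np.abs(x - x_star) <= eps

# l_j = index of the last point of each maximal "block" of consecutive
# terms inside the ball: the very next term is outside, or else the
# block would have continued.
l = [k for k in range(N - 1) if inside[k] and not inside[k + 1]]

print("x_{l_j}:", x[l])                          # these approach x* = 0
gaps = np.abs(x[[k + 1 for k in l]] - x[l])
print("||x_{l_j+1} - x_{l_j}||:", gaps)          # all at least eps/2
```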

Anyway, that's my understanding of the result. Let's tackle yours.

---

1- In step 1, why do the authors use a closed ball?

There's no reason why you can't have a closed ball. If no other limit points exist in an open ball $B(x^*; \varepsilon)$, then the same can be said about, say, the closed ball $B[x^*; \varepsilon/2] \subseteq B(x^*; \varepsilon)$. Going from a closed ball to an open ball is even easier: you get to keep the same radius. Logically, using open and closed balls is equivalent.

However, there is an actual advantage to using a closed ball over an open one: compactness. As I reasoned in my version above, it helps for this neighbourhood (the one devoid of other limit points) to be compact. It lets us argue by contradiction that the sequence points in this neighbourhood must converge to $x^*$; with an open ball, we would have to worry that the badly behaved points not approaching $x^*$ might limit to a point on the boundary sphere (which may or may not be a limit point of the sequence).
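
For a one-dimensional illustration of that worry (my own example, not the paper's): take $x^* = 0$ and $\varepsilon = 1$, and suppose infinitely many sequence points were of the form $1 - 1/n$. They all lie in the open ball $B(0; 1)$, yet they accumulate at the boundary point $1$, not at $0$. With the closed ball $B[0; 1]$ this cannot happen: $1$ would then be a second limit point in the ball, which the isolation assumption rules out.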

2- In step 3, why do the authors use a non-strict inequality ($\le$) instead of a strict one ($<$)?

As before, strict-vs-non-strict (or strong-vs-weak) inequality is not a big deal one way or the other. If $\|x_{k_j} - x^*\| < \varepsilon$ holds, then certainly $\|x_{k_j} - x^*\| \le \varepsilon$ holds too. Conversely, if you can guarantee $\|x_{k_j} - x^*\| \le \varepsilon$ for an arbitrary $\varepsilon > 0$ (for all sufficiently large $j$, depending on $\varepsilon$), then applying this with $\varepsilon/2$ in place of $\varepsilon$ gives $\|x_{k_j} - x^*\| \le \varepsilon/2 < \varepsilon$.

Also as before, using non-strict inequality places the $x_{k_j}$s in the compact neighbourhood $B[x^*; \varepsilon]$, and as we previously discussed, compactness is a helpful property to push the proof through.

3- After step 6, I cannot see why $\|x_{l_j+1}-x^*\| > \varepsilon$, since its definition looks the same as that of $x_{l_j}$.

The index $l_j$ is defined so that $x_{l_j}$ is, as I described above, the last in a "block" of points inside $B[x^*; \varepsilon]$. It means that the very next sequence term $x_{l_j + 1}$ must be outside $B[x^*; \varepsilon]$, or else it would be part of the "block"! For instance, if $x_{k_j}, x_{k_j+1}, x_{k_j+2}$ all lie in the ball but $x_{k_j+3}$ does not, then $l_j = k_j + 2$.

To be precise, the definition of $l_j$ states: $$l_j = \max\{l : \|x_i - x^*\| \le \varepsilon,\ i = k_j, \ldots, l\}.$$ If $\|x_{l_j + 1} - x^*\| \le \varepsilon$, then we would have $\|x_i - x^*\| \le \varepsilon$ for $i = k_j, \ldots, l_j, l_j + 1$, which would put $$l_j + 1 \in \{l : \|x_i - x^*\| \le \varepsilon,\ i = k_j, \ldots, l\}.$$ This contradicts $l_j$ being the maximum of that set!

(As an aside: there is a small hole in the proof here. The "subsequence" they define will, in general, repeat terms of the sequence: two different $k_j$ falling in the same "block" produce the same $l_j$. Once you throw away the duplicate terms, you get a genuine subsequence, so I won't go too much into it.)

4- I cannot complete the proof from this point on.

It follows in a similar fashion to what I explained above. Because $(x_{l_j})$ lies inside the compact ball $B[x^*; \varepsilon]$, it has at least one limit point, and all of its limit points lie inside $B[x^*; \varepsilon]$; but there's only one candidate there: $x^*$. A sequence in a compact set with exactly one limit point converges to it, so $x_{l_j} \to x^*$.

Now, by the definition of the limit $x_{l_j} \to x^*$ (taking the $\varepsilon$ in the definition to be half the $\varepsilon$ we have been dealing with up to now), we know that some $k$ exists such that $$j \ge k \implies \|x_{l_j} - x^*\| \le \frac{\varepsilon}{2}.$$ The rest is the reverse triangle inequality: \begin{align*} \|x_{l_j + 1} - x_{l_j}\| &= \|(x_{l_j + 1} - x^*) - (x_{l_j} - x^*)\| \\ &\ge \|x_{l_j + 1} - x^*\| - \|x_{l_j} - x^*\| \\ &\ge \varepsilon - \frac{\varepsilon}{2} = \frac{\varepsilon}{2}. \end{align*} So, to sum up, $x_{l_j} \to x^*$, while for every $j \ge k$ the next term $x_{l_j + 1}$ stays at least the fixed distance $\frac{\varepsilon}{2}$ away from $x_{l_j}$.
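
(And that is exactly the shape of Lemma 4.10: the lemma only asks for *some* positive gap, so it does no harm that we ended up with $\varepsilon/2$ rather than $\varepsilon$. Discard the first $k$ terms of $(x_{l_j})$ and relabel; what remains is a subsequence converging to $x^*$ with $\|x_{l_j + 1} - x_{l_j}\| \ge \varepsilon/2$ for all $j$, i.e. the conclusion with $\varepsilon/2$ playing the role of the lemma's $\epsilon$.)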