Continuity of Solution to Bellman Equation in discounting factor $\uparrow 1$

dynamic programmingfunctional-analysis

Let $K \subset \mathbb R^n$ be a compact subset and $C(K)$ the set of real-valued continuous functions on $K$, endowed with the sup norm.

Let $g \in C(K)$ and $f:K \to K$ and
$T_\beta:C(K)\to C(K)$ be an operator such that
$$(T_\beta F)(x)=\max_{a \in [0,1]} a g(x) + (1-a) \beta F(f(x))$$

Suppose that

(1) for any $[0,1]$ (yes, inclusive), I know that there is a unique fixed point $F$ to the functional equation $F=T_\beta F$. Call it $F_\beta$, for $\beta \in [0,1]$.

(2) $\forall \beta \in [0,1)$, $T_\beta$ satisfies Blackwell's conditions for $T_\beta$ to be a contraction. Hence, it has a unique fixed point. Call it $G_\beta$, for $\beta \in [0,1)$.

Question:
For any sequence $(\beta_n)_{n\geq 1}$ where $\beta_n \in [0,1)$ and such that $\beta_n \uparrow 1$, is it the case that $G_{\beta_n} \to F_1$?

Best Answer

We take, without future mention,

  • $K$ to be an arbitrary set (not necessarily in $\mathbb{R}^n$),
  • $g$ to be a bounded and nonnegative real-valued map from $K$ (not necessarily continuous),
  • $f$ to be a map from $K$ to itself (not necessarily continuous), and
  • $(\beta_n)_n$ to be a real sequence converging monotonically from below to one.

In particular, no topological structure is imposed on $K$.

Remark. With regards to your original post, I don't see how $T_{\beta}$ is necessarily a map from $C(K)$ to itself since you impose no continuity on $f$. Therefore, in stating the main result below, I use the space of bounded real-valued maps $B(K)$.

Theorem. Suppose the following:

(H1) For each $x$, the supremum $\sup_{k\geq0}[g\circ f^{k}](x)$ is attained at a nonnegative integer $k$.

(H2) The operator $T_{\beta}$ defined by $T_{\beta}F\equiv\max_{0 \leq a \leq 1} ag+(1-a)\beta[F\circ f]$ has a unique fixed point in $B(K)$ for each $\beta$ between zero and one (inclusive).

Denote by $F_{\beta}$ the unique fixed point of $T_{\beta}$. Then, $F_{\beta_{n}}\rightarrow F_{1}$ pointwise.

We prove the above in a series of lemmas.

Conventions. We always use $\beta$ to mean a number between zero and one (inclusive). We take $\inf \emptyset = \infty$. $\Vert \cdot \Vert$ denotes the sup-norm. $f^k$ denotes the composition of $f$ with itself $k$ times.

First, since the maximum of a linear function on an interval is obtained at one of the boundaries, $$ T_{\beta}F=\max_{0\leq a\leq1}ag+\left(1-a\right)\beta[F\circ f]=\max\left\{ g,\beta[F\circ f]\right\} . $$ This suggests that the Bellman equation is the dynamic program associated to a deterministic optimal stopping problem in which the state at time $k$ is $X_{k}\equiv f^{k}(x)$. The controller has to choose a time $k$ to stop at to receive the reward $\beta^{k}g(X_{k})=\beta^{k}[g\circ f^{k}](x)$. You can think of $(X_k)_k$ as a Markov chain with deterministic transitions. Unless there is no discount ($\beta=1$), the controller prefers to stop earlier in order to minimize adverse discounting effects. This is made rigorous below.

Lemma 1 (Characterizing a fixed point). $V_{\beta}\equiv\sup_{k\geq0}\beta^{k}[g\circ f^{k}]$ is a fixed point of $T_{\beta}$.

Proof. Substituting $V_{\beta}$ into $T_{\beta}$, $$ T_{\beta}V_{\beta}=\max\left\{ g,\sup_{k\geq0}\beta^{k+1}[g\circ f^{k+1}]\right\} =\sup_{k\geq0}\beta^{k}[g\circ f^{k}] =V_{\beta}. $$

Lemma 2 (Properties of $V_{\beta}$). $V_{\alpha}\leq V_{\beta}$ whenever $0\leq\alpha\leq\beta$ and $\sup_{\beta}\Vert V_{\beta}\Vert<\infty$.

Proof. The first inequality follows immediately from the definition of $V_{\beta}$ in Lemma 1. As for the second inequality, note that $\beta^{k}\leq1$ and hence $\Vert V_{\beta}\Vert\leq\Vert g\Vert < \infty$.

Corollary 3 (Convergence). The (pointwise) limit $V\equiv\lim_n V_{\beta_{n}}$ exists.

Proof. Apply the monotone convergence theorem pointwise to $V_{\beta_{n}}(x)$.

Next, we define the optimal stopping time

$$ k_{\beta}(x)\equiv\inf\left\{ k\geq0\colon V_{\beta}(x)=\beta^{k}[g\circ f^{k}](x)\right\} . $$ Since $g$ is nonnegative and $\beta<1$, it follows that $$ 0\leq \beta^{k}[g\circ f^{k}](x)\leq\beta^{k}\left\Vert g\right\Vert \rightarrow0\text{ as }k\rightarrow\infty $$ and hence $k_{\beta}<\infty$. However, establishing $k_1 < \infty$ requires assuming (H1).

Lemma 4. $V\leq V_{1}$.

Proof. Let $x$ be arbitrary. Letting $k_{n}\equiv k_{\beta_{n}}$ for brevity, \begin{multline*} V(x) =\lim_{n}V_{\beta_{n}}(x) =\lim_{n}\left(\beta_{n}\right)^{k_{n}(x)}[g\circ f^{k_{n}(x)}](x)\\ \leq\lim_{n}[g\circ f^{k_{n}(x)}](x) \leq\sup_{k\geq0}[g\circ f^{k}](x) =V_{1}(x). \end{multline*}

Lemma 5. Suppose (H1). Then, $V\geq V_{1}$.

Proof. Let $x$ be arbitrary. Then, \begin{multline*} V(x) =\lim_{n}V_{\beta_{n}}(x) =\lim_{n}\sup_{k\geq0}\left(\beta_{n}\right)^{k}[g\circ f^{k}](x)\\ \geq\lim_{n}\left(\beta_{n}\right)^{k_{1}(x)}[g\circ f^{k_{1}(x)}](x) =[g\circ f^{k_{1}(x)}](x) =V_{1}(x). \end{multline*}

We are now ready to prove the theorem.

Proof of the Theorem. Since fixed points are unique by assumption (H2), $V_{\beta}=F_{\beta}$, and the result follows from combining Lemmas 4 and 5.

It is worthwhile, as a closing note, to discuss assumption (H1). Intuitively, (H1) tells us that the controller can always choose a finite optimal stopping time (this does not preclude the existence of infinite optimal stopping times). There are some interesting cases that trivially satisfy (H1), such as (i) $K$ being a finite set or (ii) the image of $K$ under $g$ being a finite set.

Related Question