Control Theory Optimization – Are Inf and Lim Interchangeable in Infinite-Horizon LQR?

control-theory, optimal-control, optimization, sequences-and-series

My question is about the infinite-horizon linear-quadratic regulator (LQR) problem. Consider the following linear and time-invariant system
$$
x_{k+1} = Ax_k + Bu_k
$$
Given the initial state vector $x_0 \in \mathbb R^n$, our objective is to solve the following optimization problem
$$
\inf_{f_0,f_1,\dots} \left[\lim_{N \to \infty} \sum_{k=0}^{N-1} x_k^TQx_k + u_k^TRu_k\right] \tag{1}
$$
where $(f_n)_{n \in \mathbb N}$ is the sequence of policies such that $u_k = f_k(x_k)$. In the notes and books that I've read, I noticed that the following optimization problem is solved instead:
$$
\lim_{N \to \infty} \left[\inf_{f_0,\dots,f_{N-1}} \sum_{k=0}^{N-1} x_k^TQx_k + u_k^TRu_k\right] \tag{2}
$$
where the $\inf$ and $\lim$ are interchanged. More specifically, I've found that authors tend to take the limit of the discrete-time finite horizon Riccati difference equation, which suggests that they implicitly performed the interchange shown in $(2)$. However, I'm not sure why this interchange is valid, even when we assume that $(A,B)$ is stabilizable (we assume that $y_k = x_k$ for simplicity, such that $(A,C)$ is observable). That is, why is the solution of the problem in $(1)$ equivalent to the solution of the problem in $(2)$? I'd appreciate any guidance on this.

Best Answer

Consider the following optimal control problem

$$ J_T(x_0)=\min_{u}\sum_{i=0}^{T-1}x_i^TQx_i+u_i^TRu_i $$ where $x_{k+1}=Ax_k+Bu_k$. From dynamic programming, we have that

$$ J_T(x_0)=x_0^TP_T(0)x_0 $$ where $P_T(k)$ solves the Difference Riccati Equation (DRE)

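$$ P_T(k-1)=\mathcal{R}(P_T(k)),\ P_T(T)=0, $$

with the Riccati map $\mathcal{R}(P)=Q+A^TPA-A^TPB(R+B^TPB)^{-1}B^TPA$ (assuming $Q\succeq0$ and $R\succ0$).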

In that case, the optimal control law is given by $u_k=K_kx_k$ where $K_k=-(R+B^TP_T(k+1)B)^{-1}B^TP_T(k+1)A$.
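As a rough numerical illustration of the backward recursion and the gains above, here is a minimal NumPy sketch. The function names (`riccati_map`, `finite_horizon_lqr`, `simulate_cost`) and the double-integrator data are assumptions chosen for illustration only; they do not come from the question. The sketch checks that $x_0^TP_T(0)x_0$ matches the cost obtained by simulating the time-varying feedback $u_k=K_kx_k$.

```python
import numpy as np

def riccati_map(P, A, B, Q, R):
    """One backward step: Q + A'PA - A'PB (R + B'PB)^{-1} B'PA."""
    BtPA = B.T @ P @ A
    return Q + A.T @ P @ A - BtPA.T @ np.linalg.solve(R + B.T @ P @ B, BtPA)

def finite_horizon_lqr(A, B, Q, R, T):
    """Return P_T(0) and the gains K_0, ..., K_{T-1} for horizon T."""
    P = np.zeros_like(Q, dtype=float)      # terminal condition P_T(T) = 0
    gains = []
    for _ in range(T):
        # K_k is built from P_T(k+1), which is the current P
        gains.append(-np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A))
        P = riccati_map(P, A, B, Q, R)     # step back to P_T(k)
    gains.reverse()                        # stored as K_{T-1}, ..., K_0
    return P, gains

def simulate_cost(A, B, Q, R, gains, x0):
    """Accumulate the stage costs along the closed-loop trajectory."""
    x, cost = x0.astype(float), 0.0
    for K in gains:
        u = K @ x
        cost += x @ Q @ x + u @ R @ u
        x = A @ x + B @ u
    return cost

# Hypothetical example data (a discrete double integrator)
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
x0 = np.array([1.0, -2.0])

P5, gains5 = finite_horizon_lqr(A, B, Q, R, T=5)
predicted = x0 @ P5 @ x0
simulated = simulate_cost(A, B, Q, R, gains5, x0)
print(predicted, simulated)
```

The two printed numbers should agree up to rounding, which is the dynamic-programming identity $J_T(x_0)=x_0^TP_T(0)x_0$ in numerical form.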

On the other hand, we can define the infinite-horizon cost

$$ J_\infty(x_0)=\min_{u}\sum_{i=0}^{ \infty}x_i^TQx_i+u_i^TRu_i, $$

and we know that this cost is equal to

$$ J_\infty(x_0)=x_0^TP_\infty x_0 $$ where $P_\infty$ is the stabilizing solution of the Algebraic Riccati Equation (ARE)

$$ P_\infty=\mathcal{R}(P_\infty). $$

Here, "stabilizing solution" means that $A+BK$ is Schur stable, where $K=-(R+B^TP_\infty B)^{-1}B^TP_\infty A$.

The following still contains gaps and unclear points, which I will fix once I understand how to do so correctly. In the meantime, it can serve as an indicative answer. We show that $P_T(0)\to P_\infty$ as $T\to\infty$.

First of all, it is important to note that $P_{T+1}(0)\succeq P_T(0)$ because each term of the sum is nonnegative, so extending the horizon can only increase the optimal cost. Assuming now that the pair $(A,B)$ is stabilizable, there exists a $K$ such that $A+BK$ is Schur stable.

This means that, applying the (generally suboptimal) feedback $u_i=Kx_i$, we have $$ J_T(x_0)\le\sum_{i=0}^{T-1}x_i^T(Q+K^TRK)x_i=x_0^T\left(\sum_{i=0}^{T-1}((A+BK)^i)^T(Q+K^TRK)(A+BK)^i\right)x_0 $$ and the matrix sum converges as $T\to\infty$ to, say, $\bar P$, because $A+BK$ is Schur stable. Since the partial sums are nondecreasing in $T$, this implies that $P_T(0)\preceq \bar P$ for all $T$. As a result, the sequence $\{P_T(0)\}_T$ is nondecreasing and upper bounded, which means that it converges to a limit $P$, which is assumed to be positive definite.
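The monotonicity and boundedness claims can be checked numerically. The following is a minimal NumPy sketch, not a proof: the system matrices and the stabilizing gain `K` are hypothetical choices for illustration, and `P_bar` is approximated by a long partial sum of the series rather than by a Lyapunov solver.

```python
import numpy as np

def riccati_map(P, A, B, Q, R):
    """Riccati map: Q + A'PA - A'PB (R + B'PB)^{-1} B'PA."""
    BtPA = B.T @ P @ A
    return Q + A.T @ P @ A - BtPA.T @ np.linalg.solve(R + B.T @ P @ B, BtPA)

# Hypothetical example data (a discrete double integrator)
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

# Any stabilizing gain works for the bound; this one is picked by hand.
K = np.array([[-0.5, -1.0]])
Acl = A + B @ K
rho_K = max(abs(np.linalg.eigvals(Acl)))   # spectral radius of A + BK

# Partial sums of sum_i (Acl^i)' (Q + K'RK) Acl^i, approximating P_bar
W = Q + K.T @ R @ K
P_bar = np.zeros((2, 2))
for _ in range(500):
    P_bar = Acl.T @ P_bar @ Acl + W

# Riccati iterates P_1(0), P_2(0), ... started from P = 0
P = np.zeros((2, 2))
monotone, bounded = True, True
tol = 1e-9
for _ in range(200):
    P_next = riccati_map(P, A, B, Q, R)
    monotone = monotone and bool(np.linalg.eigvalsh(P_next - P).min() >= -tol)
    bounded = bounded and bool(np.linalg.eigvalsh(P_bar - P_next).min() >= -tol)
    P = P_next

# The limit should be (numerically) a fixed point with a stable closed loop
residual = np.linalg.norm(riccati_map(P, A, B, Q, R) - P)
K_lim = -np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
rho_lim = max(abs(np.linalg.eigvals(A + B @ K_lim)))
print(monotone, bounded, residual, rho_lim)
```

For this example the iterates stay below `P_bar` in the semidefinite order, never decrease, and settle at a matrix that satisfies the algebraic Riccati equation to within rounding.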

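By time invariance, the iterates satisfy $P_{T+1}(0)=\mathcal{R}(P_T(0))$, and the Riccati map $\mathcal{R}$ is continuous, so the limit satisfies $P=\mathcal{R}(P)$; that is, $P$ is a stationary solution of the DRE and hence a solution of the ARE. In fact, since it is positive definite, it is the unique positive definite solution to the ARE and we have that $P=P_\infty$ (this needs to be better supported).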
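One way the link back to problems $(1)$ and $(2)$ might be made explicit is the following sandwich argument. It is only a sketch, under the assumptions $Q\succeq0$ and $R\succ0$ (not stated explicitly above) and with the shorthand $c(x,u)=x^TQx+u^TRu$ introduced here for the stage cost. For any sequence of policies and any $T$, truncating the infinite sum gives

$$ \sum_{k=0}^{\infty}c(x_k,u_k)\ \ge\ \sum_{k=0}^{T-1}c(x_k,u_k)\ \ge\ J_T(x_0)=x_0^TP_T(0)x_0, $$

so letting $T\to\infty$ and then taking the infimum over policies shows that the value of $(1)$ is at least $x_0^TPx_0$, which is the value of $(2)$. Conversely, let $P$ be the limit of $P_T(0)$, so that $P\succeq0$ and $P$ solves the ARE, and set $K=-(R+B^TPB)^{-1}B^TPA$. The ARE can then be rewritten as $P=Q+K^TRK+(A+BK)^TP(A+BK)$, and telescoping along the trajectory of $u_k=Kx_k$ gives, for every $N$,

$$ \sum_{k=0}^{N-1}c(x_k,Kx_k)=x_0^TPx_0-x_N^TPx_N\ \le\ x_0^TPx_0. $$

Hence the stationary policy $u_k=Kx_k$ achieves a cost of at most $x_0^TPx_0$ in $(1)$, and the two values coincide. This part of the argument does not use positive definiteness of $P$; identifying $P$ with the stabilizing solution $P_\infty$ is a separate step, which typically relies on a detectability or observability assumption such as the one made in the question.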
