Discrete Time LQR and Lyapunov Equations

Tags: control theory, discrete time, optimal control

Consider the following discrete-time LQR system, where the dynamics are given by
$$x_{t+1} = A x_t + B u_t,$$
and the cost is $$C(K) = E\left[\sum_t x_t^\top Q x_t + u_t^\top R u_t\right].$$ For simplicity, assume the starting state $x_0$ satisfies $E(x_0 x_0^\top) = I$.

Given a controller $K$ that stabilizes the system, with $u_t = K x_t$, it is easy to show via properties of the quadratic LQR cost that $$C(K) = Tr(P_K),$$ where $P_K$ solves the discrete Lyapunov equation with $((A+BK)^\top, Q + K^\top RK)$, that is, $P_K = (A+BK)^\top P_K (A+BK) + Q + K^\top R K$. However, I have also seen it mentioned that
$$C(K) = Tr((Q + K^\top R K) P_K)$$
where $P_K$ solves the discrete Lyapunov equation with $(A+BK, I)$.

How is this second equivalent representation of the LQR cost derived? Does this hold due to a relationship between solutions to the discrete Lyapunov equation?

Best Answer

We have that

$$ \begin{array}{rcl}
C(K)&=&\mathbb{E}\left[\sum_t x_t^TQx_t+u_t^TRu_t\right]\\
&=&\sum_t\mathbb{E}\left[x_t^T(Q+K^TRK)x_t\right]\\
&=&\mathrm{trace}\left[(Q+K^TRK)\,\mathbb{E}\left[\sum_t x_tx_t^T\right]\right]\\
&=&\mathrm{trace}((Q+K^TRK)P),
\end{array} $$
where we have set $P:=\mathbb{E}\left[\sum_tx_tx_t^T\right]$. The second line uses $u_t=Kx_t$, and the third uses the identity $x^TMx=\mathrm{trace}(Mxx^T)$.

If we let $S_t:=\mathbb{E}[x_tx_t^T]$, then, since $x_{t+1}=(A+BK)x_t$ under $u_t=Kx_t$, we have that

$$ \begin{array}{rcl} S_{t+1}&=&\mathbb{E}[x_{t+1}x_{t+1}^T]\\ &=&(A+BK)S_t(A+BK)^T \end{array} $$ and $S_0=\mathbb{E}[x_0x_0^T]=I$. So, we have that

$$ S_t=(A+BK)^t((A+BK)^T)^t. $$
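As a quick sanity check, the recursion and the closed form for $S_t$ can be compared numerically. The sketch below is only illustrative: the matrices `A`, `B`, `K` are hypothetical values chosen so that $A+BK$ is Schur stable, and the helper names are not from the question.

```python
import numpy as np

# Hypothetical 2-state, 1-input system and gain, chosen only for illustration.
A = np.array([[1.0, 0.2], [0.0, 0.9]])
B = np.array([[0.0], [1.0]])
K = np.array([[-0.5, -0.6]])
A_cl = A + B @ K  # closed-loop matrix A + BK

def second_moment_recursive(A_cl, t):
    """Iterate S_{k+1} = A_cl S_k A_cl^T starting from S_0 = I."""
    S = np.eye(A_cl.shape[0])
    for _ in range(t):
        S = A_cl @ S @ A_cl.T
    return S

def second_moment_closed_form(A_cl, t):
    """Evaluate S_t = A_cl^t (A_cl^T)^t directly."""
    M = np.linalg.matrix_power(A_cl, t)
    return M @ M.T

# The two expressions should agree for every t.
for t in range(6):
    assert np.allclose(second_moment_recursive(A_cl, t),
                       second_moment_closed_form(A_cl, t))
```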

Since $P=\sum_{t=0}^\infty S_t$, it can be shown that $P$ exists (the series converges) if and only if $A+BK$ is Schur stable, which is the case here because $K$ is stabilizing. This yields

$$ P=\sum_{t=0}^\infty(A+BK)^t((A+BK)^T)^t. $$
We can rearrange this expression as
$$ \begin{array}{rcl}
P &=& I+\sum_{t=1}^\infty(A+BK)^t((A+BK)^T)^t\\
&=& I+(A+BK)\left[\sum_{t=1}^\infty(A+BK)^{t-1}((A+BK)^T)^{t-1}\right](A+BK)^T\\
&=& I+(A+BK)\left[\sum_{t=0}^\infty(A+BK)^{t}((A+BK)^T)^{t}\right](A+BK)^T\\
&=& I+(A+BK)P(A+BK)^T
\end{array} $$
to get the desired expression
$$ (A+BK)P(A+BK)^T-P+I=0, $$
which is the discrete Lyapunov equation with $(A+BK, I)$.
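On the second part of the question: the two representations are linked by the cyclic property of the trace. Writing $A_K := A+BK$ and $Q_K := Q+K^TRK$ (shorthand introduced here), the solution of the Lyapunov equation with $(A_K^T, Q_K)$ is $P_K=\sum_t (A_K^T)^t Q_K A_K^t$, so
$$ \mathrm{trace}(P_K)=\sum_{t=0}^\infty \mathrm{trace}\left(Q_K A_K^t (A_K^T)^t\right)=\mathrm{trace}(Q_K P). $$
The following is a minimal numerical sketch of that identity, not a definitive implementation. The system matrices, weights, and the helper `dlyap` are hypothetical; `dlyap` solves $X = FXF^T + W$ by vectorization using NumPy only.

```python
import numpy as np

def dlyap(F, W):
    """Solve X = F X F^T + W by vectorization (fine for small n).

    With row-major flattening, vec(F X F^T) = kron(F, F) @ vec(X).
    """
    n = F.shape[0]
    lhs = np.eye(n * n) - np.kron(F, F)
    return np.linalg.solve(lhs, W.reshape(-1)).reshape(n, n)

# Hypothetical system, gain, and weights, chosen only for illustration.
A = np.array([[1.0, 0.2], [0.0, 0.9]])
B = np.array([[0.0], [1.0]])
K = np.array([[-0.5, -0.6]])
Q = np.eye(2)
R = np.array([[0.5]])

A_cl = A + B @ K          # closed-loop matrix A + BK
Q_K = Q + K.T @ R @ K     # closed-loop stage cost Q + K^T R K

# The derivation assumes A + BK is Schur stable.
assert np.max(np.abs(np.linalg.eigvals(A_cl))) < 1

P = dlyap(A_cl, np.eye(2))    # P   = A_cl   P   A_cl^T + I
P_K = dlyap(A_cl.T, Q_K)      # P_K = A_cl^T P_K A_cl   + Q_K

cost_from_P_K = np.trace(P_K)
cost_from_P = np.trace(Q_K @ P)

# Brute-force check: truncate the series sum_t trace(Q_K A_cl^t (A_cl^T)^t).
cost_series = 0.0
M = np.eye(2)
for _ in range(500):
    cost_series += np.trace(Q_K @ M @ M.T)
    M = A_cl @ M

assert np.isclose(cost_from_P_K, cost_from_P)
assert np.isclose(cost_from_P, cost_series)
```

For larger problems, a dedicated solver such as `scipy.linalg.solve_discrete_lyapunov` would likely be a better choice than the Kronecker-product solve used here, which forms an $n^2 \times n^2$ system.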