One of the most frequently cited reasons for using LQR-based controllers is that they achieve \(\text{PM}\geq60^\circ\) and \(1/2<\text{GM}<\infty\). These stability margins do not hold in general: they require, among other things, continuous-time full state feedback.
Yet, there are many examples in the literature (TODO: re-find them) where these stability margins are claimed for a digital control law running with an estimated state.
I wish to explore here (i) where these stability margins come from and (ii) what causes them to degrade in real-world circumstances. The best reference I have found for most of this is [1].
Let's begin with the block diagram in Fig. 1 and derive the closed-loop transfer function from \(R\) to \(Y\). We have
\begin{align} U_o(s) &= R(s) - KX(s)\\ sX(s) &= AX(s) + B[R(s) - KX(s)]\\ &= (A-BK)X(s) + BR(s) \end{align}so that \((sI - A + BK)X(s) = BR(s)\). It follows that, since \(Y(s) = CX(s)\), then
\begin{equation} \frac{Y(s)}{R(s)} = C(sI - A + BK)^{-1}B \label{eqn:hcl_full1} \end{equation}This result is pretty standard in controls texts. What we need, though, is a form that exposes the loop gain in a characteristic equation, so that we can analyze a scaled gain \(\rho K\) as in root locus or Nyquist analysis. We can show this in two ways: first through a different analysis of the block diagram, and second through manipulation of \eqref{eqn:hcl_full1} using the Woodbury matrix identity.
First, we re-analyze the block diagram. Let \(U_o = R + U_1\), where \(U_1 = -KX(s)\). Then from \[ X(s) = (sI - A)^{-1}BU_o(s) \]
we find that the transfer function from \(R\) to \(U_o\) is \[ \frac{U_o(s)}{R(s)} = \frac{1}{1 + K(sI - A)^{-1}B}. \]
But \(Y(s) = G(s)U_o(s)\). Thus,
\begin{equation} \frac{Y(s)}{R(s)} = \frac{C(sI - A)^{-1}B}{1 + K(sI - A)^{-1}B}.\label{eqn:hcl_full2} \end{equation}We see then that the closed-loop characteristic equation is \(1+L(s) = 0\), where \(L(s) = K(sI - A)^{-1}B\) is the loop gain.
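As a quick sanity check on this form, here is a small numerical sketch. The two-state system and gain below are made-up numbers (not the system behind Fig. 1 or Fig. 2); only numpy is assumed.

```python
# Quick numerical sanity check (made-up two-state system, not the one behind
# Fig. 1/Fig. 2): the zeros of 1 + L(s), L(s) = K (sI - A)^{-1} B, should be
# the closed-loop poles, i.e. the eigenvalues of A - BK.
import numpy as np

A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])
B = np.array([[0.0],
              [1.0]])
K = np.array([[10.0, 3.0]])   # some stabilizing gain (not necessarily LQR)

def loop_gain(s):
    """L(s) = K (sI - A)^{-1} B, a scalar for this SISO example."""
    return (K @ np.linalg.solve(s * np.eye(2) - A, B)).item()

cl_poles = np.linalg.eigvals(A - B @ K)   # closed-loop poles predicted by eqn (1)
for p in cl_poles:
    # 1 + L(s) should vanish at each closed-loop pole
    print(p, abs(1.0 + loop_gain(p)))
```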
Out of curiosity (and as a bit of a warm-up), let's show directly that the two forms are equivalent. If you ask me, it's not obvious why \eqref{eqn:hcl_full2} should be the same as \eqref{eqn:hcl_full1}.
Recall the matrix inversion lemma. For invertible matrices \(P\) and \(Q\),
\begin{equation} (P + MQN)^{-1} = P^{-1} - P^{-1}M(Q^{-1} + NP^{-1}M)^{-1}NP^{-1} \label{matinv} \end{equation}Apply \eqref{matinv} to \eqref{eqn:hcl_full1} with \(P = (sI-A)\), \(M=B\), \(Q=1\), and \(N=K\). Then we have
\begin{align} \frac{Y(s)}{R(s)} & = C(sI - A + BK)^{-1}B\\ & = C\left\{ (sI-A)^{-1} - (sI-A)^{-1}B [1+ K(sI-A)^{-1}B ]^{-1} K(sI-A)^{-1} \right\}B\\ &= G(s)\left[1 - [1 + K(sI - A)^{-1}B]^{-1}K(sI - A)^{-1}B\right]\\ &= G(s)[1 + K(sI - A)^{-1}B]^{-1}\left[ 1 + K(sI - A)^{-1}B - K(sI - A)^{-1}B\right]\\ &=G(s)[1 + K(sI - A)^{-1}B]^{-1}\\ &=\frac{G(s)}{1 + L(s)} \end{align}which is the same as \eqref{eqn:hcl_full2}, as desired.
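If the algebra alone doesn't convince you, a numerical spot check is cheap. This sketch (same made-up system as above) evaluates \eqref{eqn:hcl_full1} and \eqref{eqn:hcl_full2} along the imaginary axis and confirms they agree to machine precision.

```python
# Spot check of the equivalence just derived (same made-up example system):
# evaluate eqn (1) and eqn (2) along the imaginary axis and compare.
import numpy as np

A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
K = np.array([[10.0, 3.0]])
I = np.eye(2)

def form1(s):
    """Y/R = C (sI - A + BK)^{-1} B, i.e. eqn (1)."""
    return (C @ np.linalg.solve(s * I - A + B @ K, B)).item()

def form2(s):
    """Y/R = G(s) / (1 + L(s)), i.e. eqn (2)."""
    X = np.linalg.solve(s * I - A, B)            # (sI - A)^{-1} B
    return (C @ X).item() / (1.0 + (K @ X).item())

for w in [0.1, 1.0, 10.0, 100.0]:
    print(w, abs(form1(1j * w) - form2(1j * w)))   # differences ~ 1e-16
```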
We now derive the stability margins for a continuous-time LQR full-state-feedback control law. First, we need something called the return difference equality, which is given by [1] (TODO: derive this)
\begin{equation} R + B^{T}(-j\omega I -A^{T})^{-1}Q(j\omega I -A)^{-1}B = [I + B^{T}(-j\omega I -A^{T})^{-1}K^{T}]R[I + K(j\omega I -A)^{-1}B] \end{equation}In the SISO case when \(Q=C^{T}C\), this can be written as
\begin{equation} R|1 + K(j\omega I - A)^{-1}B|^{2} = R + |C(j\omega I -A)^{-1}B|^{2}. \end{equation}Because \(|C(j\omega I -A)^{-1}B|^{2}\geq0\), we obtain
\begin{equation} |1 + K(j\omega I - A)^{-1}B|^{2} \geq 1 \label{eqn:ret_ineq} \end{equation}We are now in a position to see where the claimed stability margins come from.
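Before turning to the Nyquist argument, here is a minimal numerical sketch of \eqref{eqn:ret_ineq} (assumed example plant, not one from the post): solve the continuous-time algebraic Riccati equation with SciPy, form \(K = R^{-1}B^{T}P\), and check that \(|1 + K(j\omega I - A)^{-1}B|\) stays at or above 1.

```python
# Minimal sketch of the return difference inequality (assumed example plant):
# solve the CARE with SciPy, form K = R^{-1} B^T P, and check that
# |1 + K(jw I - A)^{-1} B| stays at or above 1 over a frequency sweep.
import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
Q = C.T @ C                        # Q = C^T C, as in the SISO case above
R = np.array([[1.0]])

P = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)    # LQR gain K = R^{-1} B^T P

omega = np.logspace(-2, 3, 500)
ret_diff = [abs(1.0 + (K @ np.linalg.solve(1j * w * np.eye(2) - A, B)).item())
            for w in omega]
print(min(ret_diff))               # should be >= 1, up to numerical noise
```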
The stability margins come from analyzing the Nyquist diagram of the loop gain \(L(s)\). The inequality \eqref{eqn:ret_ineq} tells us that the Nyquist plot of \(L(j\omega)\) avoids the open disc of radius 1 centered at \(-1+0j\). We also know that the LQR controller is stabilizing, so the Nyquist plot encircles \(-1\) the correct number of times for closed-loop stability. Because the plot never enters the disc, any point \(-1/k\) on the real axis between \(-2\) and \(0\) has that same number of encirclements, which gives the claimed gain margin \(1/2 < k < \infty\). The phase margin similarly comes from examining the intersection of the unit circles centered at \(0\) and at \(-1\).
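To make the phase-margin number concrete: at any gain-crossover frequency \(\omega_c\) we have \(|L(j\omega_c)| = 1\), so we can write \(L(j\omega_c) = e^{j\varphi}\). Then \eqref{eqn:ret_ineq} gives
\begin{equation} |1 + e^{j\varphi}|^{2} = 2 + 2\cos\varphi \geq 1 \quad\Longrightarrow\quad \cos\varphi \geq -\tfrac{1}{2} \quad\Longrightarrow\quad |\varphi| \leq 120^{\circ}, \end{equation}so the phase margin satisfies \(\text{PM} = 180^{\circ} - |\varphi| \geq 60^{\circ}\).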
An example of such a Nyquist plot is illustrated in Figure 2.
The stability margins we just derived are destroyed when we use an observer to estimate the states (which is almost always the case). The clearest way to illustrate this is with an example. Let's take the same system and feedback matrix \(K\) used to generate Fig. 2 and add an observer. The Nyquist plot of the resulting loop gain is shown in Fig. 3. It is considerably different and, most notably, it makes a serious incursion into the margin bubble.
The trouble is that the loop gain changes, and this has been known since the 1970s [2]. There are examples in the literature where the PM and GM become arbitrarily small once an observer is added [3] (you really should look this paper up; the abstract is priceless).
To help analyze the situation, we can redraw the standard state-space feedback loop as in Fig. 4, which puts it in a unity feedback configuration.
Let's derive the new loop gain along with the closed-loop transfer function. Denote the state estimate by \(\hat x\), with associated state-space matrices \(\hat A\), \(\hat B\), \(\hat C\). The hats indicate that these are our model matrices, which are rarely identically equal to the plant matrices, the nonsensical assumptions of the separation principle notwithstanding. The error between the measured output \(y\) and the estimated output \(\hat y\) is \(y_{e} = y - \hat y\). Then
\begin{align} Y(z) &= G(z) U(z)\\ U(z) & = R(z) - K\hat{X}(z) \end{align}As usual, the combined estimator and controller has two inputs, the reference \(R\) and the plant measurement \(Y\), and its output is the control \(U\).
The estimate comes from the observer \begin{align} z\hat{X}(z) &= \hat{A}\hat{X}(z) + \hat{B}U(z) + L[Y(z) - \hat{C}\hat{X}(z)]. \end{align}Substituting \(U = R - K\hat{X}\) and solving for \(\hat{X}\) gives \begin{align} \hat{X}(z) &= (zI - \hat{A} + \hat{B}K + L\hat{C})^{-1} ( \hat{B}R(z) + LY(z)). \end{align}Let \(\tilde{A}_{o} = \hat{A} - \hat{B}K - L\hat{C}\). Then the control \(U = R - K\hat{X}\) is given by
\begin{align} U(z) &= R(z) - K(zI - \tilde{A}_{o} )^{-1}\hat{B}R(z) - K(zI - \tilde{A}_{o} )^{-1}LY(z). \label{eqn:uzobs} \end{align}Multiply both sides of \eqref{eqn:uzobs} by \(G(z)\) and substitute \(G(z)U(z)=Y(z)\). We then have
\begin{equation} [I + GK(zI - \tilde{A}_{o} )^{-1}L]\,Y(z) = G[I - K(zI - \tilde{A}_{o})^{-1}\hat{B}]\,R(z) \end{equation}In the SISO case, we can write this as
\begin{equation} \frac{Y}{R} = \frac{G\,[1 - K(zI-\tilde{A}_{o})^{-1}\hat{B}] }{1 + GK(zI - \tilde{A}_{o} )^{-1}L} \label{eqn:hclobs} \end{equation}As usual, the closed-loop characteristic equation comes from the denominator of \eqref{eqn:hclobs}. This means that the loop gain we are interested in for stability analysis is
\begin{align} F_{loop} &= GK(zI - \tilde{A}_{o} )^{-1}L\\ &= GK(zI - \hat{A} + \hat{B}K + L\hat{C} )^{-1}L. \end{align}In the (exceedingly rare) case where the plant and model match exactly, we should expect that the poles of \eqref{eqn:hclobs}, i.e., the zeros of \(1+F_{loop}\), are exactly \(\sigma(A-BK)\) together with \(\sigma(A-LC)\). We will derive this in the sequel. We note for now that \eqref{eqn:hclobs} explicitly does not assume that the plant and model match.
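To make this concrete, here is a sketch with an assumed discrete-time plant (a discretized double integrator, not the system behind the figures) and a perfectly matched model. It checks numerically that the closed-loop poles are \(\sigma(A-BK)\) together with \(\sigma(A-LC)\), and it also reports \(\min_\theta|1+F_{loop}(e^{j\theta})|\), the distance of the observer-based loop gain from the critical point, which is no longer guaranteed to be at least 1. SciPy's solve_discrete_are and place_poles are used for the gains.

```python
# Sketch of both points above, with an assumed discrete-time plant (a
# discretized double integrator) and a perfectly matched model. It checks
# that the closed-loop poles are eig(A - BK) together with eig(A - LC),
# and reports how close F_loop gets to the critical point on the unit circle.
import numpy as np
from scipy.linalg import solve_discrete_are
from scipy.signal import place_poles

A = np.array([[1.0, 0.1],
              [0.0, 1.0]])
B = np.array([[0.005],
              [0.1]])
C = np.array([[1.0, 0.0]])
I = np.eye(2)

# Discrete LQR gain: K = (B^T P B + R)^{-1} B^T P A
Q, R = C.T @ C, np.array([[1.0]])
P = solve_discrete_are(A, B, Q, R)
K = np.linalg.solve(B.T @ P @ B + R, B.T @ P @ A)

# Observer gain by pole placement on the dual system (arbitrarily chosen poles)
L_obs = place_poles(A.T, C.T, [0.2, 0.3]).gain_matrix.T

# 1) Closed-loop poles of plant + observer-based controller
A_cl = np.block([[A,          -B @ K],
                 [L_obs @ C,   A - B @ K - L_obs @ C]])
print(np.sort_complex(np.linalg.eigvals(A_cl)))
print(np.sort_complex(np.concatenate([np.linalg.eigvals(A - B @ K),
                                      np.linalg.eigvals(A - L_obs @ C)])))

# 2) Distance of the observer-based loop gain from the critical point -1
def F_loop(z):
    G = (C @ np.linalg.solve(z * I - A, B)).item()
    comp = (K @ np.linalg.solve(z * I - A + B @ K + L_obs @ C, L_obs)).item()
    return G * comp

theta = np.linspace(1e-3, np.pi, 2000)
print(min(abs(1.0 + F_loop(np.exp(1j * t))) for t in theta))
```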
[1] B. D. Anderson and J. B. Moore, Optimal Control: Linear Quadratic Methods. Englewood Cliffs, NJ: Prentice-Hall, 1989.
[2] J. Doyle and G. Stein, “Robustness with observers,” IEEE Trans. Automatic Control, vol. 24, no. 4, pp. 607--611, Aug. 1979.
[3] J. Doyle, “Guaranteed margins for LQG regulators,” IEEE Trans. Automatic Control, vol. 23, no. 4, pp. 756--757, Aug. 1978.