Allen Lu Advance: Kalman Filter for Engineer

Kalman filter 最初的發展主要在導航應用，之後在很多領域都有新的應用。本文想從不同的角度以及例子了解 Kalman filter, 更直覺 catch Kalman filter physical insight. 在日益重要的 real time 數位信號處理，可以找到廣泛的應用。

一般對於所測量的 data (包含量測的誤差或雜訊)，傳統上就是取 mean (1st order) and variance (2nd order statistics) 如下圖。可以進一步做六個標準差或信心區間的分析。

更進一步分析就是用 LMS (least mean square) curve fit 如下圖。一般用直線 (linear) 或拋物線 (quadratic) 做 least mean square error curve fit. 這是 Gauss 在研究行星軌道所發展的方法。為什麼用 linear or quadratic curve: 一是簡單 (視覺和 math)，二是在局部近似的效果很好 (like Taylor expansion).

Kalman (或是某位俄國數學家)不凡之處就是在於更進一步 (i) 探索 data 的內在結構 (state equation and noise covariance matrix), 以及 (ii) 發展出 recursive equation 來 estimate 內在的結構 (state or sometimes noise).

其實 (i) and (ii) 是一體兩面: Kalman 假設 Markovian state model, 意即目前的 state 只 depends on 前一刻 state, 和更早的 state uncorrelated. 因此在解這個包含所有 time step 的 LMS matrix equation，這個 matrix 是 blocked tri-diagonal. 使用Ricatti recursion 求解時就會是 recursive form. 以上的說法非常 hand waving, 可以 refer 到 MIT 的 video.

(ii) Recursive equation 的好處顯而易見: estimation 不需要儲存所有的 data。事實上，只需要前一刻的必要 data, 加上目前的 measurement 就足夠。這非常適合 real time 的應用 (如導航)。

至於 (i) 探索 data 的內在結構，理想上可以更精確而且有效率的 estimate or even predict data. 實際上遇到的困難是 (a) 這樣的 state equation 是否存在? (b) 如果存在如何找到正確的 model and 參數? and (c) 如果 model or 參數不準確，Kalman filter estimate 的結果是否 robust?

Kalman 考慮的 State equation:

1. X(k)=F X(k-1)+G U(k)+W(k) (covariance Q)

Measurement:

2. Z(k)=H X(k)+V(k) (covariance R)

基本上是 linear, Markovian, H may be partially observable --> Hidden Markovian. 圖示如 Fig. 1.

上述公式看來比起一般 LPF 複雜很多。其實重點是 Kalman filter 是一個 Gaussian pdf estimator. 我們要 estimate 是 Prob(Xk | Y1, Y2, … Yk) 的 mean and variance. mean 就是 X(k|k), variance 就是 P(k|k). 分為兩步: 第一步是 estimate Prob(Xk | Y1, Y2, .. Yk-1), mean and variance are (1) and (2). 再來計算 Kalman gain (4), 以及 Prob(Xk | Y1, Y2, … Yk), mean and variance are (3) and (5).

Note that Kg is NOT a constant, changing based on equation (4). F, H, and G are time invariant matrix most of time. Usually, Kg and P can be divided into two regions: (i) transient region; and (ii) steady-state region.

From Wiki

Because the certainty of the measurements is often difficult to measure precisely, it is common to discuss the filter's behavior in terms of gain. The Kalman gain is a function of the relative certainty of the measurements and current state estimate, and can be "tuned" to achieve particular performance. With a high gain, the filter places more weight on the measurements, and thus follows them more closely. With a low gain, the filter follows the model predictions more closely, smoothing out noise but decreasing the responsiveness. At the extremes, a gain of one causes the filter to ignore the state estimate entirely, while a gain of zero causes the measurements to be ignored.

Clearly from the figure: K=0 ignores measurement z completely.
How to explain K=1 ignore state prediction from the figure?

Hand waving explanation of Kalman filter:

1. State prediction based on equation. This prediction contains noise W.

2. Measurement contains noise V.

3. Trust state or observation? based on the covariance of state prediction; and covariance of observation. If lots of noise in state prediction, lean more to observation; or vice versa. The ratio of covariances determines Kalman gain.

Feedback Explanation

Assuming G=0, F=A, H=C, Lt = Kg, Kalman filter 可以簡化為一個 feedback loop 如下圖。先考慮最簡單的 case:

A is a scaler (1x1). The plant 本身包含一個 integrator with feedback (Kalman low pass filter). The feedback controller Lt (Kg) behaves as a P (Proportion) controller.
The close loop response = Lt Z^-1/[1-(1-Lt)Z^-1] ~ Lt/(Lt+s).
Lt (Kg) 大則 loop gain and loop bandwidth 大。yt 沒有太多的 filtering, trust more on yt (i.e. measurement). (Lt=Kg=1 Kalman filter = Z^-1 no filtering at all)
Lt (Kg) 小則 loop gain and loop bandwidth 小。yt 會被 filtering, trust more on state prediction. (Lt=Kg --> 0 Kalman filter becomes an integrator)

更複雜的 case: A = [1, 1; 1, 0] (as case 3 below), the feedback controller behaves as a PI (Proportion and Integrator) controller. The detailed analysis 可參考前文: Kalman filter and PLL.

Kalman filter 和一般 feedback controller (P/PI/PID) 一個非常重要的差別是 Kg 是由 Eqs (1)-(5) 所控制，Kg 基本是 time varying and depends on the 內在結構 (state and noise covariance matrix), 後面有一些例子 Kg 是如何 time varying. 傳統的 P/PI/PID controller 基本上是 (i) 固定值或是 (ii) 分兩段 (acquisition and tracking), roughly 對應到 Kalman filter transient and steady-state 兩階段，當然只是簡化版。

Handwaving supplement

1. Given measurement z(0), z(1), z(2), .... z(k-1), estimate x(0), x(1), ..., x(k-1), x(k) and update after z(k) is received.

2a. KF is Markovian: X(k) depends on X(k-1), not X(k-2) ... and W(k), V(k). Therefore X(k|k-1, ..1, 0) = X(k|k-1)
2b. KF is recursive: statistics of X(k) (covariance) depends on k-1.
2c. Observable Z = H X usually has less degree of freedom than X; therefore hidden Markov.

3. KF is linear: X(k) = A X(k-1) + B U(k) + W(k)

Matrix Explanation

http://videolectures.net/mit18085f07_strang_lec16/

Some examples

Trivial case 0: (perfect measurement)

X(k)=A X(k-1)+B U(k)+W(k)

Z(k) = X(k) Covariance matrix R = 0

Given Z(0), Z(1), ..., Z(k) without measurement errors

The best estimation of X(k) is simply Z(k)

H Kg(k) = I; if H=I then Kg(k) = 1; P(k|k) = 0

Therefore, Kg=1 means perfect measurement; trust measurements and ignore state prediction completely.

Case 1: (constant state/no state noise, with measurement noise)

X(k) = X(k-1)

Z(k) = X(k) + V(k)

Given Z(0), Z(1), ..., Z(k) and V(k) is i.i.d. with covariance matrix R

The best estimation is: Z(0) + ... + Z(k) / (k+1)

More average reduces the variance of the estimation

P(0|0) = R

P(k|k-1) = P(k-1|k-1) = R / k

P(k|k) = P(k|k-1) * k / (k+1)

Kg(k) = P(k|k-1) / (P(k|k-1) + R) = 1 / (k+1)

P(k|k) = P(0|0) / (k+1) = R / (k+1)

When k becomes large, the state prediction becomes more trustable and the new measurement contains very little new information, Kg --> 0 and P --> 0. 從 feedback loop bandwidth 角度來看，就是愈來愈小。

Let's estimate a constant signal of 1. The noise is normal distributed with a variance of 25. Therefore the SNR is only 1/25=0.04 (-28dB), which is very hard to give a good estimate of the value. The Kalman filter doesn't seem to care much, after a few ten's samples he gives a very good estimate (Fig.1). The Kalman gain, Kg(k) (Fig.2) and estimation error, P(k|k) (Fig.3) are inversely proportional to k as expected. Fig.4 shows the measurement error ,Z(k)-X(k), in green and estimation error, X^(k)-X(k), in red after Kalman filter. Again, Kalman filter is very effective in this case.

Fig. 1

Fig.2

Fig.3

Fig.4

Case 2: (State with noise accumulation like random walk or phase noise, with measurement noise)

X(k) = X(k-1) + W(k)

Z(k) = X(k) + V(k)

Given Z(0), Z(1), ..., Z(k) and W(k), V(k) is i.i.d. with covariance matrix Q, R

X(1) = X(0) + W(1)

X(2) = X(0) + W(1) + W(2)

.. X(k) = X(0) + sum(W(k))

==> When k is large, the state behaves like random walk. Moreover, there are measurement noise on the top of the random walk.

The best estimation is: ?

P(k|k-1)=P(k-1|k-1) +Q

H = I

Kg(k)= P(k|k-1) / (P(k|k-1) + R) = (P(k-1|k-1) + Q ) / (P(k-1|k-1) + R + Q)

P(k|k) = (I-Kg(k)) P(k|k-1) = RP(k|k-1)/(P(k|k-1)+R)

=> 1/P(k|k) = 1/P(k|k-1) + 1/R (new sample is useful to reduce variance!)

1/P(k|k) = 1 / (P(k-1|k-1) + Q) + 1/R

P(k|k) = (P(k-1|k-1)+Q) R / ( P(k-1|k-1) + Q + R)

P(0|0)

P(1|0) = P(0|0) + Q

P(1|1) = (P(0|0)+Q)R / (P(0|0) + Q + R)

In the steady state, P(k|k) --> P = 1/2(sqrt(Q^2+4QR)-Q) ; Kg converges to (P+Q)/(P+R+Q)

If R is small, P --> R

Kg and P have no close forms. Kg and P are monotonic descending and converging to a constant (steady-state). 從 loop bandwidth 角度來看，就是變小最後收歛到一個固定值。這個固定值 depends on Q and R.

Fig. 5 shows the random walk of state with variance 0.1 (in blue), with measurement noise of variance 25 as case 1 (in green), and after Kalman filter (in red). Both Kg and P converge to steady state quickly shown in Fig. 6 and Fig. 7. Fig. 8 shows the measurement error ,Z(k)-X(k), in green and estimation error, X^(k)-X(k), in red after Kalman filter. Apparently, the estimate error is no longer zero as in the Case 1 even with a small perturbation state noise of 0.1. P converges to 1.53; Kg converges to 0.061.

Fig.5

Fig.6

Fig.7

Fig.8

Even though there is no close forms of Kg and P (close form only for steady-state), a simple resistor ladder illustration is useful. Po is the end resistor, serial with Q, and parallel with R, and so on.

If Q = 0; P(k) = R/k // P(0)
If R -->0; P(k) --> 0

Case 3: (position and velocity estimation, no force with noise)

p''(t) = 0 (no acceleration) => [p, p']' = [0 1; 0 0][p, p'] = A [p,p']

Transition matrix = exp(A dt) for discrete time
exp(A dt) = I + A dt + A^2/2! + A^3/3! + ... A^2=A^3=..=0
exp(A dt) = [1 dt; 0 1]

To verify the equation, assuming no noise:

p(k+1) = p(k) + p'(k) dt

p'(k+1) = p'(k) ---> constant velocity

H = [1 0] only observe position

z(k) = p(k) --> observable

這一點非常重要! 不需要事先知道 velocity (no prior knowledge of velocity), 就可以從 position measurement estimate 出來!

Case 4: (phase and frequency estimation, no cycle slip)

Case 4a: (phase and frequency estimation, with cycle slip, mod)

Case 5: Oscillation etimation
Case 6: Nonlinear estimation (sign estimation)

2. Kalman filter is a special case of Baysian recursive filter!

Some applications

DCOC (dc offset cancellation)

Baseline wander

sine wave estimation (extended Kalman filter)
multi-bit SAR (successive approximation ADC)

Allen Lu Advance

2013年11月17日星期日