Kalman filter 最初的發展主要在導航應用,之後在很多領域都有新的應用。本文想從不同的角度以及例子了解 Kalman filter, 更直覺 catch Kalman filter physical insight. 在日益重要的 real time 數位信號處理,可以找到廣泛的應用。
一般對於所測量的 data (包含量測的誤差或雜訊),傳統上就是取 mean (1st order) and variance (2nd order statistics) 如下圖。可以進一步做六個標準差或信心區間的 分析。
更進一步分析就是用 LMS (least mean square) curve fit 如下圖。一般用直線 (linear) 或拋物線 (quadratic) 做 least mean square error curve fit. 這是 Gauss 在研究行星軌道所發展的方法。為什麼用 linear or quadratic curve: 一是簡單 (視覺和 math),二是在局部近似的效果很好 (like Taylor expansion).
Kalman (或是某位俄國數學家)不凡之處就是在於更進一步 (i) 探索 data 的內在結構 (state equation and noise covariance matrix), 以及 (ii) 發展出 recursive equation 來 estimate 內在的結構 (state or sometimes noise).
其實 (i) and (ii) 是一體兩面: Kalman 假設 Markovian state model, 意即目前的 state 只 depends on 前一刻 state, 和更早的 state uncorrelated. 因此在解這個包含所有 time step 的 LMS matrix equation,這個 matrix 是 blocked tri-diagonal. 使用Ricatti recursion 求解時就會是 recursive form. 以上的說法非常 hand waving, 可以 refer 到 MIT 的 video.
(ii) Recursive equation 的好處顯而易見: estimation 不需要儲存所有的 data。事實上,只需要前一刻的必要 data, 加上目前的 measurement 就足夠。這非常適合 real time 的應用 (如導航)。
至於 (i) 探索 data 的內在結構,理想上可以更精確而且有效率的 estimate or even predict data. 實際上遇到的困難是 (a) 這樣的 state equation 是否存在? (b) 如果存在如何找到正確的 model and 參數? and (c) 如果 model or 參數不準確,Kalman filter estimate 的結果是否 robust?
Kalman 考慮的 State equation:
1. X(k)=F X(k-1)+G U(k)+W(k) (covariance Q)
Measurement:
2. Z(k)=H X(k)+V(k) (covariance R)
基本上是 linear, Markovian, H may be partially observable --> Hidden Markovian. 圖示如 Fig. 1.
Kalman Filter Implementation:
X(k|k-1)=F X(k-1|k-1)+G U(k) ……….. (1)
P(k|k-1)=F P(k-1|k-1) F'+Q ……… (2)
X(k|k)= X(k|k-1)+Kg(k) (Z(k)-H X(k|k-1)) ……… (3)
Kg(k)= P(k|k-1) H' / (H P(k|k-1) H' + R) ……… (4)
P(k|k)=(I-Kg(k) H)P(k|k-1) ……… (5)
Note that Kg is NOT a constant, changing based on equation (4). F, H, and G are time invariant matrix most of time. Usually, Kg and P can be divided into two regions: (i) transient region; and (ii) steady-state region.
Because the certainty of the measurements is often difficult to measure precisely, it is common to discuss the filter's behavior in terms of gain. The Kalman gain is a function of the relative certainty of the measurements and current state estimate, and can be "tuned" to achieve particular performance. With a high gain, the filter places more weight on the measurements, and thus follows them more closely. With a low gain, the filter follows the model predictions more closely, smoothing out noise but decreasing the responsiveness. At the extremes, a gain of one causes the filter to ignore the state estimate entirely, while a gain of zero causes the measurements to be ignored.
- Clearly from the figure: K=0 ignores measurement z completely.
- How to explain K=1 ignore state prediction from the figure?
1. State prediction based on equation. This prediction contains noise W.
Feedback Explanation
Assuming G=0, F=A, H=C, Lt = Kg, Kalman filter 可以簡化為一個 feedback loop 如下圖。先考慮最簡單的 case:
A is a scaler (1x1). The plant 本身包含一個 integrator with feedback (Kalman low pass filter). The feedback controller Lt (Kg) behaves as a P (Proportion) controller.
The close loop response = Lt Z^-1/[1-(1-Lt)Z^-1] ~ Lt/(Lt+s).
Lt (Kg) 大則 loop gain and loop bandwidth 大。yt 沒有太多的 filtering, trust more on yt (i.e. measurement). (Lt=Kg=1 Kalman filter = Z^-1 no filtering at all)
Lt (Kg) 小則 loop gain and loop bandwidth 小。yt 會被 filtering, trust more on state prediction. (Lt=Kg --> 0 Kalman filter becomes an integrator)
更複雜的 case: A = [1, 1; 1, 0] (as case 3 below), the feedback controller behaves as a PI (Proportion and Integrator) controller. The detailed analysis 可參考前文: Kalman filter and PLL.
Handwaving supplement
2a. KF is Markovian: X(k) depends on X(k-1), not X(k-2) ... and W(k), V(k). Therefore X(k|k-1, ..1, 0) = X(k|k-1)
2b. KF is recursive: statistics of X(k) (covariance) depends on k-1.
2c. Observable Z = H X usually has less degree of freedom than X; therefore hidden Markov.
3. KF is linear: X(k) = A X(k-1) + B U(k) + W(k)
Matrix Explanation
http://videolectures.net/mit18085f07_strang_lec16/
Some examples
Let's estimate a constant signal of 1. The noise is normal distributed with a variance of 25. Therefore the SNR is only 1/25=0.04 (-28dB), which is very hard to give a good estimate of the value. The Kalman filter doesn't seem to care much, after a few ten's samples he gives a very good estimate (Fig.1). The Kalman gain, Kg(k) (Fig.2) and estimation error, P(k|k) (Fig.3) are inversely proportional to k as expected. Fig.4 shows the measurement error ,Z(k)-X(k), in green and estimation error, X^(k)-X(k), in red after Kalman filter. Again, Kalman filter is very effective in this case.
|
Fig. 1 |
Fig.2 |
Fig.3 |
Fig.4 |
Case 2: (State with noise accumulation like random walk or phase noise, with measurement noise)
X(k) = X(k-1) + W(k)
Z(k) = X(k) + V(k)
Given Z(0), Z(1), ..., Z(k) and W(k), V(k) is i.i.d. with covariance matrix Q, R
X(1) = X(0) + W(1)
X(2) = X(0) + W(1) + W(2)
.. X(k) = X(0) + sum(W(k))
==> When k is large, the state behaves like random walk. Moreover, there are measurement noise on the top of the random walk.
The best estimation is: ?
P(k|k-1)=P(k-1|k-1) +Q
H = I
Kg(k)= P(k|k-1) / (P(k|k-1) + R) = (P(k-1|k-1) + Q ) / (P(k-1|k-1) + R + Q)
P(k|k) = (I-Kg(k)) P(k|k-1) = RP(k|k-1)/(P(k|k-1)+R)
=> 1/P(k|k) = 1/P(k|k-1) + 1/R (new sample is useful to reduce variance!)
1/P(k|k) = 1 / (P(k-1|k-1) + Q) + 1/R
P(k|k) = (P(k-1|k-1)+Q) R / ( P(k-1|k-1) + Q + R)
P(0|0)
P(1|0) = P(0|0) + Q
P(1|1) = (P(0|0)+Q)R / (P(0|0) + Q + R)
In the steady state, P(k|k) --> P = 1/2(sqrt(Q^2+4QR)-Q) ; Kg converges to (P+Q)/(P+R+Q)
If R is small, P --> R
Kg and P have no close forms. Kg and P are monotonic descending and converging to a constant (steady-state). 從 loop bandwidth 角度來看,就是變小最後收歛到一個固定值。這個固定值 depends on Q and R.
Fig. 5 shows the random walk of state with variance 0.1 (in blue), with measurement noise of variance 25 as case 1 (in green), and after Kalman filter (in red). Both Kg and P converge to steady state quickly shown in Fig. 6 and Fig. 7. Fig. 8 shows the measurement error ,Z(k)-X(k), in green and estimation error, X^(k)-X(k), in red after Kalman filter. Apparently, the estimate error is no longer zero as in the Case 1 even with a small perturbation state noise of 0.1. P converges to 1.53; Kg converges to 0.061.
Fig.5 |
Fig.6 |
Fig.7 |
Fig.8 |
Even though there is no close forms of Kg and P (close form only for steady-state), a simple resistor ladder illustration is useful. Po is the end resistor, serial with Q, and parallel with R, and so on.
If Q = 0; P(k) = R/k // P(0)
If R -->0; P(k) --> 0
Case 3: (position and velocity estimation, no force with noise)
p''(t) = 0 (no acceleration) => [p, p']' = [0 1; 0 0][p, p'] = A [p,p']
Transition matrix = exp(A dt) for discrete time
exp(A dt) = I + A dt + A^2/2! + A^3/3! + ... A^2=A^3=..=0
exp(A dt) = [1 dt; 0 1]
To verify the equation, assuming no noise:
p(k+1) = p(k) + p'(k) dt
p'(k+1) = p'(k) ---> constant velocity
H = [1 0] only observe position
z(k) = p(k) --> observable
這一點非常重要! 不需要事先知道 velocity (no prior knowledge of velocity), 就可以從 position measurement estimate 出來!
Case 4: (phase and frequency estimation, no cycle slip)
Case 4a: (phase and frequency estimation, with cycle slip, mod)
Case 6: Nonlinear estimation (sign estimation)
multi-bit SAR (successive approximation ADC)