Friday, January 17, 2014

Differences Between the MMSE Estimator and the ML Estimator


The ML estimator is the Maximum Likelihood estimator ("most likely").

The MMSE estimator is the Minimum Mean Square Error estimator.

Strictly speaking, ML estimation is classical estimation, while MMSE is Bayesian estimation; see the separate post. This article compares the two mainly from an application standpoint.

These two estimators come up frequently in practice. What exactly is the difference, and when to apply what?
ML examples: the matched filter (maximizes decision SNR), digital communication detection (minimum BER, Bit Error Rate), etc. ML is generally the best choice when there is no prior information.

MMSE examples: the Wiener filter (minimizes error, maximizes total SNR), the Kalman filter (minimizes error), etc. MMSE = min E(|x-x*|^2), literally minimum mean square error. The term is usually reserved for Bayesian estimation with a quadratic cost function and prior information.

Simply put, the ML estimator mainly estimates parameters based on what is most likely to have produced the observed samples (the parameter itself carries no prior information).

The MMSE estimator, in contrast, mainly estimates waveforms (Wiener filter) or sequences (Kalman filter) from samples by minimizing a cost function (the mean square error).

Of course, this is only a rough distinction. There are also sequence ML estimators (the Viterbi decoding algorithm is a famous example). Conversely, MMSE is often used to estimate parameters as well; see the earlier post "Kalman filter: MMSE estimator".

A Simple Example (Analog vs. Digital Communication)

Analog communication such as AM or FM is corrupted by noise at the receiver. The ideal analog receiver is an MMSE filter (e.g., a Wiener filter) that restores the original analog signal as faithfully as possible. In practice, for cost reasons, a simple LPF is usually used. The ML estimator has no role to play here.


Digital communication such as PSK or FSK is also corrupted by noise at the receiver, but the ideal receiver does not try to restore the original waveform as faithfully as possible. As the figure below shows, the key is to maximize the SNR at the decision moment, so that the receiver can reliably decide whether a 0 or a 1 was received. Waveform distortion does not matter, because digital communication does not care about the waveform. The ideal digital receiver is an ML estimator (the matched filter) that reliably decides whether a 0 or a 1 was received (or sent). The MMSE estimator and the ML estimator are clearly different in this example.

image
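To make the matched-filter point concrete, here is a minimal numpy sketch (the rectangular pulse shape, noise level, and all numbers are assumptions for illustration): correlating the received samples against the known pulse concentrates the signal energy into one decision statistic, maximizing the SNR at the decision moment.

import numpy as np

rng = np.random.default_rng(0)

pulse = np.ones(16) / np.sqrt(16)     # unit-energy rectangular pulse (assumed shape)
bit = 1                               # transmit +1 for bit 1, -1 for bit 0
noise_std = 0.5

rx = (2 * bit - 1) * pulse + noise_std * rng.standard_normal(pulse.size)

# Matched filter: correlate the received samples with the known pulse.
# In AWGN this maximizes the decision-moment SNR and is the ML detector.
decision_stat = np.dot(rx, pulse)
detected_bit = int(decision_stat > 0)

# Per-sample SNR is poor, but correlating over the 16 samples buys a
# factor-of-16 SNR gain at the decision moment.
print("decision statistic:", decision_stat, "-> detected bit:", detected_bit)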

A Counterexample (MMSE and ML estimators give the same result): Linear Regression

==> The example below is really not MMSE but a least squares cost function, i.e., least squares estimation (LSE). MMSE and least squares optimization are easily confused, and the derivation below is misleading in that respect.

The difference between least squares estimation (LSE) and ML estimation is discussed in this post.

In linear regression estimation, the goal is to estimate b_0 and b_1 in y = b_0 + b_1 x.

image

Note that this estimates the parameters b_0 and b_1; it does not estimate a waveform as the Kalman filter does.

image

First define the square error function as follows:

image 

image

image

 

The ML approach first defines the likelihood function as follows:

image

Then find the theta that is most likely given the X_i, i.e., find the maximum of the likelihood function.

image

More commonly one works with the log-likelihood function: take the log of the likelihood function and then differentiate.

 

Assuming the ε_i are i.i.d. Gaussian:

image

How do we maximize the likelihood function? It turns out to be equivalent to minimizing

image

 

The following conclusion is wrong:

In this case, MMSE and ML estimation are the same!!

So what is the difference between MMSE and ML in parameter estimation, or do they always give the same result? In the example above, the MMSE estimator has nothing to do with the pdf of the error ε_i, but the ML estimator clearly depends on the pdf.

For example, if ε_i is not Gaussian but Laplacian (two-sided exponential), maximizing the likelihood function is no longer equivalent to minimizing the squared error; instead, it becomes equivalent to minimizing the mean absolute error (no square).

In general, the MMSE estimator ignores the error pdf and only cares about the 1st- and 2nd-order statistics (mean and variance). In other words, the error pdf is effectively assumed Gaussian, in which case the MMSE estimator is the optimal estimator and achieves the Cramer-Rao lower bound (CRLB).

Conversely, the ML estimator optimizes against the actual error pdf (prior information). In theory, should it then beat MMSE? If the error pdf is Gaussian, MMSE and ML give the same result.

The correct conclusions are:

1. The "MMSE" above should be called least squares estimation (LSE); there is no notion of a mean over a prior pdf.

2. The ML estimator depends on the pdf (called the likelihood function), but the parameter itself carries no prior information.

3. Depending on the likelihood function, ML minimizes a different function. For a Gaussian likelihood function it is the least squares cost; other likelihood functions lead to different functions to minimize.
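A quick numerical check of conclusion 3 (all numbers are made up for illustration): the least squares fit of b_0, b_1 is also the maximizer of the Gaussian log-likelihood, because the Gaussian negative log-likelihood is, up to constants, the least squares cost.

import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0, 10, 50)
y = 2.0 + 0.5 * x + rng.standard_normal(x.size)    # true b0=2.0, b1=0.5 (assumed)

# Least squares: minimize sum of (y - b0 - b1 x)^2
A = np.column_stack([np.ones_like(x), x])
b_lse, *_ = np.linalg.lstsq(A, y, rcond=None)

# Gaussian negative log-likelihood with fixed sigma is 0.5*sum(residual^2)
# plus constants, so its minimizer is exactly the least squares solution.
def nll(b0, b1):
    r = y - (b0 + b1 * x)
    return 0.5 * np.sum(r ** 2)

print("LSE b0, b1:", b_lse)
print("NLL at LSE beats a perturbed point:",
      nll(*b_lse) < nll(b_lse[0] + 0.1, b_lse[1]))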

 

Another Example (Sine Wave Estimation)

Sine wave estimation comes in two flavors: (i) parameter estimation (amplitude, frequency, and phase are assumed fixed); (ii) waveform estimation (amplitude, frequency, and phase can be time varying).

For (ii) waveform estimation, clearly only the MMSE estimator is suitable. For (i) parameter estimation, both ML and MMSE can be used.

Sine Wave Parameter Estimation

This example comes up all the time in digital signal processing: estimate the frequency and phase of a sine wave that is corrupted by additive Gaussian noise. The problem has many variants: IQ imbalance of the sine wave, sine wave carrier synchronization, a sine wave containing harmonics, estimation of harmonic levels, etc.

There are basically two approaches: (i) time domain or (ii) frequency domain.

Time Domain Linear Method

The time domain method is mainly based on linear regression estimation. It gives accurate results at moderate to high SNRs.

Tretter suggests the following method:

image

Using eq. (3) above, estimate w0 and theta from x_t, t = 0, 1, ..., N-1. It can be shown that the ML estimator and the MMSE estimator are equivalent here. w0 and theta are obtained from the equations in the linear regression example.

There are two problems: (i) eq. (2) is only an approximate model and is accurate only at moderate to high SNRs; (ii) eq. (3) requires phase unwrapping, which can be error prone.

One variation of sine wave estimation is coherent sampling, i.e., the sampling frequency is an integer multiple of the sine wave frequency. In this case the sine wave frequency is known and the only parameter to estimate is the phase. The phase estimation error then decreases inversely with N^2.

The following is the math of the problem statement, the CRLB, and Tretter's model.

image 

image

 

image

 

The following is Tretter's model.

image

 

 

 

Tretter's algorithm involves an unwrap(arctan) operation and achieves the CRLB at high SNR. It can be used for IQ imbalance estimation without prior knowledge of the frequency (the IQ phase is an arctan function).

In summary, Tretter's algorithm:

Pros: (i) no FFT, lower complexity; (ii) sequence estimation, low latency.

Cons: (i) phase unwrapping adds complexity and may fail at low SNR.

 

Kay improves the algorithm by using the differenced phase of two adjacent samples. 

image

It can be shown that Tretter's and Kay's algorithms are equivalent via Itoh's phase unwrapping algorithm.
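A minimal sketch of the phase-difference idea (the window below is a normalized parabola similar in spirit to Kay's weights, not his exact coefficients; the frequency, phase, and SNR are assumed numbers). Differencing the phase of adjacent samples avoids an explicit unwrap step as long as the frequency stays below half the sample rate.

import numpy as np

rng = np.random.default_rng(2)
N = 256
w0 = 0.12 * 2 * np.pi          # true angular frequency per sample (assumed)
theta = 0.7                    # true phase (assumed)
snr_lin = 100.0                # ~20 dB

n = np.arange(N)
noise = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2 * snr_lin)
z = np.exp(1j * (w0 * n + theta)) + noise

# Differenced phase of adjacent samples: arg(z[t+1] conj(z[t])) stays in
# (-pi, pi], so no unwrapping is needed for w0 < pi.
dphi = np.angle(z[1:] * np.conj(z[:-1]))

# Parabolic window (normalized) that de-emphasizes the noisier edges;
# the plain unweighted mean of dphi also works, just less accurately.
t = np.arange(N - 1)
w = 1.0 - ((t - (N - 2) / 2) / (N / 2)) ** 2
w /= w.sum()

w0_hat = np.sum(w * dphi)
print("true w0:", w0, " estimated:", w0_hat)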

 

 

Time Domain Non-Linear Method

The zero-crossing method; not surprising, since arctan is a nonlinear function: we can avoid the arctan by using zero crossings.

Can it generalize to non-sine waveforms, e.g., square waves or pulse-like waveforms? It may be useful for asynchronous eye diagram analysis???

Frequency Domain

D.C. Rife

 

Waveform example:

MMSE is used because the goal is to closely resemble the original waveform!

The waveform is NOT required to be a fixed sine wave; some parameters may be time varying.

An example is FM demodulation. How about zero-crossing MMSE for FM demodulation?

 

A sin(wt + theta)

For MMSE estimator

For ML estimator, estimate A, w, or theta

Sunday, January 5, 2014

Kalman Filter: MMSE Estimator

It is often said that the Kalman filter is the optimal estimator (in a linear dynamic model). How do we explain this? Or, more fundamentally, what is the theoretical basis of the Kalman filter? What is the essential difference between the Kalman filter and the commonly used LPFs (Butterworth, Chebyshev filters, etc.)?

Differences between the Kalman filter and common LPFs

Common signal processing filters (FIR, IIR, Butterworth, Chebyshev) are based on the Laplace or Fourier transform: (i) they are linear time invariant (LTI) filters; (ii) the filter characteristics (e.g., bandwidth) are basically independent of the input signal.

To compute the output signal, we usually first derive (or define) the filter's transfer function and impulse response, then obtain the output by frequency domain (s/z-domain) multiplication or time domain convolution.

The Kalman filter, on the other hand, starts from the estimation of stochastic signals (random sequences): (i) the statistics of the input signal and the associated noise determine the Kalman filter's characteristics; (ii) the Kalman filter is fundamentally a time varying filter (in many cases it converges to a steady-state filter, approximating a time invariant filter). Both points are completely different from a common LPF.

Can the Kalman filter be compared to a common LPF?

As mentioned in the earlier post (Kalman Filter for Engineer), the Kalman filter can be viewed as a feedback controller with a time varying Kalman gain (Kg). It can therefore be viewed as a time varying IIR filter whose bandwidth changes over time.

In steady state (if one exists), Kg is approximately constant, so the filter approaches a time invariant filter. However, Kg is determined by the input signal and the associated noise, which still differs from a generic LPF.

Differences between the Kalman filter and the Wiener filter

See the Binghamton slides for reference.

The Wiener filter assumes a WSS (wide sense stationary) signal + noise.

The Kalman filter assumes a dynamic model + observation noise.

The Wiener filter is a time invariant filter, but its characteristics are determined by the input signal and noise, which is similar to the Kalman filter.

If we consider only the behavior after much time has elapsed: given the IIR Wiener case and the steady-state Kalman filter, then Kalman = Wiener.

Step 1: Random Variable MMSE Estimator

Start from the simplest single-variable case. The MMSE (Minimum Mean-Square Error) estimator of X finds x* to minimize E(|X-x*|^2). It is easy to show: x* = E(X), MSE = E(|X-E(X)|^2) = σ^2. Any other estimate E(X)+a has MSE = σ^2 + a^2 > σ^2.

With two variables X and Y, the MMSE (Minimum Mean-Squared Error) estimator:
Let y be observed. Find x*, an estimate of the unobserved x, to minimize E(|X-x*|^2). Hint: E(|X-x*|^2) = E_Y E_X(|X-x*|^2 | y).

Clearly, if X and Y are independent or uncorrelated, whether y is observed or not does not help the estimation of X at all:

x* = E(X) and MSE = σ_x^2.

If X and Y are correlated, observing y does help the estimation of X. It can be shown that x* = E(X|Y=y) is the MMSE estimator, and its MSE is smaller than without observing y. Intuitively this makes sense: the extra information in y helps reduce the estimation error of X.

Note that x* = E(X|Y=y) does not assume any statistics of X and Y (such as Gaussian)! This is the foundation of the Bayesian estimator.

X, Y Gaussian

Next, assume X and Y are Gaussian to illustrate further:

[x, y] ~ N( [ E(X), E(Y)], [σ_x^2, σ_xy; σ_xy, σ_y^2])

x* = E(X|Y=y) = E(X) + σ_xy (y-E(y)) / σ_y^2

MSE = σ_x^2 – |σ_xy|^2/σ_y^2  < σ_x^2

(i) x* = E(X|Y=y) = a + b y is a linear function of y

(ii) MSE is reduced with additional observable y

(iii) If X, Y are uncorrelated, σ_xy = 0.  x* = E(X); MSE = σ_x^2 as expected
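A numerical sanity check of these formulas (the means, variances, and covariance below are arbitrary assumptions): draw jointly Gaussian (X, Y) samples, apply x* = E(X) + σ_xy (y - E(Y)) / σ_y^2, and compare the empirical MSE with σ_x^2 - σ_xy^2/σ_y^2.

import numpy as np

rng = np.random.default_rng(3)

mx, my = 1.0, -2.0
sx2, sy2, sxy = 4.0, 9.0, 3.0          # var(X), var(Y), cov(X, Y) (assumed)
cov = np.array([[sx2, sxy], [sxy, sy2]])

samples = rng.multivariate_normal([mx, my], cov, size=200000)
X, Y = samples[:, 0], samples[:, 1]

# MMSE estimator for the Gaussian case: x* = E(X) + sxy (y - E(Y)) / sy2
x_star = mx + sxy * (Y - my) / sy2

print("empirical MSE:", np.mean((X - x_star) ** 2))
print("theoretical  :", sx2 - sxy ** 2 / sy2)   # sigma_x^2 - |sigma_xy|^2/sigma_y^2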

X, Y Non-Gaussian

When X and Y are not Gaussian, obtaining x* = E(X|Y=y) becomes harder. The Gaussian conclusions above need revision:

(i) x* = E(X|Y=y) = g(y) is in general NOT a linear function of y. However, it can be shown that among all linear functions (a + b y), the formula obtained in the Gaussian case:

x* = E(X) + σ_xy (y-E(y)) / σ_y^2

has the smallest MSE; it is called the LMMSE (Linear Minimum Mean Square Error) estimator. It can also be shown that the LMMSE estimator is indeed a minimum (at least a local minimum) and unbiased.

However, some nonlinear x* = g(y) may achieve an even smaller MSE (the global minimum); some examples are here.

(ii) and (iii) still hold.

Step 2: MMSE with linear measurement

Consider the following special case (Y is a linear transformation of X plus Gaussian noise):

y = H x + v,  x~N(E(X), σ_x^2),  v~N(E(V), σ_v^2)

x* = EX + K (y-EY)   where EY = H EX + EV,

K = σ_x^2 H' (H σ_x^2 H' + σ_v^2)^-1, from the equations of the last section. K is in fact the Kalman gain.

MSE = σ_x^2 – K H σ_x^2 = σ_x^2 (1 – K H). Therefore, the MSE is reduced because y is observed.

The above equations form the basis of the Kalman filter.
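The same check for the scalar linear-measurement case (H, the means, and the variances are assumed numbers): the gain K below is exactly the Kalman gain of the text, and the empirical MSE matches σ_x^2 (1 - K H).

import numpy as np

rng = np.random.default_rng(4)

H, Ex, Ev = 2.0, 1.0, 0.0              # assumed scalar model y = H x + v
sx2, sv2 = 1.0, 0.5

x = Ex + np.sqrt(sx2) * rng.standard_normal(100000)
v = Ev + np.sqrt(sv2) * rng.standard_normal(100000)
y = H * x + v

K = sx2 * H / (H * sx2 * H + sv2)      # the Kalman gain of the text
x_star = Ex + K * (y - (H * Ex + Ev))

print("empirical MSE:", np.mean((x - x_star) ** 2))
print("theoretical  :", sx2 * (1 - K * H))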

 

Step 3: MMSE with sequences of linear measurements

y = H x + v,  given y1, y2, y3, ..., yk

x*(k) = E(X | y1, y2, y3, ..., yk)

-> x*(1) = E(X|y1) = EX + K(1) (y1 – EY),  where EY = H EX + EV

K(1) = σ_x^2 H' (H σ_x^2 H' + σ_v^2)^-1

MSE(1) = P(1) = (1 – K(1) H) σ_x^2

-> x*(2) = E(X | y1, y2) = x*(1) + K(2) (y2 – EY(2)),  where EY(2) = H x*(1) + EV

K(2) = P(1) H' (H P(1) H' + σ_v^2)^-1

MSE(2) = P(2) = (1 – K(2) H) P(1)

-> x*(3) = x*(2) + K(3) (y3 – EY(3)),  and so on.

These formulas are essentially the prototype of the Kalman filter. Adding the state dynamic model x(k+1) = A x(k) + W turns it into the Kalman filter (we skip that step here). Conversely, when the Kalman filter's state equation degenerates to x(k+1) = x(k), it reduces to this sequential MMSE estimator.

A big difference between sequential MMSE estimation and the Kalman filter is the absence of W (state process noise). With more and more measurements, the MSE keeps decreasing and eventually approaches 0; there is no steady state.

The Kalman filter does have W (state process noise). In the prediction stage the MSE increases; only in the update stage does it decrease. The Kalman filter MSE therefore converges to a nonzero steady state, as illustrated below.

image
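A scalar sketch of this contrast (the Q and R values are assumptions): iterating the covariance recursion with Q = 0 drives P toward 0, while Q > 0 settles at a nonzero steady state.

import numpy as np

def covariance_history(Q, R, steps=50, P0=1.0):
    # Scalar covariance recursion for x(k+1) = x(k) + W, z(k) = x(k) + V
    P, hist = P0, []
    for _ in range(steps):
        P = P + Q                  # prediction: process noise inflates P
        K = P / (P + R)            # Kalman gain
        P = (1 - K) * P            # update: the measurement shrinks P
        hist.append(P)
    return hist

print("Q = 0  :", covariance_history(Q=0.0, R=1.0)[-1])   # keeps shrinking toward 0
print("Q = 0.1:", covariance_history(Q=0.1, R=1.0)[-1])   # nonzero steady state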

 

MMSE with nonlinear measurement

y = f(x) + v

The intuitive approach is to use a Taylor expansion to approximate f(x). However, there are two problems:

(i) In general, y is no longer Gaussian, so the formulas above do not apply.

(ii) Some common nonlinear functions f(x) introduce bias that disturbs the estimation.

Besides Taylor expansion, another approach is to approximate the pdf of f(x) by a Gaussian, for example with the unscented transform (UT). To be covered later.

Sunday, December 29, 2013

Kalman Filter Example: Water Tank

The following water tank example is from Cornell.


Case 1: Constant water level with some measurement noise.
If we use the following Kalman filter with q=0.0001 and r=0.1, we get the result below, which matches the true water level.

Case 2: Suppose the actual water level rises over time but we keep using the Kalman filter above, i.e., there is model error. What happens?
The Kalman filter output error grows and diverges, because the Kalman filter model does not match reality. Two ways to fix it: (1) increase q, giving the state prediction more freedom; (2) use a 2D model that includes velocity.

First try (1), with q=0.01 and r=0.1:
Improved, but still not enough. Try q=r=0.1: the filter can keep up, but (a) the noise increases, and (b) q requires trial and error, which is not practical.
With q=1 and r=0.1, the output basically just follows the measurement (Kg~1).


The proper fix is a 2D model that includes velocity:

With qf = 0.00001 and r=0.1, note that the Kalman filter automatically converges to the correct velocity, regardless of the initial condition.

qf = 0.00001 and r=0.1 for a constant tank level (velocity = 0).
Case 3: What if the water level oscillates and we use the 2D model with position and velocity?
1. The estimated state is smoother but lags behind the true value. This is similar to the lag produced earlier by the constant-level model on a ramp: it indicates a model problem.
2. The amplitude of the estimated state gets smaller and smaller.
This is because the model slowly converges to what it thinks is the truth: a constant level, which is what it settles to over time.

An extended Kalman filter is needed to solve the problem above.
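A minimal 1-state simulation of Cases 1 and 2 (q and r are the values quoted above; the ramp rate and sample counts are assumptions): the constant-level model tracks a constant level well but lags a ramping level, exactly the model-error symptom described.

import numpy as np

def kalman_1d(z, q, r, x0=0.0, p0=1.0):
    # 1-state Kalman filter, model x(k) = x(k-1) + w, z(k) = x(k) + v
    x, p, out = x0, p0, []
    for zk in z:
        p = p + q                  # predict
        k = p / (p + r)            # Kalman gain
        x = x + k * (zk - x)       # update with the measurement
        p = (1 - k) * p
        out.append(x)
    return np.array(out)

rng = np.random.default_rng(5)
n = np.arange(200)

# Case 1: constant level, q=0.0001, r=0.1 -> tracks well
z1 = 1.0 + np.sqrt(0.1) * rng.standard_normal(n.size)
est1 = kalman_1d(z1, q=0.0001, r=0.1)

# Case 2: ramping level with the same (wrong) constant model -> growing lag
ramp = 1.0 + 0.01 * n              # assumed fill rate
z2 = ramp + np.sqrt(0.1) * rng.standard_normal(n.size)
est2 = kalman_1d(z2, q=0.0001, r=0.1)

print("case 1 final error:", est1[-1] - 1.0)
print("case 2 final error:", est2[-1] - ramp[-1])   # large: model mismatch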

Sunday, November 17, 2013

Kalman Filter for Engineer

The Kalman filter was originally developed mainly for navigation and has since found new applications in many fields. This post looks at the Kalman filter from different angles and through examples, to catch the physical insight more intuitively. It has wide application in the increasingly important area of real-time digital signal processing.

Traditionally, for measured data (which includes measurement error or noise), one simply takes the mean (1st order) and the variance (2nd order statistics), as in the figure below. One can go further with six-sigma or confidence-interval analysis.

image

A further step is least-squares curve fitting, as in the figure below. Usually a straight line (linear) or a parabola (quadratic) is used for the least square error curve fit. This is the method Gauss developed while studying planetary orbits. Why a linear or quadratic curve? First, it is simple (both visually and mathematically); second, it approximates very well locally (like a Taylor expansion).

 

 

What is remarkable about Kalman (or, by some accounts, a Russian mathematician) is going one step further: (i) exploring the internal structure of the data (the state equation and noise covariance matrices), and (ii) developing recursive equations to estimate that internal structure (the state, or sometimes the noise).

In fact (i) and (ii) are two sides of the same coin. Kalman assumes a Markovian state model: the current state depends only on the previous state and is uncorrelated with earlier states. When solving the least-squares matrix equation that includes all time steps, the matrix is block tri-diagonal, and solving it with the Riccati recursion naturally yields a recursive form. This account is very hand waving; see the MIT video.

(ii) The benefit of a recursive equation is obvious: estimation does not require storing all the data. In fact, only the necessary data from the previous instant plus the current measurement suffice. This is ideal for real-time applications (such as navigation).

As for (i), exploring the internal structure of the data ideally allows more accurate and more efficient estimation, or even prediction. The practical difficulties are: (a) does such a state equation exist? (b) if it exists, how do we find the correct model and parameters? and (c) if the model or parameters are inaccurate, is the Kalman filter estimate still robust?

The state equation Kalman considered:

1. X(k)=F X(k-1)+G U(k)+W(k)   (covariance Q)

Measurement:

2. Z(k)=H X(k)+V(k)   (covariance R)

It is basically linear and Markovian; H may be only partially observable --> hidden Markov. See Fig. 1.

image (Fig. 1)
Kalman Filter Implementation:

X(k|k-1)=F X(k-1|k-1)+G U(k) ……….. (1)
P(k|k-1)=F P(k-1|k-1) F'+Q ……… (2) 
X(k|k)= X(k|k-1)+Kg(k) (Z(k)-H X(k|k-1)) ……… (3) 
Kg(k)= P(k|k-1) H' / (H P(k|k-1) H' + R) ……… (4) 
P(k|k)=(I-Kg(k) H)P(k|k-1) ……… (5)

These equations look much more complicated than a generic LPF. The key point is that the Kalman filter is a Gaussian pdf estimator: we estimate the mean and variance of Prob(Xk | Y1, Y2, ..., Yk). The mean is X(k|k) and the variance is P(k|k). There are two steps. First estimate Prob(Xk | Y1, Y2, ..., Yk-1); its mean and variance are given by (1) and (2). Then compute the Kalman gain (4) and Prob(Xk | Y1, Y2, ..., Yk); its mean and variance are given by (3) and (5).

Note that Kg is NOT a constant; it changes according to equation (4). F, H, and G are time invariant matrices most of the time. Usually, Kg and P can be divided into two regions: (i) the transient region and (ii) the steady-state region.
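Equations (1)-(5) translate almost line by line into numpy; a minimal sketch (matrix shapes and any numbers are left to the application; Q and R should be passed as matrices here):

import numpy as np

def kalman_step(x, P, z, F, G, u, H, Q, R):
    # One iteration of equations (1)-(5) above.
    x_pred = F @ x + G @ u                          # (1) state prediction
    P_pred = F @ P @ F.T + Q                        # (2) covariance prediction
    S = H @ P_pred @ H.T + R
    Kg = P_pred @ H.T @ np.linalg.inv(S)            # (4) Kalman gain
    x_new = x_pred + Kg @ (z - H @ x_pred)          # (3) state update
    P_new = (np.eye(P.shape[0]) - Kg @ H) @ P_pred  # (5) covariance update
    return x_new, P_new, Kg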
image
From Wiki

Because the certainty of the measurements is often difficult to measure precisely, it is common to discuss the filter's behavior in terms of gain. The Kalman gain is a function of the relative certainty of the measurements and current state estimate, and can be "tuned" to achieve particular performance. With a high gain, the filter places more weight on the measurements, and thus follows them more closely. With a low gain, the filter follows the model predictions more closely, smoothing out noise but decreasing the responsiveness. At the extremes, a gain of one causes the filter to ignore the state estimate entirely, while a gain of zero causes the measurements to be ignored.


  • Clearly from the figure: K=0 ignores measurement z completely.
  • How do we explain from the figure that K=1 ignores the state prediction?
Hand-waving explanation of the Kalman filter:

1. The state prediction is based on the state equation. This prediction contains noise W.
2. The measurement contains noise V.
3. Trust the state or the observation? It depends on the covariance of the state prediction and the covariance of the observation: if there is lots of noise in the state prediction, lean more toward the observation, and vice versa. The ratio of the covariances determines the Kalman gain.

Feedback Explanation

Assuming G=0, F=A, H=C, Lt = Kg, the Kalman filter simplifies to a feedback loop as in the figure below. Consider the simplest case first:

A is a scalar (1x1). The plant contains an integrator with feedback (the Kalman low pass filter). The feedback controller Lt (Kg) behaves as a P (proportional) controller.
The closed-loop response = Lt Z^-1/[1-(1-Lt)Z^-1] ~ Lt/(Lt+s).
Large Lt (Kg) means large loop gain and loop bandwidth: yt receives little filtering, i.e., the filter trusts yt (the measurement) more. (Lt=Kg=1: the Kalman filter = Z^-1, no filtering at all.)
Small Lt (Kg) means small loop gain and loop bandwidth: yt gets filtered, i.e., the filter trusts the state prediction more. (Lt=Kg --> 0: the Kalman filter becomes an integrator.)

A more complex case: A = [1, 1; 0, 1] (as in case 3 below); the feedback controller behaves as a PI (proportional and integral) controller. The detailed analysis is in the earlier post: Kalman filter and PLL.

image

A very important difference between the Kalman filter and a generic feedback controller (P/PI/PID) is that Kg is governed by Eqs. (1)-(5): Kg is fundamentally time varying and depends on the internal structure (the state and noise covariance matrices); some examples below show how Kg varies over time. A traditional P/PI/PID controller is basically either (i) a fixed value or (ii) two-stage (acquisition and tracking), roughly corresponding to the Kalman filter's transient and steady-state stages, although only as a simplified version.


Handwaving supplement

 
1. Given measurements z(0), z(1), z(2), ..., z(k-1), estimate x(0), x(1), ..., x(k-1), x(k), and update after z(k) is received.

2a. KF is Markovian: X(k) depends on X(k-1) (not on X(k-2), ...) plus W(k), V(k). Therefore X(k | k-1, ..., 1, 0) = X(k | k-1).
2b. KF is recursive: the statistics (covariance) of X(k) depend on those at step k-1.
2c. The observable Z = H X usually has fewer degrees of freedom than X; hence hidden Markov.

3.  KF is linear:  X(k) = A X(k-1) + B U(k) + W(k)

Matrix Explanation

 

http://videolectures.net/mit18085f07_strang_lec16/

Some examples

 
Trivial case 0: (perfect measurement)
X(k)=A X(k-1)+B U(k)+W(k)
Z(k) = X(k)       Covariance matrix R = 0
 
Given Z(0), Z(1), ..., Z(k) without measurement errors
The best estimate of X(k) is simply Z(k)
H Kg(k) = I;      if H=I  then Kg(k) = 1;     P(k|k) = 0
Therefore, Kg=1 means perfect measurement: trust the measurements and ignore the state prediction completely.
 
 
Case 1: (constant state/no state noise, with measurement noise)
X(k) = X(k-1)
Z(k) = X(k) + V(k)
 
Given Z(0), Z(1), ..., Z(k), where V(k) is i.i.d. with covariance matrix R
The best estimate is: (Z(0) + ... + Z(k)) / (k+1)
More averaging reduces the variance of the estimate
P(0|0) = R
P(k|k-1) = P(k-1|k-1) = R / k
P(k|k) = P(k|k-1) * k / (k+1)
Kg(k) = P(k|k-1) / (P(k|k-1) + R) = 1 / (k+1)
P(k|k) = P(0|0) / (k+1) = R / (k+1)
 
When k becomes large, the state prediction becomes more trustworthy and each new measurement contains very little new information, so Kg --> 0 and P --> 0. From the feedback-loop point of view, the loop bandwidth keeps shrinking.

Let's estimate a constant signal of 1. The noise is normally distributed with a variance of 25, so the SNR is only 1/25 = 0.04 (about -14 dB), which makes it very hard to give a good estimate of the value. The Kalman filter doesn't seem to care much: after a few tens of samples it gives a very good estimate (Fig. 1). The Kalman gain Kg(k) (Fig. 2) and the estimation error P(k|k) (Fig. 3) are inversely proportional to k, as expected. Fig. 4 shows the measurement error, Z(k)-X(k), in green and the estimation error, X^(k)-X(k), in red after the Kalman filter. Again, the Kalman filter is very effective in this case.
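A scalar simulation of this case with the same numbers (the sample count is an assumption), reproducing Kg(k) = 1/(k+1) and P(k|k) = R/(k+1):

import numpy as np

rng = np.random.default_rng(6)
R = 25.0                              # measurement noise variance, as above
z = 1.0 + np.sqrt(R) * rng.standard_normal(200)

# Case 1 recursion: X(k) = X(k-1), Z(k) = X(k) + V(k), H = 1, Q = 0, P(0|0) = R
x, P = z[0], R
for zk in z[1:]:
    Kg = P / (P + R)                  # equals 1/(k+1), as derived above
    x = x + Kg * (zk - x)
    P = (1 - Kg) * P                  # equals R/(k+1)

print("estimate:", x, " P:", P, " R/(k+1):", R / z.size)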
 

image (Fig. 1)

image (Fig. 2)

image (Fig. 3)

image (Fig. 4)

Case 2: (State with noise accumulation like random walk or phase noise, with measurement noise)

X(k) = X(k-1) + W(k)

Z(k) = X(k) + V(k)

Given Z(0), Z(1), ..., Z(k), where W(k) and V(k) are i.i.d. with covariance matrices Q and R

X(1) = X(0) + W(1)

X(2) = X(0) + W(1) + W(2)

.. X(k) = X(0) + sum(W(k))

==> When k is large, the state behaves like a random walk. Moreover, there is measurement noise on top of the random walk.

The best estimation is: ?

P(k|k-1)=P(k-1|k-1) +Q

H = I

Kg(k)= P(k|k-1) / (P(k|k-1) + R) = (P(k-1|k-1) + Q ) / (P(k-1|k-1) + R + Q)

P(k|k) = (I-Kg(k)) P(k|k-1) = RP(k|k-1)/(P(k|k-1)+R)

=> 1/P(k|k) = 1/P(k|k-1) + 1/R  (each new sample helps reduce the variance!)

      1/P(k|k) = 1 / (P(k-1|k-1) + Q) + 1/R

     P(k|k) = (P(k-1|k-1)+Q) R / ( P(k-1|k-1) + Q + R)

P(0|0)

P(1|0) = P(0|0) + Q

P(1|1) =  (P(0|0)+Q)R / (P(0|0) + Q + R)

In the steady state, P(k|k) --> P = (sqrt(Q^2+4QR)-Q)/2;  Kg converges to (P+Q)/(P+Q+R)

If R is small,  P --> R

Kg and P have no closed forms (a closed form exists only for the steady state). Kg and P decrease monotonically and converge to constants (the steady state). From the loop-bandwidth point of view, the bandwidth shrinks and finally settles at a fixed value that depends on Q and R.

Fig. 5 shows the random walk of the state with variance 0.1 (in blue), with measurement noise of variance 25 as in Case 1 (in green), and the output after the Kalman filter (in red). Both Kg and P converge to steady state quickly, as shown in Fig. 6 and Fig. 7. Fig. 8 shows the measurement error, Z(k)-X(k), in green and the estimation error, X^(k)-X(k), in red after the Kalman filter. The estimation error is no longer zero as in Case 1, even with a small state perturbation noise of 0.1. P converges to 1.53; Kg converges to 0.061.
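The quoted steady-state numbers can be checked directly (Q = 0.1 and R = 25, as in the figures): iterating the Case 2 covariance recursion and evaluating the closed form give the same answer.

import numpy as np

Q, R = 0.1, 25.0

# Iterate the Case 2 covariance recursion until it settles
P = R
for _ in range(1000):
    P = (P + Q) * R / (P + Q + R)

# Closed-form steady state derived above: P = (sqrt(Q^2 + 4QR) - Q) / 2
P_ss = (np.sqrt(Q ** 2 + 4 * Q * R) - Q) / 2
Kg_ss = (P_ss + Q) / (P_ss + Q + R)
print(P, P_ss, Kg_ss)                 # P -> ~1.53, Kg -> ~0.061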

image (Fig. 5)

image (Fig. 6)

image (Fig. 7)

image (Fig. 8)

Even though there are no closed forms for Kg and P (a closed form exists only for the steady state), a simple resistor-ladder illustration is useful: P0 is the end resistor, in series with Q, then in parallel with R, and so on.

If Q = 0: P(k) = (R/k) // P(0), where // denotes the parallel combination.
If R --> 0: P(k) --> 0.

Case 3: (position and velocity estimation, no force with noise)
image
p''(t) = 0  (no acceleration) => [p, p']' = [0 1; 0 0][p, p'] = A [p,p']


Transition matrix = exp(A dt) for discrete time
exp(A dt) = I + A dt + A^2 dt^2/2! + A^3 dt^3/3! + ...    here A^2 = A^3 = ... = 0
exp(A dt) = [1 dt; 0 1]

To verify the equation, assuming no noise:

p(k+1) = p(k) + p'(k) dt

p'(k+1) = p'(k)    ---> constant velocity

H = [1 0] only observe position

z(k) = p(k)  --> observable

This point is very important! No prior knowledge of the velocity is needed; it can be estimated from the position measurements alone!
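A minimal simulation of this case (the measurement variance, process noise, and trajectory are assumed numbers): starting with zero knowledge of the velocity, the filter recovers it from position measurements alone.

import numpy as np

rng = np.random.default_rng(7)
dt, N, R = 1.0, 100, 1.0
F = np.array([[1.0, dt], [0.0, 1.0]])     # exp(A dt) derived above
H = np.array([[1.0, 0.0]])                # only position is observed
Q = 1e-6 * np.eye(2)                      # small process noise (assumed)

true_v = 0.5                              # unknown constant velocity (assumed)
pos = true_v * dt * np.arange(N)
z = pos + np.sqrt(R) * rng.standard_normal(N)

x = np.zeros(2)                           # no prior knowledge of velocity
P = 100.0 * np.eye(2)                     # large initial uncertainty
for zk in z:
    x = F @ x                             # predict
    P = F @ P @ F.T + Q
    K = P @ H.T / (H @ P @ H.T + R)       # gain (scalar measurement)
    x = x + (K * (zk - H @ x)).ravel()    # update
    P = (np.eye(2) - K @ H) @ P

print("estimated velocity:", x[1], " true:", true_v)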

 

Case 4: (phase and frequency estimation, no cycle slip)

Case 4a: (phase and frequency estimation, with cycle slip, mod)

Case 5: Oscillation estimation
Case 6: Nonlinear estimation (sign estimation)

 
2.  The Kalman filter is a special case of the Bayesian recursive filter!
 
Some applications
DCOC (dc offset cancellation)
Baseline wander
sine wave estimation (extended Kalman filter)
multi-bit SAR (successive approximation ADC)
 
Friday, August 23, 2013

Useful Mac OS X Mountain Lion Software for Engineers

I am writing this mainly for engineering development purposes. As the saying goes, to do a good job, one must first sharpen one's tools.

Step 1: Xcode (download from App)

Step 2: Install the Command Line Tools (CLT)    --> for gcc, etc.

In the terminal window, check gcc -v   (4.2.1 or later)

Then in the command window:

There seems to be a better way to install the CLT from within the Xcode app.

 

Step 3: Install Homebrew (and wget: no use?)

            $ brew list (list all installed packages)

Step 4: Install emacs: brew install emacs

Step 5: Install git (with openssl)

Step 6: Install ruby with rvm (NOT via brew?)

Step 7: Install python and virtualenv (use brew)

Python package management tools: easy_install (old) and pip (new). pip can uninstall packages; always use pip.

Similar to "rvm" in ruby, python has "virtualenv" to install and manage different versions/environments of python.

$ brew install python  --with-brewed-openssl   (install Python 2.7 and link to openssl?)

$ brew linkapps  (to link IDLE and python-launcher to the GUI)

$ No need to do this -> easy_install pip  (python easy_install installs pip), since pip is now part of python

$ sudo pip install --upgrade setuptools

$ sudo pip install --upgrade pip

$ pip install virtualenv (python pip install virtualenv)

 

Step 8: Install numpy, scipy, matplotlib, and ipython under virtualenv

First install the fortran compiler and other packages used by numpy, scipy, and matplotlib.

$ brew install gfortran

$ brew install freetype (conflicts with the existing freetype; copy the existing freetype.dylib to local/old!!)

$ brew install zmq

$ brew install libpng

 

Then install numpy, scipy, and ipython under virtualenv.

$ ... create a python virtualenv named "scientific" for numpy, scipy, ipython, and related python packages

How to create a python virtualenv?

$ mkdir .virtualenvs

$ cd .virtualenvs

$ virtualenv scientific  --> creates a python2.7 virtualenv; use -p python3 for python3

Add the following lines to .bash_profile (do NOT put them in .bashrc; the mac terminal does not use it!!)

# virtualenv should use Distribute instead of legacy setuptools
export VIRTUALENV_DISTRIBUTE=true
# Centralized location for new virtual environments
export PIP_VIRTUALENV_BASE=$HOME/.virtualenvs
# pip should only run if there is a virtualenv currently activated
export PIP_REQUIRE_VIRTUALENV=true
# cache pip-installed packages to avoid re-downloading
export PIP_DOWNLOAD_CACHE=$HOME/.pip/cache

## virtualenv for python version management
source /Users/alu/.virtualenvs/scientific/bin/activate

Then do the following to install numpy, scipy, etc.

(scientific)$ pip install numpy

(scientific)$ pip install scipy

(scientific)$ pip install matplotlib

(scientific)$ pip install ipython[all]  (all means all features of ipython, including the notebook)

(scientific)$ iptest  (run the ipython test suite)
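A quick sanity check that the whole stack imports and renders (the filename and filter settings are arbitrary): save the following as check_stack.py and run it inside the "scientific" virtualenv.

# check_stack.py -- smoke test for numpy/scipy/matplotlib in the virtualenv
import numpy as np
import scipy.signal as sig
import matplotlib
matplotlib.use("Agg")            # render to a file; no GUI backend required
import matplotlib.pyplot as plt

t = np.linspace(0, 1, 500)
x = np.sin(2 * np.pi * 5 * t) + 0.3 * np.random.randn(t.size)
b, a = sig.butter(4, 0.1)        # 4th-order Butterworth low-pass filter
plt.plot(t, x, t, sig.filtfilt(b, a, x))
plt.savefig("check_stack.png")
print("numpy", np.__version__, "ok")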

 

Step 9: Install matlab (ck version)

 

Other desktop software

Google Chrome

OpenVanilla (Yahoo key key 大易)

Skype

Dropbox

HoRNDIS (for USB to Android cell phone)

Picasa

HP printer driver (LaserJet)

TeamViewer

Step x: Install Qumana and MarsEdit for blogging

Qumana only works with WordPress, not with Blogger.

MarsEdit works with both WordPress and Blogger. However, it crashes sometimes.

Step x+1: Install MacTeX and TeXShop for LaTeX-related editing tasks.

Saturday, June 8, 2013

Raspberry Pi Experience Sharing

A few highlights of the RPi (Raspberry Pi): (1) relatively cheap (model B + 16G SD + 802.11n WiFi USB ~ NTD$2000) and compact; (2) fairly convenient GPIO (including I2C, SPI, UART) control; (3) a fairly complete python development environment; (4) lots of community resources for hardware and software.

My RPi setup includes: the RPi board with an acrylic case, a USB mouse/keyboard (later changed to a wireless mouse/keyboard), a WiFi USB dongle, and an HDMI monitor.

image

 

Setup procedure

Step 0: Create the SD OS image. I used Raspbian from the official website. Use win32diskimager to write the OS image to the SD card. Connect the Ethernet cable and the keyboard/mouse via USB. Do not use WiFi at this point, since there are not enough USB ports.

Step 1: In raspi-config, remember to set:

1. expand_root, so the SD card uses the entire SD space.

2. The keyboard layout, from UK to US. If that does not work, edit /etc/default/keyboard directly:

   sudo nano /etc/default/keyboard

   edit XKBLAYOUT='gb'  --> 'us'

  $sudo reboot  

3. boot_behaviour: boot straight into the X Window desktop.

4. You can always run "sudo raspi-config" later to change the settings.

 

Step 2: Download pietty for ssh login. pietty and putty are similar ssh clients.

 

Step 3: Do your housecleaning first.

I'm loading software via the Linux apt-get utility, and you need to make sure its database is up to date. First update apt-get's local database with the server's pkglist files; then check for outdated packages in the system and automatically upgrade them. Execute the following commands:

$sudo apt-get update

$sudo apt-get upgrade

 

Step 4: The default shell of the RPi is bash. I installed csh and tcsh for convenience.

$sudo apt-get install csh

$sudo apt-get install tcsh

 

Step 5: Set up vncserver. VNC is important because later we will not need USB for the keyboard and mouse.

1. $ sudo apt-get install tightvncserver

2. Copy the script from PenguineMentor.

3. Use pietty to log in and run $vncpasswd to set the user "pi" password.

4. Change the screen size to get proper fonts.

5. On the PC, use vncviewer to log in.

Step 5a: Set up Microsoft remote desktop.

$sudo apt-get install xrdp

Step 6: Set up hostnames so that Windows can access the RPi, and the RPi can access Windows, by name instead of IP address.

1. Windows accesses the RPi using the samba protocol.

2. The RPi accesses Windows using the winbind protocol (obsolete but still working).

The limitation is that Windows and the RPi need to be on the same subnet. If not, DNS or other advanced techniques are needed.

$ sudo apt-get install samba

$ sudo apt-get install samba-common-bin (!! important for samba setup)

$ sudo apt-get install winbind

In /etc/samba/smb.conf, edit the workgroup variable to match your home network workgroup (as defined in Windows):  workgroup = MSWORK

In /etc/nsswitch.conf, change the following line:

hosts:         files dns   -->   hosts:         files dns wins

Do we need to set samba to start by default?

Step 7: Set up samba.

Make sure samba and samba-common-bin are installed.

edit /etc/samba/smb.conf

$ sudo testparm -s  (to check smb.conf)

$ sudo smbpasswd -a pi  (to add a password for pi)

$ sudo service samba restart

Reference: http://simonthepiman.com/how_to_setup_windows_file_server.php

Step 8: Set up the WiFi USB adapter.

  1. Connect your USB Wi-Fi Adapter to the Pi.
  2. Open the WiFi Config application on the desktop.
  3. Select your adapter from the drop down list.
  4. Sign into your home network.

Step 9: Install emacs.

$ sudo apt-get install emacs

$ sudo apt-get upgrade

Step 10: Install python.

reference http://jeffskinnerbox.wordpress.com/linux-python-packages-for-my-raspberry-pi/

$sudo apt-get install python  (already in RPi)
$sudo apt-get install python-dev
$sudo apt-get install libjpeg-dev (useless?)
$sudo apt-get install libfreetype6-dev (already in RPi)
$sudo apt-get install python-setuptools (easy_install for python)
$sudo apt-get install python-pip (pip for python)

$sudo easy_install -U distribute  (get the latest distribute/setuptools update)

Then the RPi-related GPIO and serial port packages:

$sudo pip install RPi.GPIO (already installed in RPi)
$sudo pip install pySerial
$sudo pip install nose  (python test framework)

$sudo pip install cmd2  (cmd is a Python Standard Library module for constructing command-prompt applications.)
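To verify that GPIO access works, here is a minimal blink sketch (the BCM pin number 18 and the wiring are assumptions; run it with sudo, with an LED plus a series resistor on the pin you actually use):

# blink.py -- RPi.GPIO smoke test
import time
import RPi.GPIO as GPIO

LED = 18                         # BCM numbering; assumed pin, change to match wiring
GPIO.setmode(GPIO.BCM)
GPIO.setup(LED, GPIO.OUT)
try:
    for _ in range(10):
        GPIO.output(LED, GPIO.HIGH)
        time.sleep(0.5)
        GPIO.output(LED, GPIO.LOW)
        time.sleep(0.5)
finally:
    GPIO.cleanup()               # release the pin on exit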

Then install matplotlib, scipy, and numpy for scientific computing

$sudo apt-get install python-matplotlib
$sudo apt-get install python-mpltoolkits.basemap
$sudo apt-get install python-numpy
$sudo apt-get install python-scipy
$sudo apt-get install python-pandas

Then install qt4 for GUI

$sudo apt-get install python-qt4

Then install ipython and notebook

$sudo apt-get install ipython ipython-doc ipython-notebook ipython-qtconsole

Step 11: Install git

$ sudo apt-get install git

Step 12: Install Chromium

$sudo apt-get install chromium

Tuesday, October 23, 2012

Some Properties of Dipole/Monopole Antennas

For an antenna to radiate efficiently, it is best to have a common mode signal oscillation, because the corresponding ground is at infinity. Differential signal oscillations cancel each other to first order, so their efficiency is worse than common mode signal oscillation. Because charge is always neutral overall, there can only be differential charge oscillation (like a dipole oscillation), never common mode charge oscillation. Current, however, can have both differential current oscillation (as in a small loop) and common mode current oscillation.

A very short dipole antenna has only differential charge/voltage oscillation, while a small loop antenna has only differential current oscillation; the two are exactly dual to each other (duality), and both have poor efficiency. See Table 1.

image

Table 1: Short dipole antenna and small loop antenna. (Δl = λ/4 for the half-wavelength antenna)

A dipole antenna produces common mode current radiation as its length increases, as shown in Fig. 2 below. The dipole antenna reaches its maximum common mode radiation efficiency at a length of λ/2.

Making it even longer causes the currents to cancel instead, so the efficiency does not increase. The radiation pattern does become more concentrated at certain angles (so-called "blooming"), which is rarely used in practice. The ideal dipole antenna is λ/2, i.e., the half-wavelength dipole; its far field is shown below and is slightly higher than what the short dipole antenna formula predicts, mainly because the current distribution is a sine instead of a triangle. Even more common is a λ/4 monopole over a ground-plane mirror.

fields from a half-wave dipole antenna

Table 2: Half wavelength dipole antenna.

A few points to note:

1. A monopole must be perpendicular to the ground plane, not parallel. A horizontal monopole turns the common mode current into a differential current, which cancels the radiation and turns it into a small loop antenna. This is why an RFID tag or E-tag must stay away from large sheets of metal: RFID and E-tags lie flat against a surface.

2. EMI generated by circuits is mostly common mode noise, which easily couples into a dipole antenna. The best fix is a common mode choke or shielding at the noise source.

3. The differential feed of a dipole antenna is easy to understand, but what about a coax feed, or a single-ended signal feed?

 

How to Shrink the Antenna

Although the dipole/monopole is simple and effective, a common practical problem is how to shrink the antenna so that it fits inside the enclosure. Especially for sub-1GHz mobile or portable applications (UHF, GSM-900MHz, ISM-900MHz, LTE-700MHz), one or two protruding antennas are neither attractive nor safe. An antenna embedded in the enclosure looks better and also suits MIMO applications. Common ways to shorten an antenna:

1. Use a monopole instead of a dipole: 50% length reduction; refer to the previous article for the trade-off between dipole and monopole antennas.

2. Use a meander-line structure: the meander-line antenna (MLA) can be built in a λ/2 dipole or λ/4 ground-plane format. The idea is to fold the conductors back and forth to make the overall antenna shorter, as shown in the following figure. The area is smaller, but the radiation resistance, efficiency, and bandwidth decrease. The parameters of the meander shape affect the antenna performance. More details on the MLA are in the next section.

image            image

 

Meander Line Antenna (MLA)

The meander line antenna is one type of microstrip antenna, proposed by Rashed and Tai to reduce the resonant length. Meandering the patch increases the path over which the surface current flows, which ultimately lowers the resonant frequency compared with a straight wire antenna of the same dimensions.

An electrically small antenna is defined as one whose largest dimension is no more than one-tenth of a wavelength [5]. The meander antenna is an electrically small antenna. A meander line antenna is designed as a set of horizontal and vertical lines; combinations of horizontal and vertical lines form turns. As the number of turns increases, the efficiency increases. For a meander line, increasing the meander spacing decreases the resonant frequency; likewise, increasing the meander separation decreases the resonant frequency [6]. A meander antenna is an extension of the basic folded antenna and resonates at frequencies much lower than a single-element antenna of equal length.
The radiation efficiency of a meander line antenna is good compared with conventional half- and quarter-wavelength antennas. The antenna size reduction factor β depends primarily on the number of meander elements per wavelength and on the spacing and element widths of the rectangular loops [7].

image

 

 

 

A meander-line structure shortens the size, but trades away efficiency.

Live antenna vs. dead antenna

 

Injected power = ohmic loss + non-radiating modes + radiating modes

Where does the power of the non-radiating modes go? Loss again?

Why does a longer wavelength capture more power in the Friis formula?
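For reference, the free-space Friis transmission formula:

P_r = P_t G_t G_r (λ / (4 π d))^2

The λ^2 factor comes from the receive antenna's effective aperture A_e = G_r λ^2 / (4π): for the same gain, a longer wavelength corresponds to a larger collecting area, hence more received power.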

 

How should we deal with an antenna? Is there a simplified form of Maxwell's equations for it?

Meander line antenna (MLA) equivalent circuit.
