MAE 546, Lecture 16
Least-Squares Estimation

Robert Stengel
Optimal Control and Estimation, MAE 546, Princeton University, 2013

• Estimating unknown constants from redundant measurements
  – Least-squares
  – Weighted least-squares
  – Recursive weighted least-squares estimator

Copyright 2013 by Robert Stengel. All rights reserved. For educational use only.
http://www.princeton.edu/~stengel/MAE3546.html
http://www.princeton.edu/~stengel/OptConEst.html

Perfect Measurement of a Constant Vector

• Given
  – Measurements, y, of a constant vector, x

    y = H x

  – y: (n x 1) output vector
  – H: (n x n) output matrix
  – x: (n x 1) vector to be estimated

• Estimate x
• Assume that the output, y, is a perfect measurement and H is invertible
• Estimate is based on the inverse transformation

    \hat{x} = H^{-1} y
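For the perfect-measurement case, a minimal numerical sketch (the matrix values and variable names are illustrative, not from the lecture): with a square, invertible H, the estimate is just the solution of a linear system, and numpy.linalg.solve avoids forming H^{-1} explicitly.

```python
import numpy as np

# Hypothetical invertible (3 x 3) output matrix and true constant vector
H = np.array([[2.0, 0.0, 1.0],
              [0.0, 3.0, 0.0],
              [1.0, 0.0, 4.0]])
x_true = np.array([1.0, -2.0, 0.5])

y = H @ x_true                 # perfect (noise-free) measurement, y = H x
x_hat = np.linalg.solve(H, y)  # x_hat = H^-1 y, solved without forming the inverse

print(np.allclose(x_hat, x_true))  # True: a perfect measurement recovers x exactly
```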
Imperfect Measurement of a Constant Vector

• Given
  – Noisy measurements, z, of a constant vector, x
• Effects of error can be reduced if the measurement is redundant
• Noise-free output, y

    y = H x

  – y: (k x 1) output vector
  – H: (k x n) output matrix, k > n
  – x: (n x 1) vector to be estimated

• Measurement of output with error, z

    z = y + n = H x + n

  – z: (k x 1) measurement vector
  – n: (k x 1) error vector

Cost Function for Least-Squares Estimate

• Measurement-error residual

    \varepsilon = z - H \hat{x} = z - \hat{y}, \quad \dim(\varepsilon) = (k \times 1)

• Squared measurement error = cost function, J (quadratic norm)

    J = \frac{1}{2} \varepsilon^T \varepsilon = \frac{1}{2} (z - H \hat{x})^T (z - H \hat{x})
      = \frac{1}{2} \left[ z^T z - \hat{x}^T H^T z - z^T H \hat{x} + \hat{x}^T H^T H \hat{x} \right]

• What is the control parameter? The estimate of x

    \hat{x}, \quad \dim(\hat{x}) = (n \times 1)
Static Minimization Provides Least-Squares Estimate

• Error cost function

    J = \frac{1}{2} \left( z^T z - \hat{x}^T H^T z - z^T H \hat{x} + \hat{x}^T H^T H \hat{x} \right)

• Necessary condition

    \frac{\partial J}{\partial \hat{x}} = 0 = \frac{1}{2} \left[ 0 - (H^T z)^T - z^T H + (H^T H \hat{x})^T + \hat{x}^T H^T H \right]

    \hat{x}^T H^T H = z^T H

• Estimate is obtained using the left pseudo-inverse matrix

    \hat{x}^T H^T H (H^T H)^{-1} = \hat{x}^T = z^T H (H^T H)^{-1} \quad \text{(row)}

  or

    \hat{x} = (H^T H)^{-1} H^T z \quad \text{(column)}
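The left pseudo-inverse estimate is easy to check numerically. A short sketch with synthetic data (all values here are made up for illustration); np.linalg.lstsq solves the same problem more robustly than explicitly inverting H^T H.

```python
import numpy as np

rng = np.random.default_rng(0)

n, k = 3, 20                                    # n unknowns, k > n redundant measurements
x_true = np.array([1.0, -0.5, 2.0])
H = rng.standard_normal((k, n))                 # (k x n) output matrix
z = H @ x_true + 0.1 * rng.standard_normal(k)   # noisy measurements, z = H x + n

# Left pseudo-inverse solution of the normal equations
x_hat = np.linalg.inv(H.T @ H) @ H.T @ z

# Equivalent, numerically preferable solution
x_lstsq, *_ = np.linalg.lstsq(H, z, rcond=None)

print(x_hat, np.allclose(x_hat, x_lstsq))
```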
Example: Average Weight of a Pail of Jelly Beans

• Measurements are equally uncertain

    z_i = x + n_i, \quad i = 1 \text{ to } k

• Express measurements as

    z = H x + n

• Output matrix

    H = \begin{bmatrix} 1 & 1 & \cdots & 1 \end{bmatrix}^T

Average Weight of the Jelly Beans

• Optimal estimate

    \hat{x} = (H^T H)^{-1} H^T z
            = \left( \begin{bmatrix} 1 & 1 & \cdots & 1 \end{bmatrix}
                     \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix} \right)^{-1}
              \begin{bmatrix} 1 & 1 & \cdots & 1 \end{bmatrix}
              \begin{bmatrix} z_1 \\ z_2 \\ \vdots \\ z_k \end{bmatrix}
            = (k)^{-1} \left( z_1 + z_2 + \cdots + z_k \right)

• Simple average

    \hat{x} = \frac{1}{k} \sum_{i=1}^{k} z_i \quad \text{[sample mean value]}
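A quick numerical check (with made-up weights) that the least-squares estimate collapses to the sample mean when H is a column of ones:

```python
import numpy as np

z = np.array([105.2, 98.7, 101.4, 99.9, 102.8])    # hypothetical jelly-bean measurements, grams
H = np.ones((z.size, 1))                           # H = [1 1 ... 1]^T

x_hat = (np.linalg.inv(H.T @ H) @ H.T @ z).item()  # (H^T H)^-1 H^T z = (1/k) * sum(z_i)
print(x_hat, np.isclose(x_hat, z.mean()))          # identical to the sample mean
```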
Least-Squares Applications

• More generally, least-squares estimation is used for
  – Higher-degree curve-fitting
  – Multivariate estimation

Least-Squares Image Processing
[Figures: Original Lenna, Degraded Image, Restored Image (Adaptively Regularized Constrained Total Least Squares)]
Chen, Chen, and Zhou, IEEE Trans. Image Proc., 2000

Least-Squares Linear Fit to Noisy Data

• Find trend line in noisy data

    y = a_0 + a_1 x
    z_i = (a_0 + a_1 x_i) + n_i

• Measurement vector

    z = \begin{bmatrix} z_1 \\ z_2 \\ \vdots \\ z_n \end{bmatrix}
      = \begin{bmatrix} a_0 + a_1 x_1 \\ a_0 + a_1 x_2 \\ \vdots \\ a_0 + a_1 x_n \end{bmatrix}
        + \begin{bmatrix} n_1 \\ n_2 \\ \vdots \\ n_n \end{bmatrix}
      = \begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix}
        \begin{bmatrix} a_0 \\ a_1 \end{bmatrix} + n
      \triangleq H a + n

• Error "cost" function

    J = \frac{1}{2} (z - H \hat{a})^T (z - H \hat{a})

• Least-squares estimate of trend line

    \hat{a} = \begin{bmatrix} \hat{a}_0 \\ \hat{a}_1 \end{bmatrix} = (H^T H)^{-1} H^T z

    \hat{y} = \hat{a}_0 + \hat{a}_1 x

• Estimate ignores statistics of the error
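As a concrete illustration of the trend-line fit, a sketch with synthetic data (the coefficients and noise level are invented): the design matrix stacks rows [1, x_i], and np.linalg.lstsq returns the estimates of a_0 and a_1 among its outputs.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic noisy data around a hypothetical trend line y = a0 + a1*x
a0_true, a1_true = 2.0, 0.7
x = np.linspace(0.0, 10.0, 50)
z = a0_true + a1_true * x + 0.5 * rng.standard_normal(x.size)

# Design matrix H = [1  x_i]; estimate a_hat = (H^T H)^-1 H^T z
H = np.column_stack((np.ones_like(x), x))
a_hat, *_ = np.linalg.lstsq(H, z, rcond=None)

a0_hat, a1_hat = a_hat
y_hat = a0_hat + a1_hat * x      # fitted trend line
print(a0_hat, a1_hat)
```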
Measurements of Differing Quality

• Suppose some elements of the measurement, z, are more uncertain than others

    z = H x + n

• Give the more uncertain measurements less weight in arriving at the minimum-cost estimate
• Let S = measure of uncertainty; then express the error cost in terms of S^{-1}

    J = \frac{1}{2} \varepsilon^T S^{-1} \varepsilon
Error Cost and Necessary Condition for a Minimum

• Error cost function, J

    J = \frac{1}{2} \varepsilon^T S^{-1} \varepsilon = \frac{1}{2} (z - H \hat{x})^T S^{-1} (z - H \hat{x})
      = \frac{1}{2} \left[ z^T S^{-1} z - \hat{x}^T H^T S^{-1} z - z^T S^{-1} H \hat{x} + \hat{x}^T H^T S^{-1} H \hat{x} \right]

• Necessary condition for a minimum

    \frac{\partial J}{\partial \hat{x}} = 0 = \frac{1}{2} \left[ 0 - (H^T S^{-1} z)^T - z^T S^{-1} H + (H^T S^{-1} H \hat{x})^T + \hat{x}^T H^T S^{-1} H \right]

Weighted Least-Squares Estimate of a Constant Vector

• Necessary condition for a minimum

    \hat{x}^T H^T S^{-1} H - z^T S^{-1} H = 0
    \hat{x}^T H^T S^{-1} H = z^T S^{-1} H

• Weighted left pseudo-inverse provides the solution

    \hat{x} = (H^T S^{-1} H)^{-1} H^T S^{-1} z
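A hedged numpy sketch of the weighted estimate x̂ = (HᵀS⁻¹H)⁻¹HᵀS⁻¹z, using an invented diagonal S whose later entries are noisier; the unweighted estimate is shown for comparison.

```python
import numpy as np

rng = np.random.default_rng(2)

n, k = 2, 12
x_true = np.array([1.0, 3.0])
H = rng.standard_normal((k, n))

# Hypothetical diagonal error covariance: later measurements are noisier
sigma = np.linspace(0.1, 1.0, k)
S = np.diag(sigma**2)
z = H @ x_true + sigma * rng.standard_normal(k)

S_inv = np.linalg.inv(S)
x_wls = np.linalg.inv(H.T @ S_inv @ H) @ H.T @ S_inv @ z   # weighted least squares
x_ols = np.linalg.inv(H.T @ H) @ H.T @ z                   # unweighted, for comparison

print("weighted:", x_wls, "unweighted:", x_ols)
```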
The Return of the Jelly Beans

• Error-weighting matrix

    S_A^{-1} = \begin{bmatrix} a_{11} & 0 & \cdots & 0 \\ 0 & a_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & a_{kk} \end{bmatrix}

• Optimal estimate of average jelly bean weight

    \hat{x} = (H^T S_A^{-1} H)^{-1} H^T S_A^{-1} z
            = \left( \begin{bmatrix} 1 & 1 & \cdots & 1 \end{bmatrix} S_A^{-1}
                     \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix} \right)^{-1}
              \begin{bmatrix} 1 & 1 & \cdots & 1 \end{bmatrix} S_A^{-1}
              \begin{bmatrix} z_1 \\ z_2 \\ \vdots \\ z_k \end{bmatrix}
            = \frac{\sum_{i=1}^{k} a_{ii} z_i}{\sum_{i=1}^{k} a_{ii}}

How to Choose the Error Weighting Matrix

a) Normalize the cost function according to the expected measurement error, S_A

    J = \frac{1}{2} \varepsilon^T S_A^{-1} \varepsilon = \frac{1}{2} (z - y)^T S_A^{-1} (z - y) = \frac{1}{2} (z - H x)^T S_A^{-1} (z - H x)

b) Normalize the cost function according to the expected measurement residual, S_B

    J = \frac{1}{2} \varepsilon^T S_B^{-1} \varepsilon = \frac{1}{2} (z - H \hat{x})^T S_B^{-1} (z - H \hat{x})
Measurement Error Covariance, S_A

Expected value of the outer product of the measurement error vector

    S_A = E\left[ (z - y)(z - y)^T \right] = E\left[ (z - H x)(z - H x)^T \right] = E\left[ n n^T \right] \triangleq R

Measurement Residual Covariance, S_B

Expected value of the outer product of the measurement residual vector

    S_B = E\left[ \varepsilon \varepsilon^T \right] = E\left[ (z - H \hat{x})(z - H \hat{x})^T \right]
        = E\left[ \left( H(x - \hat{x}) + n \right) \left( H(x - \hat{x}) + n \right)^T \right]
        = H E\left[ (x - \hat{x})(x - \hat{x})^T \right] H^T + H E\left[ (x - \hat{x}) n^T \right] + E\left[ n (x - \hat{x})^T \right] H^T + E\left[ n n^T \right]
        = H P H^T + H M + M^T H^T + R

where

    P = E\left[ (x - \hat{x})(x - \hat{x})^T \right], \quad M = E\left[ (x - \hat{x}) n^T \right], \quad R = E\left[ n n^T \right]

Requires iteration (adaptation) of the estimate to find S_B

Recursive Least-Squares Estimation
• Prior unweighted and weighted least-squares estimators use a batch-processing approach
  – All information is gathered prior to processing
  – All information is processed at once
• Recursive approach
  – Optimal estimate has been made from a prior measurement set
  – New measurement set is obtained
  – Optimal estimate is improved by an incremental change (or correction) to the prior optimal estimate

Prior Optimal Estimate

Initial measurement set and state estimate, with S = S_A = R

    z_1 = H_1 x + n_1

    \dim(z_1) = \dim(n_1) = k_1 \times 1, \quad \dim(H_1) = k_1 \times n, \quad \dim(R_1) = k_1 \times k_1

State estimate minimizes

    J_1 = \frac{1}{2} \varepsilon_1^T R_1^{-1} \varepsilon_1 = \frac{1}{2} (z_1 - H_1 \hat{x}_1)^T R_1^{-1} (z_1 - H_1 \hat{x}_1), \quad \varepsilon_1 = z_1 - H_1 \hat{x}_1

    \hat{x}_1 = (H_1^T R_1^{-1} H_1)^{-1} H_1^T R_1^{-1} z_1
New Measurement Set

New measurement

    z_2 = H_2 x + n_2, \quad R_2: \text{second measurement error covariance}

    \dim(z_2) = \dim(n_2) = k_2 \times 1, \quad \dim(H_2) = k_2 \times n, \quad \dim(R_2) = k_2 \times k_2

Concatenation of old and new measurements

    z \triangleq \begin{bmatrix} z_1 \\ z_2 \end{bmatrix}

Cost of Estimation Based on Both Measurement Sets

Cost function incorporates the estimate made after incorporating z_2

    J_2 = \begin{bmatrix} (z_1 - H_1 \hat{x}_2)^T & (z_2 - H_2 \hat{x}_2)^T \end{bmatrix}
          \begin{bmatrix} R_1^{-1} & 0 \\ 0 & R_2^{-1} \end{bmatrix}
          \begin{bmatrix} z_1 - H_1 \hat{x}_2 \\ z_2 - H_2 \hat{x}_2 \end{bmatrix}
        = (z_1 - H_1 \hat{x}_2)^T R_1^{-1} (z_1 - H_1 \hat{x}_2) + (z_2 - H_2 \hat{x}_2)^T R_2^{-1} (z_2 - H_2 \hat{x}_2)

Both residuals refer to \hat{x}_2
Simplification occurs because the weighting matrix is block diagonal
Optimal Estimate Based on Both Measurement Sets

    \hat{x}_2 = \left\{ \begin{bmatrix} H_1^T & H_2^T \end{bmatrix}
                        \begin{bmatrix} R_1^{-1} & 0 \\ 0 & R_2^{-1} \end{bmatrix}
                        \begin{bmatrix} H_1 \\ H_2 \end{bmatrix} \right\}^{-1}
                \left\{ \begin{bmatrix} H_1^T & H_2^T \end{bmatrix}
                        \begin{bmatrix} R_1^{-1} & 0 \\ 0 & R_2^{-1} \end{bmatrix}
                        \begin{bmatrix} z_1 \\ z_2 \end{bmatrix} \right\}
              = \left( H_1^T R_1^{-1} H_1 + H_2^T R_2^{-1} H_2 \right)^{-1}
                \left( H_1^T R_1^{-1} z_1 + H_2^T R_2^{-1} z_2 \right)

Apply Matrix Inversion Lemma

Define

    P_1^{-1} \triangleq H_1^T R_1^{-1} H_1

Matrix inversion lemma

    \left( H_1^T R_1^{-1} H_1 + H_2^T R_2^{-1} H_2 \right)^{-1}
      = \left( P_1^{-1} + H_2^T R_2^{-1} H_2 \right)^{-1}
      = P_1 - P_1 H_2^T \left( H_2 P_1 H_2^T + R_2 \right)^{-1} H_2 P_1
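The lemma is easy to verify numerically. A minimal sketch with random matrices (dimensions and values are arbitrary, chosen only so that H_1 has full column rank):

```python
import numpy as np

rng = np.random.default_rng(3)

n, k1, k2 = 3, 6, 4
H1 = rng.standard_normal((k1, n))
H2 = rng.standard_normal((k2, n))
R1 = 0.5 * np.eye(k1)
R2 = 0.2 * np.eye(k2)

P1 = np.linalg.inv(H1.T @ np.linalg.inv(R1) @ H1)   # P1^-1 = H1^T R1^-1 H1

# Direct inverse of the combined information matrix
lhs = np.linalg.inv(np.linalg.inv(P1) + H2.T @ np.linalg.inv(R2) @ H2)

# Matrix inversion lemma form
A = H2 @ P1 @ H2.T + R2
rhs = P1 - P1 @ H2.T @ np.linalg.inv(A) @ H2 @ P1

print(np.allclose(lhs, rhs))   # True
```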
Improved Estimate Incorporating New Measurement Set

    \hat{x}_1 = P_1 H_1^T R_1^{-1} z_1

    I = A^{-1} A = A A^{-1}, \quad \text{with } A \triangleq H_2 P_1 H_2^T + R_2

New estimate is a correction to the old

    \hat{x}_2 = \hat{x}_1 - P_1 H_2^T \left( H_2 P_1 H_2^T + R_2 \right)^{-1} H_2 \hat{x}_1
                + P_1 H_2^T \left[ I - \left( H_2 P_1 H_2^T + R_2 \right)^{-1} H_2 P_1 H_2^T \right] R_2^{-1} z_2

Simplify Optimal Estimate Incorporating New Measurement Set

    \hat{x}_2 = \hat{x}_1 + P_1 H_2^T \left( H_2 P_1 H_2^T + R_2 \right)^{-1} \left( z_2 - H_2 \hat{x}_1 \right)
              \triangleq \hat{x}_1 + K \left( z_2 - H_2 \hat{x}_1 \right)

    K: estimator gain matrix

Recursive Optimal Estimate

• Prior estimate may be based on a prior incremental estimate, and so on
• Generalize to a recursive form, with sequential index i

    \hat{x}_i = \hat{x}_{i-1} + P_{i-1} H_i^T \left( H_i P_{i-1} H_i^T + R_i \right)^{-1} \left( z_i - H_i \hat{x}_{i-1} \right)
              = \hat{x}_{i-1} + K_i \left( z_i - H_i \hat{x}_{i-1} \right)

with

    P_i = \left( P_{i-1}^{-1} + H_i^T R_i^{-1} H_i \right)^{-1}

    \dim(x) = n \times 1; \quad \dim(P) = n \times n
    \dim(z) = r \times 1; \quad \dim(R) = r \times r
    \dim(H) = r \times n; \quad \dim(K) = n \times r
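A compact sketch of the recursive update (function and variable names are mine, not from the slides), applying x̂ᵢ = x̂ᵢ₋₁ + Kᵢ(zᵢ − Hᵢx̂ᵢ₋₁) and Pᵢ = (Pᵢ₋₁⁻¹ + HᵢᵀRᵢ⁻¹Hᵢ)⁻¹ to a stream of scalar measurements of a 2-vector:

```python
import numpy as np

def rls_update(x_hat, P, z_i, H_i, R_i):
    """One recursive least-squares step, following the notation of the slides."""
    K = P @ H_i.T @ np.linalg.inv(H_i @ P @ H_i.T + R_i)   # estimator gain K_i
    x_new = x_hat + K @ (z_i - H_i @ x_hat)                # correction by the measurement residual
    P_new = np.linalg.inv(np.linalg.inv(P) + H_i.T @ np.linalg.inv(R_i) @ H_i)
    return x_new, P_new, K

rng = np.random.default_rng(4)
x_true = np.array([1.0, -2.0])
x_hat = np.zeros(2)
P = 100.0 * np.eye(2)            # large initial covariance (weak prior)
R = np.array([[0.04]])           # scalar measurement error covariance

for _ in range(50):
    H_i = rng.standard_normal((1, 2))
    z_i = H_i @ x_true + 0.2 * rng.standard_normal(1)
    x_hat, P, K = rls_update(x_hat, P, z_i, H_i, R)

print(x_hat)   # approaches x_true; P and K shrink as measurements accumulate
```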
Example of Recursive Optimal Estimate

    z = x + n, \quad H = 1, \quad R = 1

    \hat{x}_i = \hat{x}_{i-1} + p_{i-1} \left( p_{i-1} + 1 \right)^{-1} \left( z_i - \hat{x}_{i-1} \right)

    k_i = p_{i-1} \left( p_{i-1} + 1 \right)^{-1}

    p_i = \left( p_{i-1}^{-1} + 1 \right)^{-1} = p_{i-1} \left( p_{i-1} + 1 \right)^{-1}

    index i    p_i      k_i
    0          1        1
    1          0.5      0.5
    2          0.333    0.333
    3          0.25     0.25
    4          0.2      0.2
    5          0.167    0.167

    \hat{x}_0 = z_0
    \hat{x}_1 = 0.5 \hat{x}_0 + 0.5 z_1
    \hat{x}_2 = 0.667 \hat{x}_1 + 0.333 z_2
    \hat{x}_3 = 0.75 \hat{x}_2 + 0.25 z_3
    \hat{x}_4 = 0.8 \hat{x}_3 + 0.2 z_4
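A few lines of Python reproduce the table above for the scalar case H = 1, R = 1, assuming p_0 = 1 as the initial value:

```python
# Scalar recursion for H = 1, R = 1: k_i = p_{i-1} / (p_{i-1} + 1), and p_i = k_i
p = 1.0
for i in range(1, 6):
    k = p / (p + 1.0)
    p = k
    print(i, round(p, 3), round(k, 3))
# prints: 1 0.5 0.5 / 2 0.333 0.333 / 3 0.25 0.25 / 4 0.2 0.2 / 5 0.167 0.167
```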
Optimal Gain and Estimate-Error Covariance

    P_i = \left( P_{i-1}^{-1} + H_i^T R_i^{-1} H_i \right)^{-1}

    K_i = P_{i-1} H_i^T \left( H_i P_{i-1} H_i^T + R_i \right)^{-1}

• With constant measurement error covariance matrix, R
  – Error covariance decreases at each step
  – Estimator gain matrix, K, invariably goes to zero as the number of samples increases
• Why?
  – Each new sample has a smaller effect on the average than the sample before

Next Time: Propagation of Uncertainty in Dynamic Systems
Supplemental Material

Weighted Least Squares (Kriging) Estimates (Interpolation)

• Can be used with arbitrary interpolating functions
http://en.wikipedia.org/wiki/Kriging

Weighted Least Squares Estimates of Particulate Concentration (PM2.5)

[Figures: Delaware Sampling Sites, Delaware Average Concentrations, DE-NJ-PA PM2.5 Estimates]
Delaware Dept. of Natural Resources and Environmental Control, 2008