Additional Notes and Solution Manual For: Matrix Computations: Third Edition by Gene H. Golub and Charles F. Van Loan ∗
John L. Weatherwax February 9, 2007
Chapter 2 (Matrix Analysis): Basic Ideas from Linear Algebra
P 2.1.1 (existence of a rank factorization of A) Assume $A$ is $m \times n$ and of rank $r$. Then using elementary elimination matrices we can reduce $A$ to its row reduced echelon form $R$, given by $E_1 A = R$, or $A = E_1^{-1} R = E_2 R$. In this reduction $E_2$ is $m \times m$ and $R$ is $m \times n$. Because $R$ has $m - r$ zero rows we can block decompose it as follows
\[
R = \begin{bmatrix} \hat{R} \\ 0 \end{bmatrix},
\qquad \hat{R} \in \mathbb{R}^{r \times n}, \quad 0 \in \mathbb{R}^{(m-r) \times n},
\]
where we have listed the dimensions of the block matrices next to them. Now $\hat{R}$ is of rank $r$. In addition, block decompose $E_2$ as
\[
E_2 = \begin{bmatrix} \hat{E} & \tilde{E} \end{bmatrix},
\qquad \hat{E} \in \mathbb{R}^{m \times r}, \quad \tilde{E} \in \mathbb{R}^{m \times (m-r)}.
\]
Since $E_2$ is of rank $m$ its columns are linearly independent, so the matrix $\hat{E}$ formed from the first $r$ columns of $E_2$ is of rank $r$. This block decomposition gives for $A$ the expression
\[
A = \begin{bmatrix} \hat{E} & \tilde{E} \end{bmatrix} \begin{bmatrix} \hat{R} \\ 0 \end{bmatrix} = \hat{E} \hat{R}.
\]
This gives the decomposition of $A$ into $\hat{E}$ of size $m \times r$ and $\hat{R}$ of size $r \times n$, each of rank $r$, thus proving the decomposition.
∗ [email protected]
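The construction in P 2.1.1 can be checked on a small example. The sketch below is illustrative only: the helper names `rref`, `rank_factorization`, and `matmul` are hypothetical, and exact rational arithmetic is an assumption made so that the rank is computed without rounding. It relies on the observation that, since the pivot columns of $R$ are the first $r$ unit vectors, the pivot columns of $A = E_2 R$ are exactly the first $r$ columns of $E_2$, that is, $\hat{E}$.

```python
from fractions import Fraction

def rref(A):
    """Return (R, pivots): the reduced row echelon form of A and its pivot columns."""
    R = [[Fraction(x) for x in row] for row in A]
    m, n = len(R), len(R[0])
    pivots, row = [], 0
    for col in range(n):
        if row == m:
            break
        # Look for a nonzero entry in this column at or below the current row.
        p = next((i for i in range(row, m) if R[i][col] != 0), None)
        if p is None:
            continue
        R[row], R[p] = R[p], R[row]
        pivot = R[row][col]
        R[row] = [x / pivot for x in R[row]]
        # Eliminate the column entry from every other row.
        for i in range(m):
            if i != row and R[i][col] != 0:
                f = R[i][col]
                R[i] = [a - f * b for a, b in zip(R[i], R[row])]
        pivots.append(col)
        row += 1
    return R, pivots

def rank_factorization(A):
    """Return (E_hat, R_hat) with A = E_hat * R_hat (hypothetical helper)."""
    R, pivots = rref(A)
    r = len(pivots)
    E_hat = [[Fraction(row[j]) for j in pivots] for row in A]  # m x r, pivot columns of A
    R_hat = R[:r]                                              # r x n, nonzero rows of R
    return E_hat, R_hat

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

# A 3 x 3 example of rank 2 (the second row is twice the first).
A = [[1, 2, 3], [2, 4, 6], [1, 0, 1]]
E_hat, R_hat = rank_factorization(A)
product = matmul(E_hat, R_hat)
```

Here `E_hat` is $3 \times 2$, `R_hat` is $2 \times 3$, and `product` reproduces `A` entry by entry.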
P 2.1.2 (the matrix product rule) This result is basically a consequence of the definition of matrix multiplication. For example, the $ij$-th element of the product $A(\alpha)B(\alpha)$ is given by
\[
\sum_{k=1}^{r} a_{ik}(\alpha) b_{kj}(\alpha),
\]
where $a_{ik}(\alpha)$ is the $ik$-th element of $A$ and $b_{kj}(\alpha)$ is the $kj$-th element of $B$. From this we then have that
\[
\frac{d}{d\alpha} \sum_{k=1}^{r} a_{ik}(\alpha) b_{kj}(\alpha)
= \sum_{k=1}^{r} \frac{d a_{ik}(\alpha)}{d\alpha} b_{kj}(\alpha)
+ \sum_{k=1}^{r} a_{ik}(\alpha) \frac{d b_{kj}(\alpha)}{d\alpha},
\]
which we recognize as the $ij$-th element of $\frac{dA}{d\alpha} B$ plus the $ij$-th element of $A \frac{dB}{d\alpha}$, proving the desired theorem.
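As a numerical sanity check of the product rule, the following sketch compares a central finite difference of $A(\alpha)B(\alpha)$ with $\frac{dA}{d\alpha}B + A\frac{dB}{d\alpha}$. The particular matrix functions, the step size `h`, and the helper names are all assumptions chosen for illustration; they do not come from the text.

```python
import math

# Hypothetical 2 x 2 matrix functions and their hand-computed derivatives.
def A(a):  return [[a, a * a], [1.0, math.sin(a)]]
def dA(a): return [[1.0, 2.0 * a], [0.0, math.cos(a)]]
def B(a):  return [[math.cos(a), 1.0], [a, a ** 3]]
def dB(a): return [[-math.sin(a), 0.0], [1.0, 3.0 * a * a]]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def central_diff(F, a, h=1e-5):
    """Entrywise central difference approximation of dF/da."""
    Fp, Fm = F(a + h), F(a - h)
    return [[(p - m) / (2 * h) for p, m in zip(rp, rm)] for rp, rm in zip(Fp, Fm)]

alpha = 0.7
lhs = central_diff(lambda a: matmul(A(a), B(a)), alpha)               # d(AB)/d(alpha)
rhs = add(matmul(dA(alpha), B(alpha)), matmul(A(alpha), dB(alpha)))   # A'B + AB'
max_err = max(abs(l - r) for rl, rr in zip(lhs, rhs) for l, r in zip(rl, rr))
```

The quantity `max_err` should be small, limited only by the finite difference truncation and rounding error.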
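P 2.1.3 (matrix inverse differentiation) To show this consider the derivative of the expression $A(\alpha)A(\alpha)^{-1} = I$ with respect to $\alpha$. Since the identity does not depend on $\alpha$,
\[
\frac{d}{d\alpha} \left( A(\alpha) A(\alpha)^{-1} \right) = 0.
\]
Using the result of P 2.1.2 we have that
\[
\frac{dA(\alpha)}{d\alpha} A(\alpha)^{-1} + A(\alpha) \frac{dA(\alpha)^{-1}}{d\alpha} = 0,
\]
and solving for $\frac{dA(\alpha)^{-1}}{d\alpha}$ we have that
\[
\frac{dA(\alpha)^{-1}}{d\alpha} = -A(\alpha)^{-1} \frac{dA(\alpha)}{d\alpha} A(\alpha)^{-1},
\]
the desired result.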
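The formula for the derivative of the inverse can be checked in the same finite difference style. This is a minimal sketch under stated assumptions: the $2 \times 2$ matrix function `A`, its derivative `dA`, the closed-form inverse `inv2`, and the step size are hypothetical choices, and $A(\alpha)$ is assumed nonsingular near the chosen $\alpha$.

```python
# Hypothetical 2 x 2 matrix function; det A(0.5) = 7.25, so it is invertible there.
def A(a):  return [[2.0 + a, 1.0], [a * a, 3.0]]
def dA(a): return [[1.0, 0.0], [2.0 * a, 0.0]]

def inv2(M):
    """Closed-form inverse of a nonsingular 2 x 2 matrix."""
    (p, q), (r, s) = M
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def central_diff(F, a, h=1e-5):
    """Entrywise central difference approximation of dF/da."""
    Fp, Fm = F(a + h), F(a - h)
    return [[(p - m) / (2 * h) for p, m in zip(rp, rm)] for rp, rm in zip(Fp, Fm)]

alpha = 0.5
Ainv = inv2(A(alpha))
# Left side: numerical derivative of the inverse.
lhs = central_diff(lambda a: inv2(A(a)), alpha)
# Right side: -A^{-1} (dA/d alpha) A^{-1}.
rhs = [[-x for x in row] for row in matmul(matmul(Ainv, dA(alpha)), Ainv)]
max_err = max(abs(l - r) for rl, rr in zip(lhs, rhs) for l, r in zip(rl, rr))
```

As before, `max_err` is expected to be at the level of the finite difference error.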
P 2.1.4 (the gradient of the matrix inner product) We desire the gradient of the function
\[
\phi(x) = \frac{1}{2} x^T A x - x^T b.
\]
The $i$-th component of the gradient is given by
\[
\frac{\partial \phi}{\partial x_i} = \frac{\partial}{\partial x_i} \left( \frac{1}{2} x^T A x - x^T b \right)
= \frac{1}{2} e_i^T A x + \frac{1}{2} x^T A e_i - e_i^T b,
\]
where $e_i$ is the $i$-th elementary basis vector for $\mathbb{R}^n$, i.e. it has a 1 in the $i$-th position and zeros everywhere else. Now since a scalar equals its own transpose,
\[
x^T A e_i = (x^T A e_i)^T = e_i^T A^T x,
\]
the above becomes
\[
\frac{\partial \phi}{\partial x_i} = \frac{1}{2} e_i^T A^T x + \frac{1}{2} e_i^T A x - e_i^T b
= e_i^T \left( \frac{1}{2} (A + A^T) x - b \right).
\]
Since multiplying by $e_i^T$ on the left selects the $i$-th row from the expression to its right, we see that the full gradient expression is given by
\[
\nabla \phi = \frac{1}{2} (A + A^T) x - b,
\]
as requested in the text. Note that this expression can also be proved easily by writing each term in components.
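The gradient formula can be compared against finite differences of $\phi$. The sketch below uses a deliberately nonsymmetric $A$ so that the symmetrization $\frac{1}{2}(A + A^T)$ actually matters; the test data and the helper names `phi`, `grad_formula`, and `grad_fd` are assumptions made for illustration.

```python
def phi(A, b, x):
    """phi(x) = 1/2 x^T A x - x^T b."""
    n = len(x)
    quad = sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))
    return 0.5 * quad - sum(xi * bi for xi, bi in zip(x, b))

def grad_formula(A, b, x):
    """Gradient from the derivation: 1/2 (A + A^T) x - b."""
    n = len(x)
    return [sum(0.5 * (A[i][j] + A[j][i]) * x[j] for j in range(n)) - b[i]
            for i in range(n)]

def grad_fd(A, b, x, h=1e-4):
    """Central finite difference approximation of the gradient of phi."""
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        g.append((phi(A, b, xp) - phi(A, b, xm)) / (2 * h))
    return g

# Hypothetical nonsymmetric test data.
A = [[2.0, -1.0, 0.5], [3.0, 1.0, 4.0], [0.0, 2.0, -1.0]]
b = [1.0, -2.0, 0.5]
x = [0.3, -1.2, 2.0]

max_err = max(abs(p - q) for p, q in zip(grad_formula(A, b, x), grad_fd(A, b, x)))
```

Because $\phi$ is quadratic, the central difference has no truncation error, so `max_err` reflects rounding only.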
P 2.1.5 (solutions to rank one updates of A) If $x$ solves $(A + uv^T)x = b$, then by the Sherman-Morrison-Woodbury formula (equation 2.1.4 in the book), with $U = u$ and $V = v$ taken to be vectors with $n$ components, we have $x$ given by
\[
x = \left( A^{-1} - A^{-1} u (I + v^T A^{-1} u)^{-1} v^T A^{-1} \right) b
= A^{-1} b - A^{-1} u (I + v^T A^{-1} u)^{-1} v^T A^{-1} b.
\]
Since $u$ and $v$ are vectors, the expression $v^T A^{-1} u$ is a scalar and the $I$ is also a scalar, namely the number 1. Multiplying the above by $A$ on the left gives the linear system that $x$ must satisfy,
\[
Ax = b - u (1 + v^T A^{-1} u)^{-1} v^T A^{-1} b.
\]
In this expression, both $v^T A^{-1} u$ and $v^T A^{-1} b$ are scalars, thus by factoring out the only vector $u$ the above is equivalent to
\[
Ax = b - \frac{v^T A^{-1} b}{1 + v^T A^{-1} u} \, u.
\]
Therefore $x$ is the solution to a modified system given by $Ax = b + \alpha u$, with $\alpha$ given by
\[
\alpha = - \frac{v^T A^{-1} b}{1 + v^T A^{-1} u}.
\]
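The result of P 2.1.5 can be verified exactly on a small example. The sketch below solves $(A + uv^T)x = b$ directly and then solves the modified system $Ax = b + \alpha u$ using only solves with $A$. The test matrices and the helper names `solve` and `dot` are hypothetical, exact rational arithmetic is an assumption made so that the two solutions can be compared with `==`, and both $A$ and $A + uv^T$ are assumed nonsingular.

```python
from fractions import Fraction

def solve(M, rhs):
    """Solve M x = rhs by Gauss-Jordan elimination (M assumed nonsingular)."""
    n = len(M)
    aug = [[Fraction(x) for x in row] + [Fraction(y)] for row, y in zip(M, rhs)]
    for col in range(n):
        # Pick any row at or below the diagonal with a nonzero pivot.
        p = next(i for i in range(col, n) if aug[i][col] != 0)
        aug[col], aug[p] = aug[p], aug[col]
        piv = aug[col][col]
        aug[col] = [x / piv for x in aug[col]]
        for i in range(n):
            if i != col and aug[i][col] != 0:
                f = aug[i][col]
                aug[i] = [a - f * c for a, c in zip(aug[i], aug[col])]
    return [row[n] for row in aug]

def dot(p, q):
    return sum(a * c for a, c in zip(p, q))

# Hypothetical test data: det(A) = 18 and det(A + u v^T) = 23.
A = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]
u = [1, 0, 2]
v = [1, 1, 0]
b = [1, 2, 3]
n = len(b)

# Direct solution of the rank one updated system (A + u v^T) x = b.
A_plus = [[A[i][j] + u[i] * v[j] for j in range(n)] for i in range(n)]
x = solve(A_plus, b)

# Solution via the modified right-hand side, using only solves with A.
Ainv_b = solve(A, b)
Ainv_u = solve(A, u)
alpha = -dot(v, Ainv_b) / (1 + dot(v, Ainv_u))
x_mod = solve(A, [bi + alpha * ui for bi, ui in zip(b, u)])
```

In exact arithmetic `x` and `x_mod` agree. One reason this form can be useful in practice is that a factorization of $A$, once available, can be reused for all three solves.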