Matrix differential calculus
10-725 Optimization
Geoff Gordon
Ryan Tibshirani
Review
• Matrix differentials: sol’n to matrix calculus pain
‣ compact way of writing Taylor expansions, or …
‣ definition:
  ‣ df = a(x; dx) [+ r(dx)]
  ‣ a(x; .) linear in 2nd arg
  ‣ r(dx)/||dx|| → 0 as dx → 0
• d(…) is linear: passes thru +, scalar *
• Generalizes Jacobian, Hessian, gradient, velocity
Geoff Gordon—10-725 Optimization—Fall 2012 2
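The definition of the differential on this slide can be checked numerically. The sketch below is an illustration only: the function f(x) = xᵀx and the names `f`, `a`, and `remainder_ratio` are assumptions chosen for the example, not taken from the slides. For this f, the linear part is a(x; dx) = 2xᵀdx, and the code evaluates the ratio r(dx)/||dx|| along a fixed direction as the step shrinks.

```python
import math

def f(x):
    # Example function (an assumption for illustration): f(x) = x^T x
    return sum(xi * xi for xi in x)

def a(x, dx):
    # Linear part of the differential for this f: a(x; dx) = 2 x^T dx
    return 2.0 * sum(xi * di for xi, di in zip(x, dx))

def remainder_ratio(x, direction, t):
    # r(dx) / ||dx|| for dx = t * direction,
    # where r(dx) = f(x + dx) - f(x) - a(x; dx)
    dx = [t * d for d in direction]
    x_new = [xi + di for xi, di in zip(x, dx)]
    r = f(x_new) - f(x) - a(x, dx)
    norm = math.sqrt(sum(di * di for di in dx))
    return r / norm

x = [1.0, -2.0, 0.5]
direction = [0.3, 0.4, -1.2]

# The ratio should shrink toward 0 as the step size t shrinks.
ratios = [remainder_ratio(x, direction, t) for t in (1e-1, 1e-2, 1e-3)]
for t, ratio in zip((1e-1, 1e-2, 1e-3), ratios):
    print(t, ratio)
```

For this particular f the remainder is exactly dxᵀdx, so the ratio equals ||dx|| and shrinks in proportion to the step size.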
Review
• Chain rule
• Product rule
• Bilinear functions: cross product, Kronecker,
Frobenius, Hadamard, Khatri-Rao, …
• Identities
‣ rules for working with ⊗, tr()
‣ trace rotation
• Identification theorems
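Two of the review items above, trace rotation and the product rule, are easy to sanity-check on small matrices. The following is a minimal sketch using plain Python lists; the helper names (`matmul`, `add`, `sub`, `scale`, `trace`) and all matrix values are hypothetical and chosen only for illustration. It checks that tr(ABC) = tr(BCA) = tr(CAB), and that d(XY) = dX Y + X dY matches the exact change in XY up to a second-order remainder.

```python
def matmul(A, B):
    # Plain-Python matrix product for matrices stored as lists of rows
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def add(A, B):
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(A, B)]

def sub(A, B):
    return [[p - q for p, q in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(t, A):
    return [[t * p for p in row] for row in A]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

# Trace rotation: tr(ABC) = tr(BCA) = tr(CAB), even for non-square factors
A = [[1, 2, 3], [4, 5, 6]]      # 2 x 3
B = [[1, 0], [2, -1], [0, 3]]   # 3 x 2
C = [[2, 1], [-1, 4]]           # 2 x 2
t_abc = trace(matmul(matmul(A, B), C))
t_bca = trace(matmul(matmul(B, C), A))
t_cab = trace(matmul(matmul(C, A), B))

# Product rule: d(XY) = dX Y + X dY, with remainder dX dY (second order)
X = [[1.0, 2.0], [0.5, -1.0]]
Y = [[0.0, 1.0], [3.0, 2.0]]
U = [[1.0, 0.0], [2.0, -1.0]]   # direction of the perturbation of X
V = [[0.0, 1.0], [1.0, 1.0]]    # direction of the perturbation of Y
t = 1e-3
dX, dY = scale(t, U), scale(t, V)

exact_change = sub(matmul(add(X, dX), add(Y, dY)), matmul(X, Y))
linear_part = add(matmul(dX, Y), matmul(X, dY))
remainder = sub(exact_change, linear_part)
max_remainder = max(abs(v) for row in remainder for v in row)

print(t_abc, t_bca, t_cab)
print(max_remainder)
```

The remainder here is the product dX dY, which scales with t², so it is negligible next to the linear part for small perturbations.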
Finding a maximum
or minimum, or saddle point
ID for df(x)   scalar x      vector x       matrix X
scalar f       df = a dx     df = a^T dx    df = tr(A^T dX)
vector f       df = a dx     df = A dx
matrix F       dF = A dx
[Figure: plot; horizontal axis from -3 to 4, vertical axis from -1 to 1.5]
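As an illustration of how the scalar f, matrix X entry of the identification table can be used, consider f(X) = aᵀXb. This function, the vectors a and b, and the helper names below are assumptions for the example, not taken from the slides. Taking the differential gives df = aᵀ dX b, and trace rotation rewrites this as tr(b aᵀ dX) = tr((a bᵀ)ᵀ dX), so the table identifies A = a bᵀ as the gradient. The sketch compares that result with a finite-difference estimate.

```python
def f(X, a, b):
    # Example function (an assumption for illustration): f(X) = a^T X b
    return sum(a[i] * X[i][j] * b[j]
               for i in range(len(a)) for j in range(len(b)))

def identified_gradient(a, b):
    # df = a^T dX b = tr(b a^T dX) = tr((a b^T)^T dX), so A = a b^T
    return [[ai * bj for bj in b] for ai in a]

def finite_difference_gradient(X, a, b, h=1e-6):
    # Forward-difference estimate of df/dX[i][j], one entry at a time
    base = f(X, a, b)
    G = []
    for i in range(len(X)):
        row = []
        for j in range(len(X[0])):
            Xp = [r[:] for r in X]   # copy X, then perturb one entry
            Xp[i][j] += h
            row.append((f(Xp, a, b) - base) / h)
        G.append(row)
    return G

a = [1.0, 2.0]
b = [3.0, -1.0, 0.5]
X = [[0.2, -0.7, 1.1], [0.9, 0.4, -0.3]]

A = identified_gradient(a, b)
G = finite_difference_gradient(X, a, b)
max_gap = max(abs(A[i][j] - G[i][j])
              for i in range(len(a)) for j in range(len(b)))
print(max_gap)
```

Because this f is linear in X, the finite-difference estimate agrees with a bᵀ up to rounding error; for a nonlinear f the gap would also include a term that shrinks with h.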