Substitute Lecture Mike O'Neil

7.3 Constrained Optimization

Recap [Quadratic form] Assume A is symmetric. The associated quadratic form is

x^TAx

Examples

Recap [Change of Variables] to principal axes

Let A be an n\times n symmetric matrix. Then there exists an orthogonal change of variables x=Py that transforms x^TAx into y^TDy with no cross-product terms.

Example What values does x^2-2y^2 take on the unit circle?

Terminology: Let m=\min\{x^TAx : \|x\|=1\},\ M=\max\{x^TAx : \|x\|=1\}.

Theorem 6 Let A be a symmetric matrix with m and M defined as above. Then M is the greatest eigenvalue \lambda_1 of A and m is the least eigenvalue of A. The value of x^TAx is M when x is a unit eigenvector u_1 corresponding to M, and is m when x is a unit eigenvector corresponding to m.

Example Maximize and minimize 2x^2+2xy+2y^2 on the unit circle. Eigenvectors: (1,1), (1,-1); eigenvalues: 3, 1.

Proof Diagonalize A=PDP^{-1} with P orthogonal (P^TP=I), and set x=Py, so \|x\|=\|y\|. Then x^TAx=y^TDy=\sum_i \lambda_i y_i^2\le \lambda_1 y^Ty=\lambda_1 x^Tx=\lambda_1 when \|x\|=1. The bound is attained at x=Pe_1=u_1. The argument for m is similar.
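A quick numerical sanity check of Theorem 6 on the example above (a sketch, assuming NumPy is available): sample the unit circle and compare the extreme values of the quadratic form with the eigenvalues.

```python
import numpy as np

# Quadratic form 2x^2 + 2xy + 2y^2 corresponds to the symmetric matrix
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Eigenvalues of A bound x^T A x on the unit circle (Theorem 6).
lam = np.linalg.eigvalsh(A)                   # ascending: [1., 3.]

# Sample unit vectors and evaluate the quadratic form directly.
theta = np.linspace(0, 2 * np.pi, 1000)
X = np.stack([np.cos(theta), np.sin(theta)])  # columns are unit vectors
q = np.einsum('ik,ij,jk->k', X, A, X)         # q[k] = x_k^T A x_k

print(q.min(), q.max())   # approximately 1 and 3, the eigenvalues of A
```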

Theorem 8 Let A be a symmetric n\times n matrix with orthogonal diagonalization A=PDP^{-1}, where D=\mathrm{diag}(\lambda_1,\dots,\lambda_n) with \lambda_1\ge\cdots\ge\lambda_n, and P=(u_1,\dots,u_n) satisfies P^TP=I.

Then for 2\le k\le n, the maximum value of x^TAx subject to the constraints x^Tx=1, x^Tu_1=0,\dots,x^Tu_{k-1}=0 is the eigenvalue \lambda_k, attained at x=u_k.

Example Maximize and minimize 2x^2+2xy+2y^2 subject to \|x\|=1 and x+y=0. Eigenvectors: (1,1), (1,-1); eigenvalues: 3, 1.

Example [Parks and Rec.] Suppose that building x hundred miles of road and improving y acres of parkland is subject to the budget constraint

4x^2+9y^2\le36

(Contrived--why? Negative road building?)

Utility q(x,y)=xy, indifference curve.

Renormalize to unit circle.
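The renormalization can be checked numerically (a sketch, assuming NumPy): substituting x=3u, y=2v turns the budget boundary 4x^2+9y^2=36 into the unit circle u^2+v^2=1 and the utility into a quadratic form, so Theorem 6 applies.

```python
import numpy as np

# With x = 3u, y = 2v the utility becomes q = xy = 6uv = (u,v) M (u,v)^T
# for the symmetric matrix
M = np.array([[0.0, 3.0],
              [3.0, 0.0]])

# By Theorem 6, the maximum utility on the budget boundary is the
# largest eigenvalue of M.
print(np.linalg.eigvalsh(M)[-1])   # 3.0
```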

7.4 Singular value decomposition

It seems we can understand symmetric linear maps in terms of their effect along certain principal axes (circles map to ellipses). What about general (even rectangular) linear maps?

Example Consider

A=\begin{pmatrix} 4 & 11 & 14\\ 8 & 7 & -2 \end{pmatrix}

A maps the unit sphere \{x\in\mathbb R^3:\|x\|=1\} onto an ellipse, with major semi-axis at (18,6) and minor semi-axis at (3,-9).

What can we say about \|Ax\|? Leads naturally to A^TA, which is symmetric.

A^TA=\begin{pmatrix} 80&100&40\\ 100&170&140\\ 40&140&200\end{pmatrix}

Eigenvalues 360, 90, 0.

Eigenvectors v_1=(1/3,2/3,2/3), v_2=(-2/3,-1/3,2/3), v_3=(2/3,-2/3,1/3)

The maximum value of \|Ax\|^2 subject to \|x\|=1 is \lambda_1=360, attained at x=v_1. Which point Av_1 attains it? What is the maximum value of \|Ax\|?
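A minimal NumPy check of these questions (assuming NumPy) for the 2\times 3 example above:

```python
import numpy as np

A = np.array([[4.0, 11.0, 14.0],
              [8.0,  7.0, -2.0]])

lam, V = np.linalg.eigh(A.T @ A)   # eigenvalues ascending: ~ [0, 90, 360]
v1 = V[:, -1]                      # unit eigenvector for lambda_1 = 360

# Maximum of ||Ax|| over the unit sphere is sqrt(lambda_1) = 6*sqrt(10)
print(A @ v1)                      # +/- (18, 6)
print(np.linalg.norm(A @ v1))      # ~ 18.974
```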

Singular Values

Let A be an m\times n matrix.

Definition [Singular value] The square roots of the eigenvalues of A^TA are called the singular values of A, written \sigma_i=\sqrt{\lambda_i}, usually ordered \sigma_1\ge\cdots\ge\sigma_n.

What size is A^TA? (n\times n, so there are n singular values.) Why can we take square roots? (\lambda_i=\|Av_i\|^2\ge 0 for a unit eigenvector v_i.)

Example

Notice: Av_1\perp Av_2!

Theorem Suppose \{v_1,\dots,v_n\} is an orthonormal basis of \mathbb R^n consisting of eigenvectors of A^TA, arranged so that the corresponding eigenvalues of A^TA satisfy \lambda_1\ge\cdots\ge\lambda_n. Also suppose that A^TA has r nonzero eigenvalues (equivalently, A has r nonzero singular values).

Then \{Av_1,\dots,Av_r\} is an orthogonal basis for \mathrm{Col}\,A, and \mathrm{rank}\,A=r.

Proof For i\ne j, (Av_i)\cdot(Av_j)=v_i^TA^TAv_j=\lambda_j\, v_i\cdot v_j=0, so the Av_i are orthogonal. Also \|Av_i\|^2=\lambda_i, so Av_1,\dots,Av_r are nonzero while Av_{r+1},\dots,Av_n are zero. Any y\in\mathrm{Col}\,A is y=Ax with x=\sum_i c_iv_i, so y=\sum_{i\le r}c_i\,Av_i. Hence \{Av_1,\dots,Av_r\} is an orthogonal basis for \mathrm{Col}\,A, and \mathrm{rank}\,A=\dim\mathrm{Col}\,A=r.

Remark [Numerics] Rank is hard to compute numerically: row echelon form accumulates roundoff. The SVD comes to the rescue: count the singular values above a tolerance (the "effective rank"). (But rank determination is in general not a simple problem.)
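A sketch of effective rank via singular values (assuming NumPy; the roundoff-scale perturbation is an assumed toy example):

```python
import numpy as np

# A rank-1 matrix, perturbed at roundoff scale.
A = np.array([[ 1.0, -1.0],
              [-2.0,  2.0],
              [ 2.0, -2.0]])
A_noisy = A + 1e-16 * np.random.default_rng(0).standard_normal(A.shape)

s = np.linalg.svd(A_noisy, compute_uv=False)     # ~ [sqrt(18), ~0]
# Effective rank: singular values above a tolerance relative to s[0]
# (this is numpy.linalg.matrix_rank's default tolerance).
tol = max(A.shape) * np.finfo(float).eps * s[0]
print(int((s > tol).sum()))                      # 1
```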

The Singular Value Decomposition

The previous theorem captures the column space of A by applying A to the eigenvectors of A^TA. So it is natural to project onto the eigenvector basis (of A^TA), apply the stretches \sigma_i, and land in the column space.

Theorem [Singular Value Decomposition] Let A be an m\times n matrix with rank r. Then there exist an m\times n matrix \Sigma whose only nonzero entries are the diagonal entries \sigma_1\ge\cdots\ge\sigma_r>0 (the nonzero singular values of A), an m\times m orthogonal matrix U, and an n\times n orthogonal matrix V, such that

A=U\Sigma V^T

Remark [Uniqueness] Any factorization of this form is called an SVD of A. \Sigma is uniquely determined by A; U and V are not.

Definition [Singular vectors] The columns of U are called left singular vectors of A, and the columns of V are called right singular vectors of A.

============END OF CLASS=============

Proof

Example Create the SVD from the example.
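As a numerical companion (assuming NumPy), the SVD of the running 2\times 3 example can be computed and checked directly:

```python
import numpy as np

A = np.array([[4.0, 11.0, 14.0],
              [8.0,  7.0, -2.0]])

U, s, Vt = np.linalg.svd(A)            # full SVD: U is 2x2, Vt is 3x3
print(s)                               # ~ [18.974, 9.487] = [sqrt(360), sqrt(90)]

Sigma = np.zeros(A.shape)              # rebuild the 2x3 Sigma
np.fill_diagonal(Sigma, s)
print(np.allclose(U @ Sigma @ Vt, A))  # True
```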

Example 2 Find an SVD of \begin{pmatrix} 1 & 1\\ 0 & 1\\ -1 & 1\end{pmatrix}. Here A^TA=\mathrm{diag}(2,3).

Example 3 Find an SVD of \begin{pmatrix} 1 & -1\\ -2 & 2\\ 2 & -2\end{pmatrix}. Here A^TA=\begin{pmatrix} 9 & -9\\ -9 & 9\end{pmatrix}, with eigenvalues 18 and 0.

Remark [Numerics] Forming A^TA amplifies roundoff errors, so this route is not used in real-world matrix code.

Applications

Definition [Condition number] For an invertible A, the condition number is \sigma_1/\sigma_n. It measures digit loss through cancellation: roughly \log_{10}(\sigma_1/\sigma_n) base-10 digits are lost.
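A minimal sketch (assuming NumPy) on an assumed nearly-singular 2\times 2 example:

```python
import numpy as np

# An almost-singular matrix: its columns are nearly parallel.
A = np.array([[1.0, 1.0],
              [1.0, 1.0001]])

s = np.linalg.svd(A, compute_uv=False)
kappa = s[0] / s[-1]
print(kappa)                                  # large (~4e4)
print(np.isclose(kappa, np.linalg.cond(A)))   # cond() computes the same ratio
```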

Remark [Connections of fundamental spaces] Start with an SVD A=U\Sigma V^T of an m\times n matrix A with rank r.

Then: \{u_1,\dots,u_r\} is an orthonormal basis for \mathrm{Col}\,A; \{u_{r+1},\dots,u_m\} is an orthonormal basis for (\mathrm{Col}\,A)^\perp=\mathrm{Nul}\,A^T; \{v_1,\dots,v_r\} is an orthonormal basis for \mathrm{Row}\,A; \{v_{r+1},\dots,v_n\} is an orthonormal basis for \mathrm{Nul}\,A.

Theorem [Invertible Matrix] Let A be an n\times n matrix. All of the following are equivalent to A being invertible: (\mathrm{Col}\,A)^\perp=\{0\}; (\mathrm{Nul}\,A)^\perp=\mathbb R^n; \mathrm{Row}\,A=\mathbb R^n; A has n nonzero singular values.

Definition [Reduced SVD] Chop the zero blocks of \Sigma: A=U_rDV_r^T, where U_r is m\times r, V_r is n\times r (orthonormal columns), and D=\mathrm{diag}(\sigma_1,\dots,\sigma_r) is invertible.

Question: How to invert the SVD?

Definition [Moore-Penrose Pseudoinverse] A^+=V_rD^{-1}U_r^T, built from the reduced SVD A=U_rDV_r^T.

Remark [Least squares solutions] Let \hat x=A^+b. Then A\hat x=U_rDV_r^TV_rD^{-1}U_r^Tb=U_rU_r^Tb, the orthogonal projection of b onto the column space of A, so \hat x is a least-squares solution (Thm 10, Sec. 6.3).
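A sketch of this remark in NumPy (assuming NumPy; the right-hand side b is an arbitrary assumed example), using the matrix from Example 2:

```python
import numpy as np

A = np.array([[ 1.0, 1.0],
              [ 0.0, 1.0],
              [-1.0, 1.0]])
b = np.array([1.0, 2.0, 3.0])   # assumed toy right-hand side

x_hat = np.linalg.pinv(A) @ b                  # Moore-Penrose solution A^+ b
x_ls, *_ = np.linalg.lstsq(A, b, rcond=None)   # direct least-squares solve
print(np.allclose(x_hat, x_ls))                # True

# A x_hat is the projection of b onto Col A, so the residual
# b - A x_hat is orthogonal to every column of A.
print(np.allclose(A.T @ (b - A @ x_hat), 0))   # True
```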