# Substitute Lecture Mike O'Neil

# 7.3 Constrained Optimization

**Recap** [Quadratic form] Assume A symmetric.

**Examples**

x^2+y^2 (PD), x^2-y^2 (ID), x^2 (PSD), -x^2-0.5y^2 (ND)

2x^2+2xy+2y^2 eigenvectors: (1,1)/\sqrt2, (1,-1)/\sqrt2 eigenvalues 3,1
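The eigendata above can be checked numerically; a minimal sketch (assuming numpy is available), using the matrix \begin{pmatrix}2&1\\1&2\end{pmatrix} of the quadratic form:

```python
import numpy as np

# Matrix of the quadratic form 2x^2 + 2xy + 2y^2
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eigh is for symmetric matrices; eigenvalues come back in ascending order
eigvals, eigvecs = np.linalg.eigh(A)
print(eigvals)   # eigenvalues 1 and 3
# Columns of eigvecs are unit eigenvectors, (1,-1)/sqrt(2) and (1,1)/sqrt(2),
# determined only up to sign.
```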

**Recap** [Change of Variables] to principal axes

Let A be an n\times n symmetric matrix. Then there exists an orthogonal change of variables x=Py that transforms x^TAx into y^TDy with no cross-product terms.

**Example** What values does x^2-2y^2 take on the unit circle?

Terminology: Let m=\min\{ x^TAx:\|x\|=1\} and M=\max\{ x^TAx:\|x\|=1\}.

**Theorem 6** Let A be a symmetric matrix with m and M defined as above.

Then M is the greatest eigenvalue and m is the least eigenvalue of A (as a real number, not in magnitude).

x^TAx=M \text{ when } x \text{ is a unit eigenvector corresponding to } M

x^TAx=m \text{ when } x \text{ is a unit eigenvector corresponding to } m

**Example** Maximize and minimize 2x^2+2xy+2y^2 on the unit circle; eigenvectors (1,1), (1,-1), eigenvalues 3, 1.

**Proof** Diagonalize A=PDP^{-1} with P^TP=I, and set x=Py, so \|x\|=\|y\|. Then x^TAx=y^TDy\le \lambda_1 y^Ty=\lambda_1 x^Tx=\lambda_1. Picking x=Pe_1=u_1 achieves the bound. Similarly for m.
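Theorem 6 can also be sanity-checked by brute force: sample the unit circle and evaluate the quadratic form (a minimal sketch, assuming numpy):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])   # eigenvalues 3 and 1

# Sample unit vectors x = (cos t, sin t) and evaluate x^T A x
t = np.linspace(0.0, 2.0 * np.pi, 1000)
X = np.stack([np.cos(t), np.sin(t)])   # columns are unit vectors
q = np.sum(X * (A @ X), axis=0)        # x^T A x for each column

print(q.max(), q.min())   # approximately 3 and 1, the extreme eigenvalues
```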

**Theorem 8** Let A be a symmetric n\times n matrix with orthogonal diagonalization A=PDP^{-1}, where D=\mathrm{diag}(\lambda_1,\dots,\lambda_n) with \lambda_1\ge\cdots\ge\lambda_n, P=(u_1,\dots,u_n), and P^TP=I.

Then the maximum value of x^TAx subject to the constraints x^Tx=1, x^Tu_1=0,\dots,x^Tu_{k-1}=0 is the eigenvalue \lambda_k, attained at x=u_k.

**Example** Maximize and minimize 2x^2+2xy+2y^2 subject to \|x\|=1 and x+y=0; eigenvectors (1,1), (1,-1), eigenvalues 3, 1.
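For this constrained example the feasible set is tiny, which makes Theorem 8 easy to verify directly (a sketch, assuming numpy):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# On the unit circle, x + y = 0 leaves only x = ±(1,-1)/sqrt(2),
# the unit eigenvectors for the second eigenvalue.
x = np.array([1.0, -1.0]) / np.sqrt(2.0)
print(x @ A @ x)   # 1.0 (within rounding), the second eigenvalue
```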

**Example** [Parks and Rec.] Suppose the cost of building x hundred miles of road and improving y acres of parks is

(Contrived--why? Negative road building?)

Utility q(x,y)=xy, indifference curve.

Renormalize to unit circle.

# 7.4 Singular value decomposition

Seems we can understand *symmetric* linear maps in terms of their effect on circles along certain axes. What about general (even rectangular) linear maps?

**Example** Consider

The unit sphere in \mathbb{R}^3 maps to an ellipse in \mathbb{R}^2: major semi-axis (18,6), minor semi-axis (3,-9).

What can we say about \|Ax\|? Leads naturally to A^TA, which is symmetric.

Eigenvalues 360, 90, 0.

Eigenvectors v_1=(1/3,2/3,2/3), v_2=(-2/3,-1/3,2/3), v_3=(2/3,-2/3,1/3)

The maximum value of \|Ax\|^2 on the unit sphere is attained at x=v_1. Which point Av_1 is reached there? What is the maximum value of \|Ax\|?
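The example matrix is not written out above; the 2\times 3 matrix below is an assumption, chosen because it is consistent with all the data quoted (eigenvalues 360, 90, 0 of A^TA and the ellipse axes (18,6), (3,-9)). A numerical sketch with numpy:

```python
import numpy as np

# Assumed example matrix (consistent with the numbers quoted above)
A = np.array([[4.0, 11.0, 14.0],
              [8.0,  7.0, -2.0]])

w, V = np.linalg.eigh(A.T @ A)   # ascending order: 0, 90, 360
v1 = V[:, -1]                    # unit eigenvector for lambda_1 = 360

print(np.linalg.norm(A @ v1) ** 2)   # 360, the max of ||Ax||^2 over ||x|| = 1
print(np.linalg.norm(A @ v1))        # sqrt(360) = 6*sqrt(10), the max of ||Ax||
```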

## Singular Values

Let A be an m\times n matrix.

**Definition** [Singular value] Square roots of the eigenvalues of A^TA are called *singular values* of A, written \sigma_i=\sqrt{\lambda_i}. Usually ordered \sigma_1\ge \cdots \ge \sigma_n.

What size is A^TA? Why can I take square roots? (\lambda_i=\|Av_i\|^2\ge 0)

**Example**

- Singular values from the examples above?
- When is the second singular value attained? (orthogonal to v_1)

Notice: Av_1\perp Av_2!
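Both observations check out numerically. A sketch with numpy; the matrix A below is an assumption, consistent with the eigenvalues and eigenvectors quoted above:

```python
import numpy as np

# Assumed example matrix (consistent with the eigendata quoted above)
A = np.array([[4.0, 11.0, 14.0],
              [8.0,  7.0, -2.0]])

s = np.linalg.svd(A, compute_uv=False)
print(s)   # [sqrt(360), sqrt(90)] = [6*sqrt(10), 3*sqrt(10)]

v1 = np.array([1.0, 2.0, 2.0]) / 3.0
v2 = np.array([-2.0, -1.0, 2.0]) / 3.0
print((A @ v1) @ (A @ v2))   # 0 (within rounding): Av1 is orthogonal to Av2
```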

**Theorem** Suppose \{v_1,\dots,v_n\} is an orthonormal basis of \mathbb R^n consisting of eigenvectors of A^T A, arranged so that the corresponding eigenvalues of A^TA satisfy \lambda_1\ge \cdots\ge \lambda_n. Also suppose that A has r nonzero singular values (equivalently, A^TA has r nonzero eigenvalues).

Then:

\{Av_1,\dots,Av_r\} is an orthogonal basis for \mathrm{Col} A.

\mathrm{rank} A=r.

**Proof**

Calculate (Av_i)^T(Av_j)=v_i^TA^TAv_j=\lambda_j v_i^Tv_j: zero for i\ne j, and \lambda_i for i=j.

Av_i\ne 0\Leftrightarrow 1\le i\le r.

So Av_1,\dots,Av_r are linearly independent.

They are in \mathrm{Col} A--obviously.

They span \mathrm{Col} A: pick y=Ax\in\mathrm{Col} A, expand x=c_1v_1+\cdots+c_nv_n, and note y=c_1Av_1+\cdots+c_rAv_r since Av_i=0 for i>r.

Thus \mathrm{rank}A =r.

**Remark** [Numerics] Rank is hard to get numerically. Row echelon form accumulates roundoff. SVD to the rescue! (Also: effective rank) (But: in general not a simple problem.)
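A minimal illustration of effective rank with numpy (matrix_rank thresholds the singular values; the cutoff 1e-6 below is an arbitrary choice):

```python
import numpy as np

# Rank in floating point is a judgment call: do the 1e-10 entries count as zero?
A = np.diag([1.0, 1e-10, 1e-10])

print(np.linalg.matrix_rank(A))             # 3 under the default tolerance
print(np.linalg.matrix_rank(A, tol=1e-6))   # 1: the "effective rank"
```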

## The Singular Value Decomposition

The previous theorem captured the column space of A by applying A to the eigenvectors of A^TA. So it is natural to project onto those eigenvectors, apply stretches, and land in the column space.

**Theorem** [Singular Value Decomposition] Let A be an m\times n matrix with rank r.

Then there exists an m\times n matrix \Sigma that is zero outside an r\times r diagonal subblock D in its upper left corner, where D=\mathrm{diag}(\sigma_1, \dots,\sigma_r) with \sigma_1\ge\cdots\ge\sigma_r>0.

There also exists an m\times m orthogonal matrix U

There also exists an n\times n orthogonal matrix V

such that

A=U\Sigma V^T

**Remark** [Uniqueness] Anything that looks like this is called an SVD. \Sigma unique, U, V not.
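The shapes in the theorem can be checked with numpy's built-in SVD (a sketch on a random 4\times 3 matrix):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 3))      # m = 4, n = 3

U, s, Vt = np.linalg.svd(A)          # U is 4x4, Vt is 3x3, s has 3 entries
Sigma = np.zeros((4, 3))             # zero outside the diagonal subblock
Sigma[:3, :3] = np.diag(s)

print(np.allclose(U @ Sigma @ Vt, A))      # A = U Sigma V^T
print(np.allclose(U.T @ U, np.eye(4)))     # U orthogonal
print(np.allclose(Vt @ Vt.T, np.eye(3)))   # V orthogonal
```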

**Definition** [Singular vectors]

Columns of U: left singular vectors

Columns of V: right singular vectors

*============END OF CLASS=============*

**Proof**

Let u_i=Av_i/\|Av_i\|=\frac{1}{\sigma_i}Av_i for i=1,\dots,r.

- Extend \{u_1,\dots,u_r\} to an orthonormal basis of \mathbb{R}^m.
- Create orthogonal matrices U and V; the sizes match.

Check that U\Sigma=AV; since V is invertible with V^{-1}=V^T, this gives A=U\Sigma V^T.

**Example** Create the SVD from the example.

**Example 2** Find an SVD of \begin{pmatrix} 1 & 1\\ 0 & 1\\ -1 & 1\end{pmatrix}. A^TA=\mathrm{diag}(2,3).
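The proof's recipe can be followed literally for this matrix (a sketch, assuming numpy; eigenvectors, and hence the u_i, are only determined up to sign):

```python
import numpy as np

A = np.array([[ 1.0, 1.0],
              [ 0.0, 1.0],
              [-1.0, 1.0]])

# Step 1: eigen-decompose A^T A and sort eigenvalues in decreasing order
w, V = np.linalg.eigh(A.T @ A)
order = np.argsort(w)[::-1]
w, V = w[order], V[:, order]
sigma = np.sqrt(w)                 # singular values sqrt(3), sqrt(2)

# Step 2: u_i = A v_i / sigma_i (all sigma_i are nonzero here, so r = n = 2)
U_r = (A @ V) / sigma              # divides column i by sigma[i]

# Step 3: check the (reduced) factorization reconstructs A
print(np.allclose((U_r * sigma) @ V.T, A))   # True
```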

**Example 3** Find an SVD of \begin{pmatrix} 1 & -1\\-2 & 2 \\ 2 & -2\end{pmatrix}. A^TA=\begin{pmatrix} 9 & -9\\ -9 & 9\end{pmatrix}. Eigenvalues are 18, 0.

**Remark** [Numerics] Using A^TA amplifies errors. Not used in real-world matrix code.
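A sketch of why: forming A^TA squares the singular values, so a small \sigma can fall below machine precision relative to \sigma_1^2 and be destroyed by rounding (the matrix below is a contrived assumption for illustration):

```python
import numpy as np

eps = 1e-9
A = np.array([[1.0, 1.0],
              [0.0, eps]])   # true sigma_2 is about eps/sqrt(2)

# Route 1: sqrt of the eigenvalues of A^T A. Here sigma_2^2 ~ 5e-19 lies far
# below machine precision relative to ||A^T A||, so it is lost to rounding.
w = np.linalg.eigvalsh(A.T @ A)
sigma_from_eig = np.sqrt(np.abs(w))[::-1]

# Route 2: an SVD works on A directly and keeps absolute accuracy ~1e-16.
sigma_from_svd = np.linalg.svd(A, compute_uv=False)

print(sigma_from_svd[1])   # ~7.07e-10, correct
print(sigma_from_eig[1])   # wrong at this scale
```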

## Applications

**Definition** [Condition number] \kappa(A)=\sigma_1/\sigma_n. Measures digit loss through cancellation: roughly \log_{10}(\sigma_1/\sigma_n) base-10 digits are lost.
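A quick sketch with numpy (np.linalg.cond defaults to the 2-norm condition number, i.e. \sigma_1/\sigma_n; the matrix is a contrived example):

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 1e-8]])

s = np.linalg.svd(A, compute_uv=False)
print(s[0] / s[-1])        # 1e8: expect to lose about 8 decimal digits
print(np.linalg.cond(A))   # the same ratio, computed by numpy
```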

**Remark** [Connections of fundamental spaces] Start with:

Col A is spanned by u_1,\dots,u_r (Thm 9)

(Col A)^\perp=Nul A^T is spanned by u_i (i>r) (Thm 3, Sec. 6.1)

Nul A is spanned by v_i (i>r)

Row A=(Nul A)^\perp is spanned by v_1,\dots,v_r

Then:

- A maps Row A to Col A=Row A^T
- A maps Nul A to 0
- Nul A^T is left over
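These subspace facts can be checked on the rank-1 matrix of Example 3 (a sketch, assuming numpy; the cutoff 1e-12 for "nonzero" singular values is an arbitrary choice):

```python
import numpy as np

A = np.array([[ 1.0, -1.0],
              [-2.0,  2.0],
              [ 2.0, -2.0]])   # rank 1, sigma_1 = sqrt(18)

U, s, Vt = np.linalg.svd(A)
r = int(np.sum(s > 1e-12))     # r = 1

print(np.linalg.norm(A @ Vt[r:].T))     # ~0: the v_i with i > r span Nul A
print(np.linalg.norm(A.T @ U[:, r:]))   # ~0: the u_i with i > r span Nul A^T
```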

**Theorem** [Invertible Matrix] Let A be an n\times n matrix. All of the following are equivalent:

- (\mathrm{Col} A)^\perp=\{0\}
- (\mathrm{Nul} A)^\perp=\mathbb{R}^n
- Row A=\mathbb{R}^n
- A has n nonzero singular values

**Definition** [Reduced SVD] A=U_rDV_r^T, where U_r and V_r keep only the first r columns of U and V (the zero blocks of \Sigma are chopped off).

Question: How to invert the SVD?

**Definition** [Moore-Penrose Pseudoinverse] A^+=V_rD^{-1}U_r^T.

**Remark** [Least squares solutions] Consider \hat x=A^+b. Then A\hat x=U_rU_r^Tb is the orthogonal projection of b onto the column space of A, so \hat x is a least-squares solution (Thm 10, Sec. 6.3).
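Numerically, numpy's pinv builds A^+ from the reduced SVD, and the remark can be checked against a least-squares solver (a sketch using Example 2's matrix; b is an arbitrary test vector):

```python
import numpy as np

A = np.array([[ 1.0, 1.0],
              [ 0.0, 1.0],
              [-1.0, 1.0]])     # Example 2's matrix, full column rank
b = np.array([1.0, 2.0, 3.0])   # arbitrary right-hand side

x_pinv = np.linalg.pinv(A) @ b                    # Moore-Penrose solution
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)   # least-squares solution

print(x_pinv)                         # both equal (-1, 2) here
print(np.allclose(x_pinv, x_lstsq))   # True
```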