Solving Least-Squares Problems

import numpy as np
import numpy.linalg as la
import scipy.linalg as spla

m = 6
n = 4

A = np.random.randn(m, n)
b = np.random.randn(m)

Let's try solving that as a linear system using la.solve:

la.solve(A, b)

---------------------------------------------------------------------------
LinAlgError                               Traceback (most recent call last)
<ipython-input-26-32000b03e56a> in <module>()
----> 1 la.solve(A, b)

/usr/lib/python3/dist-packages/numpy/linalg/linalg.py in solve(a, b)
    353     a, _ = _makearray(a)
    354     _assertRankAtLeast2(a)
--> 355     _assertNdSquareness(a)
    356     b, wrap = _makearray(b)
    357     t, result_t = _commonType(a, b)

/usr/lib/python3/dist-packages/numpy/linalg/linalg.py in _assertNdSquareness(*arrays)
    210     for a in arrays:
    211         if max(a.shape[-2:]) != min(a.shape[-2:]):
--> 212             raise LinAlgError('Last 2 dimensions of the array must be square')
    213 
    214 def _assertFinite(*arrays):

LinAlgError: Last 2 dimensions of the array must be square

OK, let's do QR-based least-squares then.

Q, R = la.qr(A)

What did we get? Full QR or reduced QR?

Q.shape

(6, 4)

R.shape

(4, 4)

Is that a problem?

Do we really need the bottom part of $R$? (A bunch of zeros)
Do we really need the far right part of $Q$? (=the bottom part of $Q^T$)

OK, so find the minimizing $x$:

x = spla.solve_triangular(R, Q.T.dot(b), lower=False)

We predicted that $\|Ax-b\|_2$ would be the same as $\|Rx-Q^Tb\|_2$:

la.norm(A.dot(x)-b, 2)

1.4448079009090737

la.norm(R.dot(x) - Q.T.dot(b))

1.5700924586837752e-16

Heh--reduced QR left out the right half of Q. Let's try again with complete QR:

Q2, R2 = la.qr(A, mode="complete")

x2 = spla.solve_triangular(R[:n], Q.T[:n].dot(b), lower=False)

la.norm(A.dot(x)-b, 2)

1.4448079009090737

la.norm(R2.dot(x2) - Q2.T.dot(b))

1.444807900909074

Did we get the same x both times?

x - x2

array([ 0.,  0.,  0.,  0.])

Finally, let's compare against the normal equations:

x3 = la.solve(A.T.dot(A), A.T.dot(b))

x3 - x

array([  4.99600361e-16,  -1.66533454e-16,   7.77156117e-16,
        -1.11022302e-16])