Computing QR factorizations

To compute an LU factorization, we follow elimination rules to introduce zeros into the lower triangle of the matrix, leaving only the \(\mathbf{U}\) factor. The row operations are themselves triangular and can be combined into the \(\mathbf{L}\) factor. For the QR factorization, the game is again to introduce zeros into the lower triangle, but the rules have changed; now the row operations must be done orthogonally. Thanks to the orthogonal matrix theorem, the product of orthogonal operations will also be orthogonal, providing us with the final \(\mathbf{Q}\) of the QR.

Householder reflections

A Householder reflector is a particular type of orthogonal matrix \(\mathbf{P}\). The reflection is customized for a particular given vector \(\mathbf{z}\) so that \(\mathbf{P}\mathbf{z}\) is nonzero only in the first element. Since orthogonal matrices preserve the 2-norm, we must have

(71)\[\begin{split}\mathbf{P}\mathbf{z} = \begin{bmatrix} \pm \| \mathbf{z} \|\\0 \\ \vdots \\ 0 \end{bmatrix} = \pm \| \mathbf{z} \| \mathbf{e}_1.\end{split}\]

(Recall that \(\mathbf{e}_k\) is the \(k\)th column of the identity matrix.) We choose the positive sign for our discussion, but see qrfact and an exercise for important computational details.

Given \(\mathbf{z}\), let

(72)\[ \mathbf{v} = \| \mathbf{z} \| \mathbf{e}_1-\mathbf{z}.\]

Then the reflector \(\mathbf{P}\) is defined by

(73)\[ \mathbf{P} = \mathbf{I} - 2\frac{\mathbf{v} \mathbf{v}^T}{\mathbf{v}^T\mathbf{v}}\]

if \(\mathbf{v}\neq\boldsymbol{0}\), or \(\mathbf{P}=\mathbf{I}\) if \(\mathbf{v}=\boldsymbol{0}\). Note that \(\mathbf{v}^T\mathbf{v}\) is a scalar and can appear in a denominator, while the outer product \(\mathbf{v} \mathbf{v}^T\) is \(n\times n\). It is straightforward to show that \(\mathbf{P}\) has the following key properties.

Theorem 29 ((Householder reflector))

Let \(\mathbf{v}=\| \mathbf{z} \|\mathbf{e}_1-\mathbf{z}\) and let \(\mathbf{P}\) be given by (73). Then \(\mathbf{P}\) is symmetric and orthogonal, and \(\mathbf{P}\mathbf{z}=\| \mathbf{z} \|\mathbf{e}_1\).

Proof

The case \(\mathbf{v}=\boldsymbol{0}\) is obvious. For \(\mathbf{v}\neq\boldsymbol{0}\), the proofs of symmetry and orthogonality are left to the exercises. As for the last fact, we simply compute

(74)\[\mathbf{P}\mathbf{z} = \mathbf{z} - 2 \frac{\mathbf{v} \mathbf{v}^T \mathbf{z}}{\mathbf{v}^T\mathbf{v}} = \mathbf{z} - 2 \frac{\mathbf{v}^T \mathbf{z}}{\mathbf{v}^T\mathbf{v}} \mathbf{v},\]

and, since \(\mathbf{e}_1^T\mathbf{z}=z_1\),

\[\begin{split}\begin{split} \mathbf{v}^T\mathbf{v} &= \| \mathbf{z} \|^2 - 2 \| \mathbf{z} \| z_1 + \mathbf{z}^T\mathbf{z} = 2\| \mathbf{z} \|(\| \mathbf{z} \|-z_1),\\ \mathbf{v}^T\mathbf{z} &= \| \mathbf{z} \|z_1 - \mathbf{z}^T\mathbf{z} = -\| \mathbf{z} \|\bigl(\| \mathbf{z} \|-z_1\bigr), \end{split}\end{split}\]

leading finally to

\[\mathbf{P}\mathbf{z} = \mathbf{z} - 2\cdot \frac{-\| \mathbf{z} \| \bigl(\| \mathbf{z} \|-z_1\bigr)}{2\| \mathbf{z} \| \bigl(\| \mathbf{z} \|-z_1\bigr)} \mathbf{v} = \mathbf{z} + \mathbf{v} = \| \mathbf{z} \|\mathbf{e}_1.\]

The reason \(\mathbf{P}\) is called a reflector is sketched in Fig. 4.

../_images/hhreflect.svg

Fig. 4 Householder reflector.

The vector \(\mathbf{v}\) defines an \(n-1\) dimensional subspace \(S\) perpendicular to it. Elementary vector analysis shows that \(\mathbf{v}(\mathbf{v}^T\mathbf{z})/(\mathbf{v}^T\mathbf{v})\) is the vector projection of \(\mathbf{z}\) along \(\mathbf{v}\). By subtracting this projection from \(\mathbf{z}\), we end up in \(S\), but by subtracting twice the projection we get a reflection through \(S\). This reflection occurs when \(\mathbf{P}\) is applied to any vector, but when it is applied to \(\mathbf{z}\), the result ends up on the \(x_1\)-axis.

Factorization algorithm

The QR factorization is computed by using successive Householder reflections to introduce zeros in one column at a time. We first show the process for a small numerical example in Householder QR.

You may be wondering what happened to the \(\mathbf{Q}\) in Householder QR. Each Householder reflector is orthogonal but not full-size. We have to pad it out to represent algebraically the fact that a block of the first rows are left alone. Given a reflector \(\mathbf{P}_k\) that is of square size \(m-k+1\), we define

\[\begin{split}\mathbf{Q}_k = \begin{bmatrix} \mathbf{I}_{k-1} & \boldsymbol{0} \\ \boldsymbol{0} & \mathbf{P}_k \end{bmatrix}.\end{split}\]

It is trivial to show that \(\mathbf{Q}_k\) is also orthogonal. Then

\[ \mathbf{Q}_n \mathbf{Q}_{n-1}\cdots \mathbf{Q}_1 \mathbf{A} = \mathbf{R}.\]

But \(\mathbf{Q}_n \mathbf{Q}_{n-1}\cdots \mathbf{Q}_1\) is orthogonal too, and we multiply on the left by its transpose to get \(\mathbf{A}=\mathbf{Q}\mathbf{R}\), where \(\mathbf{Q} = (\mathbf{Q}_n \mathbf{Q}_{n-1}\cdots \mathbf{Q}_1)^T\). We don’t even need to form these matrices explicitly. Writing

\[ \mathbf{Q}^T = \mathbf{Q}_n \mathbf{Q}_{n-1}\cdots \mathbf{Q}_1 = \mathbf{Q}_n \Bigl( \mathbf{Q}_{n-1}\bigl(\cdots (\mathbf{Q}_1\mathbf{I})\cdots\bigr)\Bigr),\]

we can build \(\mathbf{Q}^T\) iteratively by starting with the identity and doing the same row operations as on \(\mathbf{A}\). Creating \(\mathbf{Q}^T\) with row operations on \(\mathbf{I}\) uses much less memory than building the \(\mathbf{Q}_k\) matrices explicitly.

The algorithm we have described is encapsulated in qrfact. There is one more refinement in it, though. As indicated by (74), the application of a reflector \(\mathbf{P}\) to a vector does not require the formation of the matrix \(\mathbf{P}\) itself. Its effect can be computed directly from the vector \(\mathbf{v}\), as is shown in qrfact.

Function 30 (qrfact)

QR factorization by Householder reflections.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
"""
qrfact(A)

QR factorization by Householder reflections. Returns Q and R.
"""
function qrfact(A)

    m,n = size(A)
    Qt = Matrix(Diagonal(ones(m)))
    R = float(copy(A))
    for k in 1:n
      z = R[k:m,k]
      v = [ -sign(z[1])*norm(z) - z[1]; -z[2:end] ]
      nrmv = norm(v)
      if nrmv < eps() continue; end  # skip this iteration
      v = v / nrmv;                  # simplifies other formulas
      # Apply the reflection to each relevant column of A and Q
      for j in 1:n
        R[k:m,j] -= v*( 2*(v'*R[k:m,j]) )
      end
      for j in 1:m
        Qt[k:m,j] -= v*( 2*(v'*Qt[k:m,j]) )
      end
    end
    
    return Qt',triu(R)
end

The Julia command qr works similarly to, but more efficiently than, qrfact. It finds the factorization in \(\sim(2m n^2-n^3/3)\) flops asymptotically.

Exercises

  1. ⌨ Find a Householder reflector \(\mathbf{P}\) such that

    \[\begin{split}\mathbf{P} \begin{bmatrix} -6 \\ 2 \\ 9 \end{bmatrix} = \begin{bmatrix} 11\\0\\0 \end{bmatrix}.\end{split}\]
  2. ✍ Prove the unfinished items in the reflector theorem, namely that a Householder reflector \(\mathbf{P}\) is symmetric and orthogonal.

  3. ✍ Let \(\mathbf{P}\) be a Householder reflector as in (73).

    (a) Find a vector \(\mathbf{u}\) such that \(\mathbf{P}\mathbf{u} = -\mathbf{u}\). (Fig. 4 may be of help.)

    (b) What algebraic condition is necessary and sufficient for a vector \(\mathbf{w}\) to satisfy \(\mathbf{P}\mathbf{w}=\mathbf{w}\)? In \(n\) dimensions, how many such linearly independent vectors are there?

  4. ✍ Under certain circumstances, computing the vector \(\mathbf{v}\) in (72) could lead to subtractive cancellation, which is why line 13 of qrfact reads as it does. Devise an example that causes subtractive cancellation if (72) is used.

  5. ✍ Suppose QR factorization is used to compute the solution of a square linear system, \(\mathbf{A}\mathbf{x}=\mathbf{b}\); i.~e., let \(m=n\).

    (a) Find an asymptotic flop count for this procedure, and compare to the LU factorization algorithm.

    (b) Show that \(\kappa_2(\mathbf{A})=\kappa_2(\mathbf{R})\).

  6. ✍ Show that \(\kappa_2(\mathbf{A})=\kappa_2(\mathbf{R})\) when \(\mathbf{A}\) is not square. (Hint: You can’t take an inverse of \(\mathbf{A}\) or \(\mathbf{R}\).)

  7. ⌨ Modify qrfact so that the loops in lines 18–23 are removed. In other words, update the matrices A and Q with a single statement for each. Test your code against the original code on an example to verify it.

  8. Another algorithmic technique for orthogonally introducing zeros into a matrix is the Givens rotation. Given a 2-vector \([\alpha\, \beta]\), it defines an angle \(\theta\) such that

    \[\begin{split}\begin{bmatrix} \cos(\theta) & \sin(\theta) \\ -\sin(\theta) & \cos(\theta) \end{bmatrix} \begin{bmatrix} \alpha \\ \beta \end{bmatrix} = \begin{bmatrix} \alpha^2 + \beta^2 \\ 0 \end{bmatrix}.\end{split}\]

    (a) ✍ Given \(\alpha\) and \(\beta\), show how to compute \(\theta\).

    (b) ⌨ Given the vector \(\mathbf{z}=[1\;2\;3\;4\;5]^T\), use Julia to find a sequence of Givens rotations that transforms \(\mathbf{z}\) into the vector \(\| \mathbf{z} \|\mathbf{e}_1\). (Hint: You can operate only on pairs of elements at a time, introducing a zero at the lower of the two positions.)