Computing QR factorizations

To compute an LU factorization, we follow elimination rules to introduce zeros into the lower triangle of the matrix, leaving only the U factor. The row operations are themselves triangular and can be combined into the L factor. For the QR factorization, the game is again to introduce zeros into the lower triangle, but the rules have changed; now the row operations must be done orthogonally. Thanks to the orthogonal matrix theorem, the product of orthogonal operations will also be orthogonal, providing us with the final Q of the QR.

Householder reflections

A Householder reflector is a particular type of orthogonal matrix P. The reflection is customized for a particular given vector z so that Pz is nonzero only in the first element. Since orthogonal matrices preserve the 2-norm, we must have

(71)Pz=[±

(Recall that \mathbf{e}_k is the kth column of the identity matrix.) We choose the positive sign for our discussion, but see qrfact and an exercise for important computational details.

Given \mathbf{z}, let

(72) \mathbf{v} = \| \mathbf{z} \| \mathbf{e}_1-\mathbf{z}.

Then the reflector \mathbf{P} is defined by

(73) \mathbf{P} = \mathbf{I} - 2\frac{\mathbf{v} \mathbf{v}^T}{\mathbf{v}^T\mathbf{v}}

if \mathbf{v}\neq\boldsymbol{0}, or \mathbf{P}=\mathbf{I} if \mathbf{v}=\boldsymbol{0}. Note that \mathbf{v}^T\mathbf{v} is a scalar and can appear in a denominator, while the outer product \mathbf{v} \mathbf{v}^T is n\times n. It is straightforward to show that \mathbf{P} has the following key properties.

Theorem 29 ((Householder reflector))

Let \mathbf{v}=\| \mathbf{z} \|\mathbf{e}_1-\mathbf{z} and let \mathbf{P} be given by (73). Then \mathbf{P} is symmetric and orthogonal, and \mathbf{P}\mathbf{z}=\| \mathbf{z} \|\mathbf{e}_1.

Proof

The case \mathbf{v}=\boldsymbol{0} is obvious. For \mathbf{v}\neq\boldsymbol{0}, the proofs of symmetry and orthogonality are left to the exercises. As for the last fact, we simply compute

(74)\mathbf{P}\mathbf{z} = \mathbf{z} - 2 \frac{\mathbf{v} \mathbf{v}^T \mathbf{z}}{\mathbf{v}^T\mathbf{v}} = \mathbf{z} - 2 \frac{\mathbf{v}^T \mathbf{z}}{\mathbf{v}^T\mathbf{v}} \mathbf{v},

and, since \mathbf{e}_1^T\mathbf{z}=z_1,

\begin{split}\begin{split} \mathbf{v}^T\mathbf{v} &= \| \mathbf{z} \|^2 - 2 \| \mathbf{z} \| z_1 + \mathbf{z}^T\mathbf{z} = 2\| \mathbf{z} \|(\| \mathbf{z} \|-z_1),\\ \mathbf{v}^T\mathbf{z} &= \| \mathbf{z} \|z_1 - \mathbf{z}^T\mathbf{z} = -\| \mathbf{z} \|\bigl(\| \mathbf{z} \|-z_1\bigr), \end{split}\end{split}

leading finally to

\mathbf{P}\mathbf{z} = \mathbf{z} - 2\cdot \frac{-\| \mathbf{z} \| \bigl(\| \mathbf{z} \|-z_1\bigr)}{2\| \mathbf{z} \| \bigl(\| \mathbf{z} \|-z_1\bigr)} \mathbf{v} = \mathbf{z} + \mathbf{v} = \| \mathbf{z} \|\mathbf{e}_1.

The reason \mathbf{P} is called a reflector is sketched in Fig. 4.

../_images/hhreflect.svg

Fig. 4 Householder reflector.

The vector \mathbf{v} defines an n-1 dimensional subspace S perpendicular to it. Elementary vector analysis shows that \mathbf{v}(\mathbf{v}^T\mathbf{z})/(\mathbf{v}^T\mathbf{v}) is the vector projection of \mathbf{z} along \mathbf{v}. By subtracting this projection from \mathbf{z}, we end up in S, but by subtracting twice the projection we get a reflection through S. This reflection occurs when \mathbf{P} is applied to any vector, but when it is applied to \mathbf{z}, the result ends up on the x_1-axis.

Factorization algorithm

The QR factorization is computed by using successive Householder reflections to introduce zeros in one column at a time. We first show the process for a small numerical example in Householder QR.

You may be wondering what happened to the \mathbf{Q} in Householder QR. Each Householder reflector is orthogonal but not full-size. We have to pad it out to represent algebraically the fact that a block of the first rows are left alone. Given a reflector \mathbf{P}_k that is of square size m-k+1, we define

\begin{split}\mathbf{Q}_k = \begin{bmatrix} \mathbf{I}_{k-1} & \boldsymbol{0} \\ \boldsymbol{0} & \mathbf{P}_k \end{bmatrix}.\end{split}

It is trivial to show that \mathbf{Q}_k is also orthogonal. Then

\mathbf{Q}_n \mathbf{Q}_{n-1}\cdots \mathbf{Q}_1 \mathbf{A} = \mathbf{R}.

But \mathbf{Q}_n \mathbf{Q}_{n-1}\cdots \mathbf{Q}_1 is orthogonal too, and we multiply on the left by its transpose to get \mathbf{A}=\mathbf{Q}\mathbf{R}, where \mathbf{Q} = (\mathbf{Q}_n \mathbf{Q}_{n-1}\cdots \mathbf{Q}_1)^T. We don’t even need to form these matrices explicitly. Writing

\mathbf{Q}^T = \mathbf{Q}_n \mathbf{Q}_{n-1}\cdots \mathbf{Q}_1 = \mathbf{Q}_n \Bigl( \mathbf{Q}_{n-1}\bigl(\cdots (\mathbf{Q}_1\mathbf{I})\cdots\bigr)\Bigr),

we can build \mathbf{Q}^T iteratively by starting with the identity and doing the same row operations as on \mathbf{A}. Creating \mathbf{Q}^T with row operations on \mathbf{I} uses much less memory than building the \mathbf{Q}_k matrices explicitly.

The algorithm we have described is encapsulated in qrfact. There is one more refinement in it, though. As indicated by (74), the application of a reflector \mathbf{P} to a vector does not require the formation of the matrix \mathbf{P} itself. Its effect can be computed directly from the vector \mathbf{v}, as is shown in qrfact.

Function 30 (qrfact)

QR factorization by Householder reflections.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
"""
qrfact(A)

QR factorization by Householder reflections. Returns Q and R.
"""
function qrfact(A)

    m,n = size(A)
    Qt = Matrix(Diagonal(ones(m)))
    R = float(copy(A))
    for k in 1:n
      z = R[k:m,k]
      v = [ -sign(z[1])*norm(z) - z[1]; -z[2:end] ]
      nrmv = norm(v)
      if nrmv < eps() continue; end  # skip this iteration
      v = v / nrmv;                  # simplifies other formulas
      # Apply the reflection to each relevant column of A and Q
      for j in 1:n
        R[k:m,j] -= v*( 2*(v'*R[k:m,j]) )
      end
      for j in 1:m
        Qt[k:m,j] -= v*( 2*(v'*Qt[k:m,j]) )
      end
    end
    
    return Qt',triu(R)
end
Copy to clipboard

The Julia command qr works similarly to, but more efficiently than, qrfact. It finds the factorization in \sim(2m n^2-n^3/3) flops asymptotically.

Exercises

  1. ⌨ Find a Householder reflector \mathbf{P} such that

    \begin{split}\mathbf{P} \begin{bmatrix} -6 \\ 2 \\ 9 \end{bmatrix} = \begin{bmatrix} 11\\0\\0 \end{bmatrix}.\end{split}
  2. ✍ Prove the unfinished items in the reflector theorem, namely that a Householder reflector \mathbf{P} is symmetric and orthogonal.

  3. ✍ Let \mathbf{P} be a Householder reflector as in (73).

    (a) Find a vector \mathbf{u} such that \mathbf{P}\mathbf{u} = -\mathbf{u}. (Fig. 4 may be of help.)

    (b) What algebraic condition is necessary and sufficient for a vector \mathbf{w} to satisfy \mathbf{P}\mathbf{w}=\mathbf{w}? In n dimensions, how many such linearly independent vectors are there?

  4. ✍ Under certain circumstances, computing the vector \mathbf{v} in (72) could lead to subtractive cancellation, which is why line 13 of qrfact reads as it does. Devise an example that causes subtractive cancellation if (72) is used.

  5. ✍ Suppose QR factorization is used to compute the solution of a square linear system, \mathbf{A}\mathbf{x}=\mathbf{b}; i.~e., let m=n.

    (a) Find an asymptotic flop count for this procedure, and compare to the LU factorization algorithm.

    (b) Show that \kappa_2(\mathbf{A})=\kappa_2(\mathbf{R}).

  6. ✍ Show that \kappa_2(\mathbf{A})=\kappa_2(\mathbf{R}) when \mathbf{A} is not square. (Hint: You can’t take an inverse of \mathbf{A} or \mathbf{R}.)

  7. ⌨ Modify qrfact so that the loops in lines 18–23 are removed. In other words, update the matrices A and Q with a single statement for each. Test your code against the original code on an example to verify it.

  8. Another algorithmic technique for orthogonally introducing zeros into a matrix is the Givens rotation. Given a 2-vector [\alpha\, \beta], it defines an angle \theta such that

    \begin{split}\begin{bmatrix} \cos(\theta) & \sin(\theta) \\ -\sin(\theta) & \cos(\theta) \end{bmatrix} \begin{bmatrix} \alpha \\ \beta \end{bmatrix} = \begin{bmatrix} \alpha^2 + \beta^2 \\ 0 \end{bmatrix}.\end{split}

    (a) ✍ Given \alpha and \beta, show how to compute \theta.

    (b) ⌨ Given the vector \mathbf{z}=[1\;2\;3\;4\;5]^T, use Julia to find a sequence of Givens rotations that transforms \mathbf{z} into the vector \| \mathbf{z} \|\mathbf{e}_1. (Hint: You can operate only on pairs of elements at a time, introducing a zero at the lower of the two positions.)