Computing QR factorizations¶
To compute an LU factorization, we follow elimination rules to introduce zeros into the lower triangle of the matrix, leaving only the U factor. The row operations are themselves triangular and can be combined into the L factor. For the QR factorization, the game is again to introduce zeros into the lower triangle, but the rules have changed; now the row operations must be done orthogonally. Thanks to the orthogonal matrix theorem, the product of orthogonal operations will also be orthogonal, providing us with the final Q of the QR.
Householder reflections¶
A Householder reflector is a particular type of orthogonal matrix P. The reflection is customized for a particular given vector z so that Pz is nonzero only in the first element. Since orthogonal matrices preserve the 2-norm, we must have
(Recall that \mathbf{e}_k is the kth column of the identity matrix.) We choose the positive sign for our discussion, but see qrfact and an exercise for important computational details.
Given \mathbf{z}, let
Then the reflector \mathbf{P} is defined by
if \mathbf{v}\neq\boldsymbol{0}, or \mathbf{P}=\mathbf{I} if \mathbf{v}=\boldsymbol{0}. Note that \mathbf{v}^T\mathbf{v} is a scalar and can appear in a denominator, while the outer product \mathbf{v} \mathbf{v}^T is n\times n. It is straightforward to show that \mathbf{P} has the following key properties.
Let \mathbf{v}=\| \mathbf{z} \|\mathbf{e}_1-\mathbf{z} and let \mathbf{P} be given by (73). Then \mathbf{P} is symmetric and orthogonal, and \mathbf{P}\mathbf{z}=\| \mathbf{z} \|\mathbf{e}_1.
The case \mathbf{v}=\boldsymbol{0} is obvious. For \mathbf{v}\neq\boldsymbol{0}, the proofs of symmetry and orthogonality are left to the exercises. As for the last fact, we simply compute
and, since \mathbf{e}_1^T\mathbf{z}=z_1,
leading finally to
The reason \mathbf{P} is called a reflector is sketched in Fig. 4.
Fig. 4 Householder reflector.¶
The vector \mathbf{v} defines an n-1 dimensional subspace S perpendicular to it. Elementary vector analysis shows that \mathbf{v}(\mathbf{v}^T\mathbf{z})/(\mathbf{v}^T\mathbf{v}) is the vector projection of \mathbf{z} along \mathbf{v}. By subtracting this projection from \mathbf{z}, we end up in S, but by subtracting twice the projection we get a reflection through S. This reflection occurs when \mathbf{P} is applied to any vector, but when it is applied to \mathbf{z}, the result ends up on the x_1-axis.
Factorization algorithm¶
The QR factorization is computed by using successive Householder reflections to introduce zeros in one column at a time. We first show the process for a small numerical example in Householder QR.
You may be wondering what happened to the \mathbf{Q} in Householder QR. Each Householder reflector is orthogonal but not full-size. We have to pad it out to represent algebraically the fact that a block of the first rows are left alone. Given a reflector \mathbf{P}_k that is of square size m-k+1, we define
It is trivial to show that \mathbf{Q}_k is also orthogonal. Then
But \mathbf{Q}_n \mathbf{Q}_{n-1}\cdots \mathbf{Q}_1 is orthogonal too, and we multiply on the left by its transpose to get \mathbf{A}=\mathbf{Q}\mathbf{R}, where \mathbf{Q} = (\mathbf{Q}_n \mathbf{Q}_{n-1}\cdots \mathbf{Q}_1)^T. We don’t even need to form these matrices explicitly. Writing
we can build \mathbf{Q}^T iteratively by starting with the identity and doing the same row operations as on \mathbf{A}. Creating \mathbf{Q}^T with row operations on \mathbf{I} uses much less memory than building the \mathbf{Q}_k matrices explicitly.
The algorithm we have described is encapsulated in qrfact. There is one more refinement in it, though. As indicated by (74), the application of a reflector \mathbf{P} to a vector does not require the formation of the matrix \mathbf{P} itself. Its effect can be computed directly from the vector \mathbf{v}, as is shown in qrfact.
QR factorization by Householder reflections.
The Julia command qr
works similarly to, but more efficiently than, qrfact. It finds the factorization in \sim(2m n^2-n^3/3) flops asymptotically.
Exercises¶
⌨ Find a Householder reflector \mathbf{P} such that
\begin{split}\mathbf{P} \begin{bmatrix} -6 \\ 2 \\ 9 \end{bmatrix} = \begin{bmatrix} 11\\0\\0 \end{bmatrix}.\end{split}✍ Prove the unfinished items in the reflector theorem, namely that a Householder reflector \mathbf{P} is symmetric and orthogonal.
✍ Let \mathbf{P} be a Householder reflector as in (73).
(a) Find a vector \mathbf{u} such that \mathbf{P}\mathbf{u} = -\mathbf{u}. (Fig. 4 may be of help.)
(b) What algebraic condition is necessary and sufficient for a vector \mathbf{w} to satisfy \mathbf{P}\mathbf{w}=\mathbf{w}? In n dimensions, how many such linearly independent vectors are there?
✍ Under certain circumstances, computing the vector \mathbf{v} in (72) could lead to subtractive cancellation, which is why line 13 of qrfact reads as it does. Devise an example that causes subtractive cancellation if (72) is used.
✍ Suppose QR factorization is used to compute the solution of a square linear system, \mathbf{A}\mathbf{x}=\mathbf{b}; i.~e., let m=n.
(a) Find an asymptotic flop count for this procedure, and compare to the LU factorization algorithm.
(b) Show that \kappa_2(\mathbf{A})=\kappa_2(\mathbf{R}).
✍ Show that \kappa_2(\mathbf{A})=\kappa_2(\mathbf{R}) when \mathbf{A} is not square. (Hint: You can’t take an inverse of \mathbf{A} or \mathbf{R}.)
⌨ Modify qrfact so that the loops in lines 18–23 are removed. In other words, update the matrices
A
andQ
with a single statement for each. Test your code against the original code on an example to verify it.Another algorithmic technique for orthogonally introducing zeros into a matrix is the Givens rotation. Given a 2-vector [\alpha\, \beta], it defines an angle \theta such that
\begin{split}\begin{bmatrix} \cos(\theta) & \sin(\theta) \\ -\sin(\theta) & \cos(\theta) \end{bmatrix} \begin{bmatrix} \alpha \\ \beta \end{bmatrix} = \begin{bmatrix} \alpha^2 + \beta^2 \\ 0 \end{bmatrix}.\end{split}(a) ✍ Given \alpha and \beta, show how to compute \theta.
(b) ⌨ Given the vector \mathbf{z}=[1\;2\;3\;4\;5]^T, use Julia to find a sequence of Givens rotations that transforms \mathbf{z} into the vector \| \mathbf{z} \|\mathbf{e}_1. (Hint: You can operate only on pairs of elements at a time, introducing a zero at the lower of the two positions.)