Computing QR factorizations
To compute an LU factorization, we follow elimination rules to introduce zeros into the lower triangle of the matrix, leaving only the \(\mathbf{U}\) factor. The row operations are themselves triangular and can be combined into the \(\mathbf{L}\) factor. For the QR factorization, the game is again to introduce zeros into the lower triangle, but the rules have changed; now the row operations must be done orthogonally. Thanks to the orthogonal matrix theorem, the product of orthogonal operations will also be orthogonal, providing us with the final \(\mathbf{Q}\) of the QR.
Householder reflections
A Householder reflector is a particular type of orthogonal matrix \(\mathbf{P}\). The reflection is customized for a given vector \(\mathbf{z}\) so that \(\mathbf{P}\mathbf{z}\) is nonzero only in the first element. Since orthogonal matrices preserve the 2-norm, we must have

\[\mathbf{P}\mathbf{z} = \pm\,\| \mathbf{z} \|\,\mathbf{e}_1.\]
(Recall that \(\mathbf{e}_k\) is the \(k\)th column of the identity matrix.) We choose the positive sign for our discussion, but see qrfact and an exercise for important computational details.
Given \(\mathbf{z}\), let

\[\mathbf{v} = \| \mathbf{z} \|\,\mathbf{e}_1 - \mathbf{z}. \tag{72}\]
Then the reflector \(\mathbf{P}\) is defined by

\[\mathbf{P} = \mathbf{I} - \frac{2}{\mathbf{v}^T\mathbf{v}}\,\mathbf{v}\mathbf{v}^T \tag{73}\]
if \(\mathbf{v}\neq\boldsymbol{0}\), or \(\mathbf{P}=\mathbf{I}\) if \(\mathbf{v}=\boldsymbol{0}\). Note that \(\mathbf{v}^T\mathbf{v}\) is a scalar and can appear in a denominator, while the outer product \(\mathbf{v} \mathbf{v}^T\) is \(n\times n\). It is straightforward to show that \(\mathbf{P}\) has the following key properties.
Let \(\mathbf{v}=\| \mathbf{z} \|\mathbf{e}_1-\mathbf{z}\) and let \(\mathbf{P}\) be given by (73). Then \(\mathbf{P}\) is symmetric and orthogonal, and \(\mathbf{P}\mathbf{z}=\| \mathbf{z} \|\mathbf{e}_1\).
The case \(\mathbf{v}=\boldsymbol{0}\) is obvious. For \(\mathbf{v}\neq\boldsymbol{0}\), the proofs of symmetry and orthogonality are left to the exercises. As for the last fact, we simply compute

\[\mathbf{P}\mathbf{z} = \mathbf{z} - 2\,\frac{\mathbf{v}^T\mathbf{z}}{\mathbf{v}^T\mathbf{v}}\,\mathbf{v}, \tag{74}\]
and, since \(\mathbf{e}_1^T\mathbf{z}=z_1\),

\[\begin{split}\mathbf{v}^T\mathbf{v} &= \| \mathbf{z} \|^2 - 2\| \mathbf{z} \|\,z_1 + \| \mathbf{z} \|^2 = 2\| \mathbf{z} \|\bigl(\| \mathbf{z} \| - z_1\bigr),\\ \mathbf{v}^T\mathbf{z} &= \| \mathbf{z} \|\,z_1 - \| \mathbf{z} \|^2 = -\| \mathbf{z} \|\bigl(\| \mathbf{z} \| - z_1\bigr),\end{split}\]

so that \(\mathbf{v}^T\mathbf{z}/\mathbf{v}^T\mathbf{v} = -1/2\),
leading finally to

\[\mathbf{P}\mathbf{z} = \mathbf{z} - 2\left(-\tfrac{1}{2}\right)\mathbf{v} = \mathbf{z} + \| \mathbf{z} \|\,\mathbf{e}_1 - \mathbf{z} = \| \mathbf{z} \|\,\mathbf{e}_1.\]
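The theorem is also easy to check numerically. Here is a minimal sketch, with z chosen arbitrarily for illustration:

```julia
using LinearAlgebra

z = [4.0, -1.0, 8.0]            # arbitrary test vector, with norm(z) = 9
e1 = [1.0, 0.0, 0.0]
v = norm(z)*e1 - z              # v = ||z||e1 - z, as in (72)
P = I - (2/(v'*v))*(v*v')       # the reflector (73)

@show norm(P - P')              # ~ 0: P is symmetric
@show norm(P'*P - I)            # ~ 0: P is orthogonal
@show norm(P*z - norm(z)*e1)    # ~ 0: P*z = ||z||e1
```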
The reason \(\mathbf{P}\) is called a reflector is sketched in Fig. 4.
The vector \(\mathbf{v}\) defines an \((n-1)\)-dimensional subspace \(S\) perpendicular to it. Elementary vector analysis shows that \(\mathbf{v}(\mathbf{v}^T\mathbf{z})/(\mathbf{v}^T\mathbf{v})\) is the vector projection of \(\mathbf{z}\) along \(\mathbf{v}\). Subtracting this projection from \(\mathbf{z}\) lands us in \(S\), but subtracting twice the projection gives a reflection through \(S\). This reflection occurs when \(\mathbf{P}\) is applied to any vector, but when it is applied to \(\mathbf{z}\), the result ends up on the \(x_1\)-axis.
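The projection identity can be verified directly. In this sketch (reusing an arbitrary z), subtracting the projection once lands in S, and subtracting it twice reproduces the reflection:

```julia
using LinearAlgebra

z = [4.0, -1.0, 8.0]
v = norm(z)*[1.0, 0, 0] - z
proj = v*(v'*z)/(v'*v)     # vector projection of z along v

s = z - proj               # subtract once: lands in S
@show v'*s                 # ~ 0, so s is perpendicular to v

r = z - 2*proj             # subtract twice: reflection through S
@show r                    # ~ [9, 0, 0], i.e., norm(z)*e1
```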
Factorization algorithm
The QR factorization is computed by using successive Householder reflections to introduce zeros in one column at a time. We first show the process for a small numerical example in Householder QR.
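In the same spirit, here is a minimal sketch of the first step on a hypothetical 4×3 matrix (not the one from Householder QR): a single reflection zeros out the first column below the diagonal.

```julia
using LinearAlgebra

A = [1.0 2 3; 4 5 6; 7 8 9; 10 11 12]   # hypothetical 4x3 example
z = A[:, 1]                              # column to be zeroed below row 1
v = norm(z)*[1; zeros(3)] - z
P = I - (2/(v'*v))*(v*v')                # reflector built from the first column

A = P*A                                  # first column becomes norm(z)*e1
@show A[2:end, 1]                        # ~ [0, 0, 0]
```

Subsequent steps repeat this construction on the submatrix below and to the right of the current diagonal position.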
You may be wondering what happened to the \(\mathbf{Q}\) in Householder QR. Each Householder reflector is orthogonal but not full-size. We have to pad it out to represent algebraically the fact that a block of the first rows is left alone. Given a reflector \(\mathbf{P}_k\) that is of square size \(m-k+1\), we define

\[\begin{split}\mathbf{Q}_k = \begin{bmatrix} \mathbf{I}_{k-1} & \boldsymbol{0} \\ \boldsymbol{0} & \mathbf{P}_k \end{bmatrix},\end{split}\]

where \(\mathbf{I}_{k-1}\) is the identity matrix of size \(k-1\).
It is trivial to show that \(\mathbf{Q}_k\) is also orthogonal. Then

\[\mathbf{Q}_n \mathbf{Q}_{n-1} \cdots \mathbf{Q}_1 \mathbf{A} = \mathbf{R}.\]
But \(\mathbf{Q}_n \mathbf{Q}_{n-1}\cdots \mathbf{Q}_1\) is orthogonal too, and we multiply on the left by its transpose to get \(\mathbf{A}=\mathbf{Q}\mathbf{R}\), where \(\mathbf{Q} = (\mathbf{Q}_n \mathbf{Q}_{n-1}\cdots \mathbf{Q}_1)^T\). We don't even need to form these matrices explicitly. Writing

\[\mathbf{Q}^T = \mathbf{Q}_n \mathbf{Q}_{n-1}\cdots \mathbf{Q}_1 = \mathbf{Q}_n \Bigl(\mathbf{Q}_{n-1}\bigl(\cdots(\mathbf{Q}_1\mathbf{I})\cdots\bigr)\Bigr),\]
we can build \(\mathbf{Q}^T\) iteratively by starting with the identity and doing the same row operations as on \(\mathbf{A}\). Creating \(\mathbf{Q}^T\) with row operations on \(\mathbf{I}\) uses much less memory than building the \(\mathbf{Q}_k\) matrices explicitly.
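To check the padding construction concretely, here is a sketch with hypothetical sizes (in practice, as just noted, these matrices are never formed explicitly):

```julia
using LinearAlgebra

m, k = 5, 3                        # hypothetical sizes
z = randn(m-k+1)                   # vector to be reflected onto e1
v = norm(z)*[1; zeros(m-k)] - z
P = I - (2/(v'*v))*(v*v')          # reflector of size m-k+1

Qk = Matrix(1.0I, m, m)            # start from the m-by-m identity
Qk[k:m, k:m] = P                   # embed P in the lower-right block
@show norm(Qk'*Qk - I)             # ~ 0: Qk is still orthogonal
```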
The algorithm we have described is encapsulated in qrfact. There is one more refinement in it, though. As indicated by (74), the application of a reflector \(\mathbf{P}\) to a vector does not require the formation of the matrix \(\mathbf{P}\) itself. Its effect can be computed directly from the vector \(\mathbf{v}\), as is shown in qrfact.
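In code, that matrix-free application looks like the following sketch, in which v and x are arbitrary vectors made up for illustration:

```julia
using LinearAlgebra

v = [3.0, 1.0, -2.0]              # arbitrary nonzero Householder vector
x = [1.0, 4.0, 2.0]               # vector to be reflected

P = I - (2/(v'*v))*(v*v')         # explicit reflector: O(n^2) work and storage
y = x - v*( 2*(v'*x)/(v'*v) )     # same action using only v: O(n) work
@show norm(P*x - y)               # ~ 0
```

When v is normalized to unit length, as in qrfact, the division by v'*v disappears, which is exactly the form used inside the loops below.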
QR factorization by Householder reflections.
```julia
 1 | """
 2 | qrfact(A)
 3 |
 4 | QR factorization by Householder reflections. Returns Q and R.
 5 | """
 6 | function qrfact(A)
 7 |
 8 |     m,n = size(A)
 9 |     Qt = Matrix(Diagonal(ones(m)))
10 |     R = float(copy(A))
11 |     for k in 1:n
12 |         z = R[k:m,k]
13 |         v = [ -sign(z[1])*norm(z) - z[1]; -z[2:end] ]
14 |         nrmv = norm(v)
15 |         if nrmv < eps() continue; end    # skip this iteration
16 |         v = v / nrmv;                    # simplifies other formulas
17 |         # Apply the reflection to each relevant column of A and Q
18 |         for j in 1:n
19 |             R[k:m,j] -= v*( 2*(v'*R[k:m,j]) )
20 |         end
21 |         for j in 1:m
22 |             Qt[k:m,j] -= v*( 2*(v'*Qt[k:m,j]) )
23 |         end
24 |     end
25 |
26 |     return Qt',triu(R)
27 | end
```
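Assuming the function above has been defined and using LinearAlgebra is in effect (qrfact relies on Diagonal, norm, and triu from that standard library), a quick sanity check might look like this sketch:

```julia
using LinearAlgebra

A = randn(6, 4)         # a random tall matrix for testing
Q, R = qrfact(A)

@show norm(Q'*Q - I)    # ~ 0: Q is a 6x6 orthogonal matrix
@show norm(Q*R - A)     # ~ 0: the product reconstructs A
@show istriu(R)         # true: R is upper triangular
```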
The Julia command qr works similarly to, but more efficiently than, qrfact. It finds the factorization in \(\sim(2m n^2-\tfrac{2}{3}n^3)\) flops asymptotically.
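For comparison, here is a sketch of the built-in version; qr returns a factorization object from which the two factors can be extracted:

```julia
using LinearAlgebra

A = randn(6, 4)
F = qr(A)                   # factorization object; Q is stored in compact form
Q, R = Matrix(F.Q), F.R     # thin 6x4 Q and square 4x4 R
@show norm(Q*R - A)         # ~ 0
```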
Exercises
⌨ Find a Householder reflector \(\mathbf{P}\) such that

\[\begin{split}\mathbf{P} \begin{bmatrix} -6 \\ 2 \\ 9 \end{bmatrix} = \begin{bmatrix} 11\\0\\0 \end{bmatrix}.\end{split}\]

✍ Prove the unfinished items in the reflector theorem, namely that a Householder reflector \(\mathbf{P}\) is symmetric and orthogonal.
✍ Let \(\mathbf{P}\) be a Householder reflector as in (73).
(a) Find a vector \(\mathbf{u}\) such that \(\mathbf{P}\mathbf{u} = -\mathbf{u}\). (Fig. 4 may be of help.)
(b) What algebraic condition is necessary and sufficient for a vector \(\mathbf{w}\) to satisfy \(\mathbf{P}\mathbf{w}=\mathbf{w}\)? In \(n\) dimensions, how many such linearly independent vectors are there?
✍ Under certain circumstances, computing the vector \(\mathbf{v}\) in (72) could lead to subtractive cancellation, which is why line 13 of qrfact reads as it does. Devise an example that causes subtractive cancellation if (72) is used.
✍ Suppose QR factorization is used to compute the solution of a square linear system, \(\mathbf{A}\mathbf{x}=\mathbf{b}\); i.e., let \(m=n\).
(a) Find an asymptotic flop count for this procedure, and compare to the LU factorization algorithm.
(b) Show that \(\kappa_2(\mathbf{A})=\kappa_2(\mathbf{R})\).
✍ Show that \(\kappa_2(\mathbf{A})=\kappa_2(\mathbf{R})\) when \(\mathbf{A}\) is not square. (Hint: You can’t take an inverse of \(\mathbf{A}\) or \(\mathbf{R}\).)
⌨ Modify qrfact so that the loops in lines 18–23 are removed. In other words, update the matrices A and Q with a single statement for each. Test your code against the original code on an example to verify it.

Another algorithmic technique for orthogonally introducing zeros into a matrix is the Givens rotation. Given a 2-vector \([\alpha\, \beta]\), it defines an angle \(\theta\) such that
\[\begin{split}\begin{bmatrix} \cos(\theta) & \sin(\theta) \\ -\sin(\theta) & \cos(\theta) \end{bmatrix} \begin{bmatrix} \alpha \\ \beta \end{bmatrix} = \begin{bmatrix} \sqrt{\alpha^2 + \beta^2} \\ 0 \end{bmatrix}.\end{split}\]

(a) ✍ Given \(\alpha\) and \(\beta\), show how to compute \(\theta\).
(b) ⌨ Given the vector \(\mathbf{z}=[1\;2\;3\;4\;5]^T\), use Julia to find a sequence of Givens rotations that transforms \(\mathbf{z}\) into the vector \(\| \mathbf{z} \|\mathbf{e}_1\). (Hint: You can operate only on pairs of elements at a time, introducing a zero at the lower of the two positions.)