# Computing QR factorizations

To compute an LU factorization, we follow elimination rules to introduce zeros into the lower triangle of the matrix, leaving only the $$\mathbf{U}$$ factor. The row operations are themselves triangular and can be combined into the $$\mathbf{L}$$ factor. For the QR factorization, the game is again to introduce zeros into the lower triangle, but the rules have changed; now the row operations must be done orthogonally. Thanks to the orthogonal matrix theorem, the product of orthogonal operations will also be orthogonal, providing us with the final $$\mathbf{Q}$$ of the QR.

## Householder reflections

A Householder reflector is a particular type of orthogonal matrix $$\mathbf{P}$$. The reflection is customized for a particular given vector $$\mathbf{z}$$ so that $$\mathbf{P}\mathbf{z}$$ is nonzero only in the first element. Since orthogonal matrices preserve the 2-norm, we must have

$$
\mathbf{P}\mathbf{z} =
\begin{bmatrix} \pm \| \mathbf{z} \| \\ 0 \\ \vdots \\ 0 \end{bmatrix}
= \pm \| \mathbf{z} \| \mathbf{e}_1. \tag{71}
$$

(Recall that $$\mathbf{e}_k$$ is the $$k$$th column of the identity matrix.) We choose the positive sign for our discussion, but see qrfact and an exercise for important computational details.

Given $$\mathbf{z}$$, let

$$
\mathbf{v} = \| \mathbf{z} \| \mathbf{e}_1 - \mathbf{z}. \tag{72}
$$

Then the reflector $$\mathbf{P}$$ is defined by

$$
\mathbf{P} = \mathbf{I} - 2\frac{\mathbf{v} \mathbf{v}^T}{\mathbf{v}^T\mathbf{v}} \tag{73}
$$

if $$\mathbf{v}\neq\boldsymbol{0}$$, or $$\mathbf{P}=\mathbf{I}$$ if $$\mathbf{v}=\boldsymbol{0}$$. Note that $$\mathbf{v}^T\mathbf{v}$$ is a scalar and can appear in a denominator, while the outer product $$\mathbf{v} \mathbf{v}^T$$ is $$n\times n$$. It is straightforward to show that $$\mathbf{P}$$ has the following key properties.

Theorem 29 (Householder reflector)

Let $$\mathbf{v}=\| \mathbf{z} \|\mathbf{e}_1-\mathbf{z}$$ and let $$\mathbf{P}$$ be given by (73). Then $$\mathbf{P}$$ is symmetric and orthogonal, and $$\mathbf{P}\mathbf{z}=\| \mathbf{z} \|\mathbf{e}_1$$.

Proof

The case $$\mathbf{v}=\boldsymbol{0}$$ is obvious. For $$\mathbf{v}\neq\boldsymbol{0}$$, the proofs of symmetry and orthogonality are left to the exercises. As for the last fact, we simply compute

$$
\mathbf{P}\mathbf{z} = \mathbf{z} - 2 \frac{\mathbf{v} \mathbf{v}^T \mathbf{z}}{\mathbf{v}^T\mathbf{v}} = \mathbf{z} - 2 \frac{\mathbf{v}^T \mathbf{z}}{\mathbf{v}^T\mathbf{v}} \mathbf{v}, \tag{74}
$$

and, since $$\mathbf{e}_1^T\mathbf{z}=z_1$$,

$$
\begin{aligned}
\mathbf{v}^T\mathbf{v} &= \| \mathbf{z} \|^2 - 2 \| \mathbf{z} \| z_1 + \mathbf{z}^T\mathbf{z} = 2\| \mathbf{z} \|\bigl(\| \mathbf{z} \|-z_1\bigr),\\
\mathbf{v}^T\mathbf{z} &= \| \mathbf{z} \|z_1 - \mathbf{z}^T\mathbf{z} = -\| \mathbf{z} \|\bigl(\| \mathbf{z} \|-z_1\bigr),
\end{aligned}
$$

$$
\mathbf{P}\mathbf{z} = \mathbf{z} - 2\cdot \frac{-\| \mathbf{z} \| \bigl(\| \mathbf{z} \|-z_1\bigr)}{2\| \mathbf{z} \| \bigl(\| \mathbf{z} \|-z_1\bigr)} \mathbf{v} = \mathbf{z} + \mathbf{v} = \| \mathbf{z} \|\mathbf{e}_1.
$$

The reason $$\mathbf{P}$$ is called a reflector is sketched in Fig. 4.

*Fig. 4: Householder reflector.*

The vector $$\mathbf{v}$$ defines an $$(n-1)$$-dimensional subspace $$S$$ perpendicular to it. Elementary vector analysis shows that $$\mathbf{v}(\mathbf{v}^T\mathbf{z})/(\mathbf{v}^T\mathbf{v})$$ is the vector projection of $$\mathbf{z}$$ along $$\mathbf{v}$$. By subtracting this projection from $$\mathbf{z}$$, we end up in $$S$$, but by subtracting twice the projection we get a reflection through $$S$$. This reflection occurs when $$\mathbf{P}$$ is applied to any vector, but when it is applied to $$\mathbf{z}$$, the result ends up on the $$x_1$$-axis.
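As a quick numerical check of both Theorem 29 and this geometric picture, here is a small sketch; the vector $$\mathbf{z}$$ below is an arbitrary choice, not an example from the text.

```julia
using LinearAlgebra

z = [2.0, -1.0, 2.0]                # arbitrary nonzero vector; here ‖z‖ = 3
v = norm(z) * [1, 0, 0] - z         # v = ‖z‖e₁ - z, as in (72)
P = I - 2 * (v * v') / (v' * v)     # Householder reflector, as in (73)

P ≈ P', P' * P ≈ I(3)               # symmetric and orthogonal
P * z                               # ≈ [3, 0, 0] = ‖z‖e₁

x = randn(3)                        # a generic vector
proj = v * (v' * x) / (v' * v)      # projection of x along v
P * x ≈ x - 2proj                   # reflection = subtract twice the projection
```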

## Factorization algorithm

The QR factorization is computed by using successive Householder reflections to introduce zeros in one column at a time. We first show the process for a small numerical example in Householder QR.
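Here is a minimal sketch of that first step, using an arbitrarily chosen $$4\times 3$$ matrix (not the example from the demo): the reflector built from the first column zeroes out everything below the $$(1,1)$$ entry.

```julia
using LinearAlgebra

A = [ 1.0  2.0  3.0
      4.0  5.0  6.0
      7.0  8.0  7.0
      4.0  2.0  3.0 ]
m, n = size(A)

# Reflector for the first column, following (72)-(73)
z = A[:, 1]
v = norm(z) * [1; zeros(m-1)] - z
P₁ = I - 2 * (v * v') / (v' * v)

P₁ * A      # first column becomes (‖z‖, 0, 0, 0); later columns are modified too
```

The second step applies the same idea to rows 2 through $$m$$ of the second column of this result, and so on, which is where the padding described next comes in.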

You may be wondering what happened to the $$\mathbf{Q}$$ in Householder QR. Each Householder reflector is orthogonal but not full-size. We have to pad it out to represent algebraically the fact that a block of the first rows is left alone. Given a reflector $$\mathbf{P}_k$$ of square size $$m-k+1$$, we define

$$
\mathbf{Q}_k =
\begin{bmatrix}
\mathbf{I}_{k-1} & \boldsymbol{0} \\
\boldsymbol{0} & \mathbf{P}_k
\end{bmatrix}.
$$

It is trivial to show that $$\mathbf{Q}_k$$ is also orthogonal. Then

$$
\mathbf{Q}_n \mathbf{Q}_{n-1}\cdots \mathbf{Q}_1 \mathbf{A} = \mathbf{R}.
$$

But $$\mathbf{Q}_n \mathbf{Q}_{n-1}\cdots \mathbf{Q}_1$$ is orthogonal too, and we multiply on the left by its transpose to get $$\mathbf{A}=\mathbf{Q}\mathbf{R}$$, where $$\mathbf{Q} = (\mathbf{Q}_n \mathbf{Q}_{n-1}\cdots \mathbf{Q}_1)^T$$. We don’t even need to form these matrices explicitly. Writing

$$
\mathbf{Q}^T = \mathbf{Q}_n \mathbf{Q}_{n-1}\cdots \mathbf{Q}_1 = \mathbf{Q}_n \Bigl( \mathbf{Q}_{n-1}\bigl(\cdots (\mathbf{Q}_1\mathbf{I})\cdots\bigr)\Bigr),
$$

we can build $$\mathbf{Q}^T$$ iteratively by starting with the identity and doing the same row operations as on $$\mathbf{A}$$. Creating $$\mathbf{Q}^T$$ with row operations on $$\mathbf{I}$$ uses much less memory than building the $$\mathbf{Q}_k$$ matrices explicitly.
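Still, forming the padded matrices once, on a small example, makes the algebra above concrete. The following sketch (the helper name qr_padded and the random test matrix are our own illustration, not from the text) builds each $$\mathbf{Q}_k$$ explicitly and accumulates the product; qrfact below reaches the same factorization without ever forming these matrices.

```julia
using LinearAlgebra

# Illustration only: QR via explicitly padded Qₖ matrices (wasteful but clear).
function qr_padded(A)
    m, n = size(A)
    R = float(copy(A))
    Qt = Matrix(1.0I, m, m)                  # accumulates Qₙ⋯Q₁ = Qᵀ
    for k in 1:n
        z = R[k:m, k]
        v = norm(z) * [1; zeros(m-k)] - z    # sign choice as in (72); qrfact uses a safer variant
        P = I - 2 * (v * v') / (v' * v)      # reflector of size m-k+1
        Qk = Matrix(1.0I, m, m)
        Qk[k:m, k:m] = P                     # pad with an identity block
        R = Qk * R
        Qt = Qk * Qt
    end
    return Qt', R
end

A = rand(6, 4)
Q, R = qr_padded(A)
opnorm(Q * R - A), opnorm(Q' * Q - I)        # both ≈ 0 up to roundoff
```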

The algorithm we have described is encapsulated in qrfact. There is one more refinement in it, though. As indicated by (74), the application of a reflector $$\mathbf{P}$$ to a vector does not require the formation of the matrix $$\mathbf{P}$$ itself. Its effect can be computed directly from the vector $$\mathbf{v}$$, as is shown in qrfact.

Function 30 (qrfact)

QR factorization by Householder reflections.

```julia
 1  """
 2  qrfact(A)
 3  
 4  QR factorization by Householder reflections. Returns Q and R.
 5  """
 6  function qrfact(A)
 7  
 8  m,n = size(A)
 9  Qt = Matrix(Diagonal(ones(m)))
10  R = float(copy(A))
11  for k in 1:n
12      z = R[k:m,k]
13      v = [ -sign(z[1])*norm(z) - z[1]; -z[2:end] ]
14      nrmv = norm(v)
15      if nrmv < eps() continue; end    # skip this iteration
16      v = v / nrmv;                    # simplifies other formulas
17      # Apply the reflection to each relevant column of A and Q
18      for j in 1:n
19          R[k:m,j] -= v*( 2*(v'*R[k:m,j]) )
20      end
21      for j in 1:m
22          Qt[k:m,j] -= v*( 2*(v'*Qt[k:m,j]) )
23      end
24  end
25  
26  return Qt',triu(R)
27  end
```

(The function uses Diagonal, norm, and triu, so the LinearAlgebra standard library must be loaded first.)

The Julia command qr works similarly to, but more efficiently than, qrfact. It finds the factorization in $$\sim(2m n^2-\tfrac{2}{3}n^3)$$ flops asymptotically.
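For instance, assuming qrfact above is in scope (and LinearAlgebra is loaded), both factorizations can be checked on a random matrix; the test matrix below is illustrative only.

```julia
using LinearAlgebra

A = rand(8, 5)

Q, R = qrfact(A)                         # Function 30 above
opnorm(Q * R - A), opnorm(Q' * Q - I)    # both ≈ 0: A = QR with orthogonal Q

F = qr(A)                                # built-in, more efficient
Qthin = Matrix(F.Q)                      # thin m×n orthogonal factor
opnorm(Qthin * F.R - A)                  # ≈ 0 as well
```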

## Exercises

1. ⌨ Find a Householder reflector $$\mathbf{P}$$ such that

$$
\mathbf{P} \begin{bmatrix} -6 \\ 2 \\ 9 \end{bmatrix} = \begin{bmatrix} 11 \\ 0 \\ 0 \end{bmatrix}.
$$
2. ✍ Prove the unfinished items in the reflector theorem, namely that a Householder reflector $$\mathbf{P}$$ is symmetric and orthogonal.

3. ✍ Let $$\mathbf{P}$$ be a Householder reflector as in (73).

(a) Find a vector $$\mathbf{u}$$ such that $$\mathbf{P}\mathbf{u} = -\mathbf{u}$$. (Fig. 4 may be of help.)

(b) What algebraic condition is necessary and sufficient for a vector $$\mathbf{w}$$ to satisfy $$\mathbf{P}\mathbf{w}=\mathbf{w}$$? In $$n$$ dimensions, how many such linearly independent vectors are there?

4. ✍ Under certain circumstances, computing the vector $$\mathbf{v}$$ in (72) could lead to subtractive cancellation, which is why line 13 of qrfact reads as it does. Devise an example that causes subtractive cancellation if (72) is used.

5. ✍ Suppose QR factorization is used to compute the solution of a square linear system, $$\mathbf{A}\mathbf{x}=\mathbf{b}$$; i.e., let $$m=n$$.

(a) Find an asymptotic flop count for this procedure, and compare to the LU factorization algorithm.

(b) Show that $$\kappa_2(\mathbf{A})=\kappa_2(\mathbf{R})$$.

6. ✍ Show that $$\kappa_2(\mathbf{A})=\kappa_2(\mathbf{R})$$ when $$\mathbf{A}$$ is not square. (Hint: You can’t take an inverse of $$\mathbf{A}$$ or $$\mathbf{R}$$.)

7. ⌨ Modify qrfact so that the loops in lines 18–23 are removed. In other words, update the matrices A and Q with a single statement for each. Test your code against the original code on an example to verify it.

8. Another algorithmic technique for orthogonally introducing zeros into a matrix is the Givens rotation. For a given 2-vector $$[\alpha\;\; \beta]$$, an angle $$\theta$$ is chosen so that

$$
\begin{bmatrix} \cos(\theta) & \sin(\theta) \\ -\sin(\theta) & \cos(\theta) \end{bmatrix}
\begin{bmatrix} \alpha \\ \beta \end{bmatrix} =
\begin{bmatrix} \sqrt{\alpha^2 + \beta^2} \\ 0 \end{bmatrix}.
$$

(a) ✍ Given $$\alpha$$ and $$\beta$$, show how to compute $$\theta$$.

(b) ⌨ Given the vector $$\mathbf{z}=[1\;2\;3\;4\;5]^T$$, use Julia to find a sequence of Givens rotations that transforms $$\mathbf{z}$$ into the vector $$\| \mathbf{z} \|\mathbf{e}_1$$. (Hint: You can operate only on pairs of elements at a time, introducing a zero at the lower of the two positions.)