Skip to article frontmatterSkip to article content

Symmetry and definiteness

As we saw in Exploiting matrix structure, symmetry can simplify the LU factorization into the symmetric form A=LDLT\mathbf{A}=\mathbf{L}\mathbf{D}\mathbf{L}^T. Important specializations occur as well for the eigenvalue and singular value factorizations. In this section we stay with complex-valued matrices, so we are interested in the case when A=A\mathbf{A}^*=\mathbf{A}, i.e., A\mathbf{A} is hermitian. However, we often loosely speak of symmetry to mean this property even in the complex case. All of the statements in this section easily specialize to the real case.

7.4.1Normality

Suppose now that A=A\mathbf{A}^*=\mathbf{A} and that A=USV\mathbf{A}=\mathbf{U}\mathbf{S}\mathbf{V}^* is an SVD. Since S\mathbf{S} is real and square, we have

A=VSU=VSU,\mathbf{A}^* = \mathbf{V} \mathbf{S}^* \mathbf{U}^* = \mathbf{V} \mathbf{S} \mathbf{U}^*,

and it’s tempting to conclude that U=V\mathbf{U}=\mathbf{V}. Happily, this is nearly true. The following theorem is typically proved in an advanced linear algebra course.

Another way to state the result of this theorem is that a hermitian matrix has real eigenvalues and a complete set of orthonormal eigenvectors—that is, it is normal. Because hermitian matrices are normal, their eigenvalue condition number is guaranteed to be 1 by Theorem 7.2.3.

The converse of Theorem 7.4.1 is also true: every normal matrix with real eigenvalues is hermitian. This was illustrated in Demo 7.2.3.

For a hermitian matrix, the EVD

A=VDV1=VDV\mathbf{A}=\mathbf{V}\mathbf{D}\mathbf{V}^{-1}=\mathbf{V} \mathbf{D} \mathbf{V}^*

is almost an SVD.

7.4.2Rayleigh quotient

Recall that for a matrix A\mathbf{A} and compatible vector x\mathbf{x}, the quadratic form xAx\mathbf{x}^* \mathbf{A} \mathbf{x} is a scalar.

If v\mathbf{v} is an eigenvector such that Av=λv\mathbf{A} \mathbf{v}=\lambda \mathbf{v}, then one easily calculates that RA(v)=λ.R_{\mathbf{A}}(\mathbf{v})=\lambda. That is, the Rayleigh quotient maps an eigenvector into its associated eigenvalue.

If A=A\mathbf{A}^*=\mathbf{A}, then the Rayleigh quotient has another interesting property: RA(v)=0\nabla R_{\mathbf{A}}(\mathbf{v})=\boldsymbol{0} if v\mathbf{v} is an eigenvector. By a multidimensional Taylor series, then,

RA(v+ϵz)=RA(v)+0+O(ϵ2)=λ+O(ϵ2),R_{\mathbf{A}}(\mathbf{v}+\epsilon\mathbf{z}) = R_{\mathbf{A}}(\mathbf{v}) + 0 + O( \epsilon^2) = \lambda + O( \epsilon^2),

as ϵ0\epsilon\to 0. The conclusion is that a good estimate of an eigenvector becomes an even better estimate of an eigenvalue.

7.4.3Definite, semidefinite, and indefinite matrices

In the real case, we called a symmetric matrix A\mathbf{A} symmetric positive definite (SPD) if xTAx>0\mathbf{x}^T \mathbf{A}\mathbf{x} > 0 for all nonzero vectors x\mathbf{x}. In the complex case the relevant property is hermitian positive definite (HPD), meaning that A=A\mathbf{A}^*=\mathbf{A} and xAx>0\mathbf{x}^* \mathbf{A}\mathbf{x} > 0 for all complex vectors x\mathbf{x}. Putting this property together with the Rayleigh quotient leads to the following.

According to Theorem 7.4.3, for an HPD matrix, the EVD A=VDV\mathbf{A}=\mathbf{V}\mathbf{D}\mathbf{V}^* meets all the requirements of the SVD, provided the ordering of eigenvalues is chosen appropriately.

A hermitian matrix with all negative eigenvalues is called negative definite, and one with eigenvalues of different signs is indefinite. Finally, if one or more eigenvalues is zero and the rest have one sign, it is positive or negative semidefinite.

7.4.4Exercises

  1. ✍ Each line below is an EVD for a hermitian matrix. State whether the matrix is definite, indefinite, or semidefinite. Then state whether the given factorization is also an SVD, and if it is not, modify it to find an SVD.

    (a) [0001]=[0110][1000][0110]\begin{bmatrix} 0 & 0 \\ 0 & -1 \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} -1 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}

    (b) [4221]=[10.50.51][5000][0.80.40.40.8]\begin{bmatrix} 4 & -2 \\ -2 & 1 \end{bmatrix} = \begin{bmatrix} 1 & -0.5 \\ -0.5 & -1 \end{bmatrix} \begin{bmatrix} 5 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 0.8 & -0.4 \\ -0.4 & -0.8 \end{bmatrix}

    (c) [5335]=[αααα][2008][αααα],α=1/2\begin{bmatrix} -5 & 3\\ 3 & -5 \end{bmatrix} = \begin{bmatrix} \alpha & \alpha \\ \alpha & -\alpha \end{bmatrix} \begin{bmatrix} -2 & 0 \\ 0 & -8 \end{bmatrix} \begin{bmatrix} \alpha & \alpha \\ \alpha & -\alpha \end{bmatrix}, \quad\alpha=1/\sqrt{2}

  2. ⌨ The matrix names below are found in MatrixDepot for Julia, gallery for MATLAB, and rogues for Python. You will have to adjust the syntax accordingly. For each matrix, determine whether it is positive definite, negative definite, positive or negative semidefinite, or indefinite.

    (a) pei(5) 6I - 6 \mathbf{I}

    (b) hilb(8) 2I - 2 \mathbf{I}

    (c) dingdong(20)

    (d) lehmer(100)

    (e) fiedler(200)

  3. ✍ Prove true, or give a counterexample: If A\mathbf{A} and B\mathbf{B} are hermitian matrices of the same size, then

    RA+B(x)=RA(x)+RB(x).R_{\mathbf{A}+\mathbf{B}}(\mathbf{x}) = R_{\mathbf{A}}(\mathbf{x})+R_{\mathbf{B}}(\mathbf{x}).
  4. ⌨ The range of the function RA(x)R_{\mathbf{A}}(\mathbf{x}) is a subset of the complex plane known as the field of values of the matrix A\mathbf{A}. Use 500 random vectors to plot points in the field of values of A=[102020201]\mathbf{A} = \displaystyle \begin{bmatrix} 1 & 0 & -2\\ 0 & 2 & 0\\ -2 & 0 & 1 \end{bmatrix}. Then compute its eigenvalues and guess what the exact field of values is.

  5. ✍ Let A=[3220].\mathbf{A}=\displaystyle \begin{bmatrix} 3 & -2 \\ -2 & 0 \end{bmatrix}.

    (a) Write out RA(x)R_{\mathbf{A}}(\mathbf{x}) explicitly as a function of x1x_1 and x2x_2.

    (b) Find RA(x)R_{\mathbf{A}}(\mathbf{x}) for x1=1x_1=1, x2=2x_2=2.

    (c) Find the gradient vector RA(x)\nabla R_{\mathbf{A}}(\mathbf{x}).

    (d) Show that the gradient vector is zero when x1=1x_1=1, x2=2x_2=2.

  6. ✍ A skew-Hermitian matrix is one that satisfies A=A\mathbf{A}^*=-\mathbf{A}. Show that if A\mathbf{A} is skew-Hermitian, then RAR_{\mathbf{A}} is imaginary-valued.

  7. ⌨ Thanks largely to Theorem 7.4.1, the eigenvalue problem for symmetric/hermitian matrices is easier than for general matrices.

    (a) Let A\mathbf{A} be a 1000×10001000\times 1000 random real matrix, and let S=A+AT\mathbf{S}=\mathbf{A}+\mathbf{A}^T. Time finding the eigenvalues of A\mathbf{A} and then of S\mathbf{S}. You should find that the computation for S\mathbf{S} is around an order of magnitude faster.

    (b) Perform the experiment from part (a) on n×nn\times n matrices for n=200,300,,1600n=200,300,\ldots,1600. Plot running time as a function of nn for both matrices on a single log-log plot. Is the ratio of running times roughly constant, or does it grow with nn?