
QR factorizations

Recall that each elementary row operation in Gaussian elimination can be interpreted as multiplication by a particular matrix. When thinking this way, you realize you are actually factoring a matrix when running Gaussian elimination. This realization results in the LU factorization with pivoting.

Similarly, we can view Gram-Schmidt as producing a factorization, called a QR factorization.

First we notice that we can unfold the recursive definition given for Gram-Schmidt to write:

$$\begin{aligned} v_1 & = \langle v_1, w_1 \rangle w_1 \\ v_2 & = \langle v_2, w_2 \rangle w_2 + \langle v_2, w_1 \rangle w_1 \\ v_3 & = \langle v_3, w_3 \rangle w_3 + \langle v_3, w_2 \rangle w_2 + \langle v_3, w_1 \rangle w_1 \\ & \vdots \\ v_d & = \langle v_d, w_d \rangle w_d + \langle v_d, w_{d-1} \rangle w_{d-1} + \cdots + \langle v_d, w_1 \rangle w_1 \end{aligned}$$

We can combine the $w_i$ as columns of a single matrix $Q$, and we let $R$ be the matrix with $R_{ij} := \langle v_j, w_i \rangle$.
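To make the bookkeeping concrete, here is a short NumPy sketch of classical Gram-Schmidt that records the coefficients $\langle v_j, w_i \rangle$ in an upper triangular matrix as it goes. The function name `gram_schmidt_qr` and the sample matrix are illustrative choices, not from the text, and the sketch assumes the columns of $A$ are linearly independent.

```python
import numpy as np

def gram_schmidt_qr(A):
    """Classical Gram-Schmidt on the columns of A (assumed independent).

    Returns Q with orthonormal columns w_i and upper triangular R with
    R[i, j] = <v_j, w_i>, so that A = Q R.
    """
    n = A.shape[1]
    Q = np.zeros_like(A, dtype=float)
    R = np.zeros((n, n))
    for j in range(n):
        w = A[:, j].astype(float).copy()
        for i in range(j):
            R[i, j] = Q[:, i] @ A[:, j]   # coefficient <v_j, w_i>
            w -= R[i, j] * Q[:, i]        # remove the w_i component
        R[j, j] = np.linalg.norm(w)       # equals <v_j, w_j>
        Q[:, j] = w / R[j, j]
    return Q, R

# Columns of A are the input basis v_1, v_2, v_3.
A = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
Q, R = gram_schmidt_qr(A)
```

Reading off the unfolded equations above column by column is exactly the statement $A = QR$.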

The matrix $R$ is, from its definition, upper triangular. For this reason, the QR decomposition is sometimes called the QU decomposition.

The matrix $Q$ is an orthogonal matrix. Recall that this means the transpose of $Q$ is its inverse: $Q^T = Q^{-1}$.

Lemma. An $n \times n$ matrix $A$ is orthogonal if and only if its columns form an orthonormal basis.

Proof. (Expand to view)

Let’s look at $Q^TQ$. The $(i,j)$ entry of $Q^TQ$ is
$$q_{1j}q_{1i} + q_{2j}q_{2i} + \cdots + q_{nj}q_{ni} = \mathbf{C}_j^T \mathbf{C}_i$$
If we want this to be the identity matrix, it is equivalent to requiring that
$$\mathbf{C}_j^T \mathbf{C}_i = \begin{cases} 1 & i = j \\ 0 & i \neq j \end{cases}$$
But this is exactly expressing that the vectors $\mathbf{C}_1,\ldots,\mathbf{C}_n$ are an orthonormal set in $\mathbb{R}^n$. Any orthonormal set is linearly independent, and we have as many vectors as the dimension of $\mathbb{R}^n$. Thus we have a basis.
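As a quick numerical sanity check of the lemma, a rotation matrix (a standard example, not taken from the text) has columns forming an orthonormal basis of $\mathbb{R}^2$, so $Q^TQ$ should be the identity and $Q^T$ should be the inverse:

```python
import numpy as np

# The columns of a rotation matrix are an orthonormal basis of R^2.
theta = 0.7  # an arbitrary angle
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# Entry (i, j) of Q^T Q is the dot product of columns C_i and C_j.
identity_check = np.allclose(Q.T @ Q, np.eye(2))
inverse_check = np.allclose(np.linalg.inv(Q), Q.T)
```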

Knowing this, we have the following.

Theorem (QR Decomposition). For an invertible $n \times n$ matrix $A$ with entries in $\mathbb{R}$, there exists an orthogonal matrix $Q$ and an upper triangular matrix $R$ with $A = QR$.

Proof. (Expand to view)

We take $Q$ to have columns given by applying Gram-Schmidt to the columns of $A$, and we take $R$ to be the matrix of pairings between columns of $Q$ and those of $A$ as above.
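In practice one rarely runs this by hand; for instance, NumPy's `np.linalg.qr` (a standard library routine, mentioned here as an aside) returns exactly such a pair for an invertible real matrix:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
Q, R = np.linalg.qr(A)  # Q orthogonal, R upper triangular

product_ok = np.allclose(Q @ R, A)
orthogonal_ok = np.allclose(Q.T @ Q, np.eye(2))
upper_ok = np.allclose(R, np.triu(R))
```

Note that library routines may make different sign choices than Gram-Schmidt (the diagonal of $R$ need not be positive), but the identity $A = QR$ still holds.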

There is another useful characterization of orthogonal matrices. Given an inner product $\langle -,- \rangle$ on a vector space $k^n$, we say that an $n \times n$ matrix $A$ preserves the inner product if
$$\langle Aw, Av \rangle = \langle w, v \rangle$$
In other words, applying $A$ to the two inputs and then taking the product returns the same number as just applying the product to the two inputs directly.

Lemma. An $n \times n$ matrix $A$ preserves the standard inner product on $k^n$ if and only if $A$ is orthogonal.

Proof. (Expand to view)

We have
$$\langle Av, Aw \rangle = (Av)^T Aw = v^T A^TA w.$$
If $A$ is orthogonal, then this is equal to $v^T w = \langle v, w \rangle$.

If we assume $\langle Av, Aw \rangle = \langle v, w \rangle$ for all $v, w \in k^n$, then taking $v = e_i$ and $w = e_j$ we have
$$(A^TA)_{ij} = e_i^T A^TA e_j = e_i^T e_j$$
Since
$$\langle e_i, e_j \rangle = \begin{cases} 1 & i = j \\ 0 & i \neq j \end{cases}$$
we must have $A^TA = I$.
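A small numerical instance of this lemma (the coordinate-swap matrix below is an illustrative choice, not from the text): swapping coordinates is orthogonal, so it preserves the standard dot product.

```python
import numpy as np

# Swapping the two coordinates is an orthogonal transformation.
A = np.array([[0.0, 1.0],
              [1.0, 0.0]])
v = np.array([1.0, 2.0])
w = np.array([3.0, -1.0])

lhs = (A @ v) @ (A @ w)  # <Av, Aw>
rhs = v @ w              # <v, w>
```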

As mentioned earlier, in Sage there is a method A.gram_schmidt() for a matrix $A$. It returns a decomposition very close to the QR one.

  • First it consumes a matrix $A$ whose rows are the vectors in the basis.
  • It returns a pair $(\tilde{Q}, L)$ with $A = L\tilde{Q}$, where $\tilde{Q}$ has orthogonal rows and $L$ is lower triangular. The rows of $\tilde{Q}$ are the $u_i$'s from Gram-Schmidt, while $L$ has $1$'s along the diagonal and $L_{i,j} := \frac{\langle v_i, u_j \rangle}{\langle u_j, u_j \rangle}$ for the entries below the diagonal.
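To see this row-based convention in action, here is a hypothetical NumPy transcription using the same vectors as the Sage example below; it recovers the orthogonal rows Sage reports and checks the factorization row by row.

```python
import numpy as np

# Rows of A are the input vectors v_i, as in the Sage example below.
A = np.array([[1.0, 2.0, 0.2],
              [1.0, 2.0, 3.0],
              [-1.0, 0.0, 0.0]])

n = A.shape[0]
Qt = np.zeros_like(A)  # rows u_i: orthogonal but not normalized
L = np.eye(n)          # unit lower triangular coefficients
for i in range(n):
    u = A[i].copy()
    for j in range(i):
        L[i, j] = (A[i] @ Qt[j]) / (Qt[j] @ Qt[j])  # <v_i, u_j>/<u_j, u_j>
        u -= L[i, j] * Qt[j]
    Qt[i] = u

# Sage's documented output for these vectors.
expected = np.array([[1.0, 2.0, 0.2],
                     [-1/9, -2/9, 25/9],
                     [-4/5, 2/5, 0.0]])
```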

There is a function on lists of vectors in Sage that will perform Gram-Schmidt alone.

sage: B = [vector([1,2,1/5]), vector([1,2,3]), vector([-1,0,0])]
sage: from sage.modules.misc import gram_schmidt
sage: G, mu = gram_schmidt(B)
sage: G
[(1, 2, 1/5), (-1/9, -2/9, 25/9), (-4/5, 2/5, 0)]

The example is taken from its documentation.