
QR factorizations

Recall that each elementary row operation in Gaussian elimination can be interpreted as multiplication by a particular matrix. When thinking this way, you realize you are actually factoring a matrix when running Gaussian elimination. This realization results in the LU factorization with pivoting.

Similarly, we can view Gram-Schmidt as producing a factorization, called a QR factorization.

First we notice that we can unfold the recursive definition given for Gram-Schmidt to write:

$$\begin{aligned} v_1 & = \langle v_1, w_1 \rangle w_1 \\ v_2 & = \langle v_2, w_2 \rangle w_2 + \langle v_2, w_1 \rangle w_1 \\ v_3 & = \langle v_3, w_3 \rangle w_3 + \langle v_3, w_2 \rangle w_2 + \langle v_3, w_1 \rangle w_1 \\ & \vdots \\ v_d & = \langle v_d, w_d \rangle w_d + \langle v_d, w_{d-1} \rangle w_{d-1} + \cdots + \langle v_d, w_1 \rangle w_1 \end{aligned}$$

We can combine the $w_i$ as columns of a single matrix $Q$, and we let $R$ be the matrix with $R_{ij} := \langle v_j, w_i \rangle$.
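To make the bookkeeping concrete, here is a short NumPy sketch of classical Gram-Schmidt that records the coefficients $\langle v_j, w_i \rangle$ in an upper triangular matrix as it goes. The function name `gram_schmidt_qr` and the sample matrix are illustrative choices, not from the text, and the sketch assumes the columns of $A$ are linearly independent.

```python
import numpy as np

def gram_schmidt_qr(A):
    """Classical Gram-Schmidt on the columns of A (assumed independent).

    Returns Q with orthonormal columns w_i and upper triangular R with
    R[i, j] = <v_j, w_i>, so that A = Q R.
    """
    n = A.shape[1]
    Q = np.zeros_like(A, dtype=float)
    R = np.zeros((n, n))
    for j in range(n):
        w = A[:, j].astype(float).copy()
        for i in range(j):
            R[i, j] = Q[:, i] @ A[:, j]   # coefficient <v_j, w_i>
            w -= R[i, j] * Q[:, i]        # remove the w_i component
        R[j, j] = np.linalg.norm(w)       # equals <v_j, w_j>
        Q[:, j] = w / R[j, j]
    return Q, R

# Columns of A are the input basis v_1, v_2, v_3.
A = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
Q, R = gram_schmidt_qr(A)
```

Reading off the unfolded equations above column by column is exactly the statement $A = QR$.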

The matrix $R$ is, from its definition, upper triangular. For this reason, the QR decomposition is sometimes called the QU decomposition.

The matrix $Q$ is an orthogonal matrix. Recall that this means the transpose of $Q$ is its inverse: $Q^T = Q^{-1}$.

Lemma. An $n \times n$ matrix $A$ is orthogonal if and only if its columns form an orthonormal basis.

Proof. (Expand to view)

Let’s look at $Q^TQ$. The $(i,j)$ entry of $Q^TQ$ is
$$q_{1j}q_{1i} + q_{2j}q_{2i} + \cdots + q_{nj}q_{ni} = \mathbf{C}_j^T \mathbf{C}_i$$
If we want this to be the identity matrix, it is equivalent to requiring that
$$\mathbf{C}_j^T \mathbf{C}_i = \begin{cases} 1 & i = j \\ 0 & i \neq j \end{cases}$$
But this is exactly expressing that the vectors $\mathbf{C}_1,\ldots,\mathbf{C}_n$ are an orthonormal set in $\mathbb{R}^n$. Any orthonormal set is linearly independent, and we have as many vectors as the dimension of $\mathbb{R}^n$. Thus we have a basis.
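As a quick numerical sanity check of the lemma, a rotation matrix (a standard example, not taken from the text) has columns forming an orthonormal basis of $\mathbb{R}^2$, so $Q^TQ$ should be the identity and $Q^T$ should be the inverse:

```python
import numpy as np

# The columns of a rotation matrix are an orthonormal basis of R^2.
theta = 0.7  # an arbitrary angle
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# Entry (i, j) of Q^T Q is the dot product of columns C_i and C_j.
identity_check = np.allclose(Q.T @ Q, np.eye(2))
inverse_check = np.allclose(np.linalg.inv(Q), Q.T)
```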

Knowing this, we have the following.

Theorem (QR Decomposition). For an invertible $n \times n$ matrix $A$ with entries in $\mathbb{R}$, there exists an orthogonal matrix $Q$ and an upper triangular matrix $R$ with $A = QR$.

Proof. (Expand to view)

We take $Q$ to have columns given by applying Gram-Schmidt to the columns of $A$, and we take $R$ to be the matrix of pairings between columns of $Q$ and those of $A$ as above.
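In practice one rarely runs this by hand; for instance, NumPy's `np.linalg.qr` (a standard library routine, mentioned here as an aside) returns exactly such a pair for an invertible real matrix:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
Q, R = np.linalg.qr(A)  # Q orthogonal, R upper triangular

product_ok = np.allclose(Q @ R, A)
orthogonal_ok = np.allclose(Q.T @ Q, np.eye(2))
upper_ok = np.allclose(R, np.triu(R))
```

Note that library routines may make different sign choices than Gram-Schmidt (the diagonal of $R$ need not be positive), but the identity $A = QR$ still holds.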

There is another useful characterization of orthogonal matrices. Given an inner product $\langle -,- \rangle$ on a vector space $k^n$, we say that an $n \times n$ matrix $A$ preserves the inner product if
$$\langle Aw, Av \rangle = \langle w, v \rangle$$
In other words, applying $A$ to the two inputs and then taking the product returns the same number as just applying the product to the two inputs directly.

Lemma. An $n \times n$ matrix $A$ preserves the standard inner product on $k^n$ if and only if $A$ is orthogonal.

Proof. (Expand to view)

We have
$$\langle Av, Aw \rangle = (Av)^T Aw = v^T A^TA w.$$
If $A$ is orthogonal, then this is equal to $v^T w = \langle v, w \rangle$.

If we assume $\langle Av, Aw \rangle = \langle v, w \rangle$ for all $v, w \in k^n$, then taking $v = e_i$ and $w = e_j$ we have
$$(A^TA)_{ij} = e_i^T A^TA e_j = e_i^T e_j$$
Since
$$\langle e_i, e_j \rangle = \begin{cases} 1 & i = j \\ 0 & i \neq j \end{cases}$$
we must have $A^TA = I$.
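A small numerical instance of this lemma (the coordinate-swap matrix below is an illustrative choice, not from the text): swapping coordinates is orthogonal, so it preserves the standard dot product.

```python
import numpy as np

# Swapping the two coordinates is an orthogonal transformation.
A = np.array([[0.0, 1.0],
              [1.0, 0.0]])
v = np.array([1.0, 2.0])
w = np.array([3.0, -1.0])

lhs = (A @ v) @ (A @ w)  # <Av, Aw>
rhs = v @ w              # <v, w>
```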

As mentioned earlier, in Sage there is a method A.gram_schmidt() for a matrix $A$. It returns a decomposition very close to the QR one.

  • First it consumes a matrix $A$ whose rows are the vectors in the basis.
  • It returns a pair $(\tilde{Q}, L)$ with $A = L\tilde{Q}$, where $\tilde{Q}$ has orthogonal rows and $L$ is lower triangular. The rows of $\tilde{Q}$ are the $u_i$'s from Gram-Schmidt, while $L$ has $1$'s along the diagonal and $L_{i,j} := \frac{\langle v_i, u_j \rangle}{\langle u_j, u_j \rangle}$ for the entries below the diagonal.
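To see this row-based convention in action, here is a hypothetical NumPy transcription using the same vectors as the Sage example below; it recovers the orthogonal rows Sage reports and checks the factorization row by row.

```python
import numpy as np

# Rows of A are the input vectors v_i, as in the Sage example below.
A = np.array([[1.0, 2.0, 0.2],
              [1.0, 2.0, 3.0],
              [-1.0, 0.0, 0.0]])

n = A.shape[0]
Qt = np.zeros_like(A)  # rows u_i: orthogonal but not normalized
L = np.eye(n)          # unit lower triangular coefficients
for i in range(n):
    u = A[i].copy()
    for j in range(i):
        L[i, j] = (A[i] @ Qt[j]) / (Qt[j] @ Qt[j])  # <v_i, u_j>/<u_j, u_j>
        u -= L[i, j] * Qt[j]
    Qt[i] = u

# Sage's documented output for these vectors.
expected = np.array([[1.0, 2.0, 0.2],
                     [-1/9, -2/9, 25/9],
                     [-4/5, 2/5, 0.0]])
```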

There is a function on lists of vectors in Sage that will perform Gram-Schmidt alone.

sage: B = [vector([1,2,1/5]), vector([1,2,3]), vector([-1,0,0])]
sage: from sage.modules.misc import gram_schmidt
sage: G, mu = gram_schmidt(B)
sage: G
[(1, 2, 1/5), (-1/9, -2/9, 25/9), (-4/5, 2/5, 0)]

The example is taken from its documentation.