# Projection Methods
## General framework for Projection Methods
Given two subspaces \(\mathbf{L},\mathbf{K}\), and an initial guess for the solution \(\boldsymbol{x}^{\left( 0 \right)}\), we want to find \(\boldsymbol{\delta }\in \mathbf{K}\) such that \(\boldsymbol{x}=\boldsymbol{x}^{\left( 0 \right)}+\boldsymbol{\delta }\) produces a residual \(\boldsymbol{r}=\boldsymbol{b}-\boldsymbol{Ax}\) that is orthogonal to \(\mathbf{L}\).
Note that the orthogonality condition can be written in terms of the initial residual \(\boldsymbol{r}^{\left( 0 \right)}=\boldsymbol{b}-\boldsymbol{Ax}^{\left( 0 \right)}\):

$$
\boldsymbol{b}-\boldsymbol{A}\left( \boldsymbol{x}^{\left( 0 \right)}+\boldsymbol{\delta } \right) \perp \mathbf{L}\quad \Longleftrightarrow \quad \boldsymbol{r}^{\left( 0 \right)}-\boldsymbol{A\delta }\perp \mathbf{L}.
$$
Assume \(\boldsymbol{x}\in \mathbb{R} ^m\) (with \(n\leqslant m\)). Let \(\left\{ \boldsymbol{v}_i \right\} _{i=1}^{n}\) be a basis of \(\mathbf{K}\), and let \(\left\{ \boldsymbol{w}_i \right\} _{i=1}^{n}\) be a basis of \(\mathbf{L}\). Note that we assume \(\mathrm{dim}\left( \mathbf{K} \right) =\mathrm{dim}\left( \mathbf{L} \right) =n\) here. Then the orthogonality condition is equivalent to

$$
\boldsymbol{w}_{i}^{T}\left( \boldsymbol{r}^{\left( 0 \right)}-\boldsymbol{A\delta } \right) =0,\qquad i=1,\dots ,n.
$$
If we denote

$$
\boldsymbol{V}=\left[ \begin{matrix} \boldsymbol{v}_1& \cdots& \boldsymbol{v}_n\\ \end{matrix} \right] \in \mathbb{R} ^{m\times n},\qquad \boldsymbol{W}=\left[ \begin{matrix} \boldsymbol{w}_1& \cdots& \boldsymbol{w}_n\\ \end{matrix} \right] \in \mathbb{R} ^{m\times n},
$$

then the \(n\) conditions can be written compactly as

$$
\boldsymbol{W}^T\left( \boldsymbol{r}^{\left( 0 \right)}-\boldsymbol{A\delta } \right) =\boldsymbol{0}.
$$
For \(\boldsymbol{\delta }\in \mathbf{K}\), we have \(\boldsymbol{\delta }=\boldsymbol{Vy}\) for some \(\boldsymbol{y}\in \mathbb{R} ^n\). Substituting, we get

$$
\boldsymbol{W}^T\left( \boldsymbol{r}^{\left( 0 \right)}-\boldsymbol{AVy} \right) =\boldsymbol{0}.
$$

Then:

$$
\boldsymbol{W}^T\boldsymbol{AVy}=\boldsymbol{W}^T\boldsymbol{r}^{\left( 0 \right)}.
$$
Therefore, if \(\boldsymbol{W}^T\boldsymbol{AV}\) is invertible (note that we have assumed the dimensions of \(\mathbf{K}\) and \(\mathbf{L}\) are the same), we get

$$
\boldsymbol{y}=\left( \boldsymbol{W}^T\boldsymbol{AV} \right) ^{-1}\boldsymbol{W}^T\boldsymbol{r}^{\left( 0 \right)},
$$

where \(\boldsymbol{W}^T\boldsymbol{AV}\) is an \(n\times n\) matrix (in many cases \(n\) is chosen to be small; sometimes even \(n=1\)). Then:

$$
\boldsymbol{x}=\boldsymbol{x}^{\left( 0 \right)}+\boldsymbol{Vy}=\boldsymbol{x}^{\left( 0 \right)}+\boldsymbol{V}\left( \boldsymbol{W}^T\boldsymbol{AV} \right) ^{-1}\boldsymbol{W}^T\boldsymbol{r}^{\left( 0 \right)}.
$$
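As a quick numerical sanity check of this formula (a small NumPy sketch with arbitrarily chosen random matrices, not part of the derivation above), the updated residual is indeed orthogonal to the columns of \(\boldsymbol{W}\):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 5, 2
A = rng.standard_normal((m, m))    # generic square matrix (invertible with probability 1)
b = rng.standard_normal(m)
x0 = rng.standard_normal(m)
V = rng.standard_normal((m, n))    # columns form a basis of K
W = rng.standard_normal((m, n))    # columns form a basis of L

r0 = b - A @ x0
y = np.linalg.solve(W.T @ A @ V, W.T @ r0)   # the small n-by-n system
x = x0 + V @ y

print(W.T @ (b - A @ x))           # approximately [0, 0]: the new residual is orthogonal to L
```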
Algorithm (Projection Algorithm):
Until convergence, do:
- Select a pair of subspaces \(\mathbf{K},\mathbf{L}\);
- Choose bases \(\boldsymbol{V}=\left[ \begin{matrix} \boldsymbol{v}_1& \cdots& \boldsymbol{v}_n\\ \end{matrix} \right] , \boldsymbol{W}=\left[ \begin{matrix} \boldsymbol{w}_1& \cdots& \boldsymbol{w}_n\\ \end{matrix} \right]\);
- \(\boldsymbol{r}=\boldsymbol{b}-\boldsymbol{Ax}\);
- \(\boldsymbol{y}=\left( \boldsymbol{W}^T\boldsymbol{AV} \right) ^{-1}\boldsymbol{W}^T\boldsymbol{r}\);
- \(\boldsymbol{x}=\boldsymbol{x}+\boldsymbol{Vy}\);
End
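The following is a minimal NumPy sketch of this loop (the function name `projection_step` and the particular choice \(\mathbf{K}=\mathbf{L}=\mathrm{span}\left\{ \boldsymbol{r} \right\}\) are ours for illustration; with an SPD matrix this one-dimensional choice is the steepest-descent step):

```python
import numpy as np

def projection_step(A, b, x, V, W):
    """One projection step: correction delta = V y with delta in K = range(V),
    chosen so that the new residual b - A (x + delta) is orthogonal to L = range(W)."""
    r = b - A @ x                                 # current residual
    y = np.linalg.solve(W.T @ A @ V, W.T @ r)     # small n-by-n system
    return x + V @ y

# Illustration: SPD matrix, one-dimensional subspaces K = L = span{r}.
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x = np.zeros(2)
for _ in range(50):                               # "until convergence"
    r = b - A @ x
    if np.linalg.norm(r) < 1e-12:                 # crude stopping test
        break
    x = projection_step(A, b, x, r.reshape(-1, 1), r.reshape(-1, 1))
print(x, np.linalg.solve(A, b))                   # the two should approximately agree
```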
## Example: Gauss-Seidel Method
Example: Gauss-Seidel is a Projection Method obtained by taking \(\mathbf{K}=\mathbf{L}=\mathrm{span}\left\{ \boldsymbol{e}_i \right\}\) (the \(i\)-th standard basis vector). In this case \(\boldsymbol{V}=\boldsymbol{W}=\boldsymbol{e}_i\), so:

$$
\boldsymbol{W}^T\boldsymbol{AV}=\boldsymbol{e}_{i}^{T}\boldsymbol{Ae}_i=a_{ii},\qquad \boldsymbol{W}^T\boldsymbol{r}=r_i=b_i-\sum_j{a_{ij}x_j}.
$$

And the update becomes:

$$
\boldsymbol{x}\gets \boldsymbol{x}+\frac{r_i}{a_{ii}}\boldsymbol{e}_i,\qquad \text{i.e.}\qquad x_i\gets \frac{1}{a_{ii}}\left( b_i-\sum_{j\ne i}{a_{ij}x_j} \right) .
$$

Note that we require \(a_{ii}\ne 0\). Cycling over \(i=1,\dots ,m\), using the most recently updated entries of \(\boldsymbol{x}\) at each step, we end up with the Gauss-Seidel iteration.
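To make the correspondence concrete, here is a small NumPy sketch (the function name `gauss_seidel_sweep` is ours) in which one Gauss-Seidel sweep is realized as \(m\) one-dimensional projection steps:

```python
import numpy as np

def gauss_seidel_sweep(A, b, x):
    """One Gauss-Seidel sweep written as m one-dimensional projection steps
    with K = L = span{e_i}, i = 1, ..., m."""
    m = len(b)
    for i in range(m):
        r = b - A @ x                   # residual using the current (already updated) x
        y = r[i] / A[i, i]              # (W^T A V)^{-1} W^T r reduces to r_i / a_ii
        x = x + y * np.eye(m)[:, i]     # x <- x + y e_i : only the i-th entry changes
    return x

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])
x = np.zeros(3)
for _ in range(30):
    x = gauss_seidel_sweep(A, b, x)
print(x, np.linalg.solve(A, b))         # the two should approximately agree
```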
## Discussion on Projection Method
Now, consider the projection method again: in each step we compute

$$
\boldsymbol{x}\gets \boldsymbol{x}+\boldsymbol{V}\left( \boldsymbol{W}^T\boldsymbol{AV} \right) ^{-1}\boldsymbol{W}^T\boldsymbol{r}.
$$

The algorithm can be continued (which does not by itself guarantee convergence) as long as \(\boldsymbol{W}^T\boldsymbol{AV}\) is invertible.
Question: How do we ensure that \(\boldsymbol{W}^T\boldsymbol{AV}\) is invertible?
If \(\boldsymbol{A}\) is invertible, it is not necessarily true that \(\boldsymbol{W}^T\boldsymbol{AV}\) is also invertible. For example, consider

$$
\boldsymbol{A}=\left[ \begin{matrix} 0& 1\\ 1& 0\\ \end{matrix} \right] ,\qquad \boldsymbol{V}=\boldsymbol{W}=\boldsymbol{e}_1:
$$

here \(\boldsymbol{A}\) is invertible, but \(\boldsymbol{W}^T\boldsymbol{AV}=a_{11}=0\) is singular.
Theorem: Let \(\boldsymbol{A},\mathbf{L},\mathbf{K}\) satisfy either one of the following conditions:
- \(\boldsymbol{A}\) is SPD and \(\mathbf{L}=\mathbf{K}\);
- \(\boldsymbol{A}\) is invertible and \(\mathbf{L}=\boldsymbol{A} \mathbf{K}\).
Then the matrix \(\boldsymbol{W}^T\boldsymbol{AV}\) is nonsingular for any bases \(\boldsymbol{V},\boldsymbol{W}\) of \(\mathbf{K},\mathbf{L}\), respectively.
Proof: First, note that \(\mathbf{L}=\boldsymbol{A} \mathbf{K}\) means that \(\forall \boldsymbol{v}\in \mathbf{K},\ \boldsymbol{Av}\in \mathbf{L}\), and \(\forall \boldsymbol{u}\in \mathbf{L}\), there exists \(\boldsymbol{v}\in \mathbf{K}\) such that \(\boldsymbol{u}=\boldsymbol{Av}\).
First situation: Since \(\mathbf{L}=\mathbf{K}\), and \(\boldsymbol{V},\boldsymbol{W}\) are two bases of the same subspace, we have \(\boldsymbol{W}=\boldsymbol{VG}\), where \(\boldsymbol{G}\) is the (invertible) change of basis matrix. We can find that

$$
\boldsymbol{W}^T\boldsymbol{AV}=\boldsymbol{G}^T\boldsymbol{V}^T\boldsymbol{AV}.
$$

Since \(\boldsymbol{A}\) is SPD and \(\boldsymbol{V}\) has full column rank, \(\boldsymbol{V}^T\boldsymbol{AV}\) is SPD and hence invertible; \(\boldsymbol{G}^T\) is invertible as well. Then \(\boldsymbol{W}^T\boldsymbol{AV}\) is invertible.
Second situation: Since \(\mathbf{L}=\boldsymbol{A} \mathbf{K}\) and \(\boldsymbol{A}\) is invertible, the columns of \(\boldsymbol{AV}\) form a basis for \(\mathbf{L}\). Since \(\boldsymbol{W}\) is also a basis for \(\mathbf{L}\), there exists a change of basis matrix \(\boldsymbol{G}\) (invertible) such that \(\boldsymbol{AVG}=\boldsymbol{W}\). Then:

$$
\boldsymbol{W}^T\boldsymbol{AV}=\boldsymbol{G}^T\boldsymbol{V}^T\boldsymbol{A}^T\boldsymbol{AV}.
$$

\(\boldsymbol{A}^T\boldsymbol{A}\) is SPD because \(\boldsymbol{A}\) is nonsingular, so \(\boldsymbol{V}^T\boldsymbol{A}^T\boldsymbol{AV}\) is SPD and hence invertible. Since \(\boldsymbol{V}^T\boldsymbol{A}^T\boldsymbol{AV}\) and \(\boldsymbol{G}^T\) are both invertible, \(\boldsymbol{W}^T\boldsymbol{AV}\) is invertible.
Therefore, the projection method can be continued under one of these two conditions. End of proof.
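A quick numerical illustration of the second condition (a sketch with randomly generated matrices; the specific sizes and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 6, 3
A = rng.standard_normal((m, m))      # invertible with probability 1 (not SPD in general)
V = rng.standard_normal((m, n))      # basis of K (full column rank with probability 1)
G = rng.standard_normal((n, n))      # invertible change of basis, with probability 1
W = A @ V @ G                        # columns of W span L = A K

M = W.T @ A @ V                      # equals G^T V^T A^T A V
print(np.linalg.matrix_rank(M), n)   # both print 3: W^T A V is nonsingular, as the theorem states
```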
Example: Assume \(\boldsymbol{A}\) is SPD and pick \(\mathbf{L}=\mathbf{K}=\mathrm{span}\left\{ \boldsymbol{r}^{\left( j \right)},\boldsymbol{p}^{\left( j-1 \right)} \right\}\), where \(\boldsymbol{p}^{\left( j-1 \right)}\) is the search direction from the previous step and \(\boldsymbol{r}^{\left( j \right)}\) is the current residual. This is how the Conjugate Gradient Method (CG) is formed.
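Below is a rough NumPy sketch of this projection viewpoint (the function name `cg_via_projection` is ours; it is not the standard two-term CG recurrence, but for SPD \(\boldsymbol{A}\) the subspace choice above reproduces the CG iterates in exact arithmetic):

```python
import numpy as np

def cg_via_projection(A, b, x0, iters=50, tol=1e-12):
    """CG viewed as a projection method: at step j, project onto
    K = L = span{r_j, p_{j-1}} (just span{r_0} at the first step)."""
    x = np.array(x0, dtype=float)
    p_prev = None
    for _ in range(iters):
        r = b - A @ x
        if np.linalg.norm(r) < tol:
            break
        V = np.column_stack([r] if p_prev is None else [r, p_prev])  # basis of K = L
        y = np.linalg.solve(V.T @ A @ V, V.T @ r)                    # Galerkin condition
        delta = V @ y
        x, p_prev = x + delta, delta    # delta is parallel to the CG search direction p_j
    return x

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])
print(cg_via_projection(A, b, np.zeros(3)))
print(np.linalg.solve(A, b))            # should agree after at most 3 projection steps
```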