
How does the GMRES iteration behave for this non-diagonalizable matrix?

Computational Science, asked by sunshine on April 17, 2021

Recently I have been studying the GMRES iteration, probably the most popular iterative method for general large sparse linear systems of equations $Ax=b$. The standard convergence result is obtained under the condition that the matrix $A$ is diagonalizable, but in real applications it is seldom the case that we have such a well-behaved $A$. So, are there other convergence results for the GMRES iteration? In other words, when we choose a preconditioner $M$ for solving $$M^{-1}Ax=M^{-1}b,$$ which aspects should we consider to guarantee better convergence of the preconditioned GMRES? Since $M^{-1}A$ is in general not diagonalizable, is it enough to cluster the spectrum of $M^{-1}A$? Which properties of the preconditioned matrix $M^{-1}A$ should we examine?
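For reference, the diagonalizable-case result I have in mind is the standard bound: if $A = V\Lambda V^{-1}$, then the GMRES residuals satisfy
$$\frac{\|r_k\|_2}{\|r_0\|_2} \le \kappa_2(V)\,\min_{p\in\Pi_k,\ p(0)=1}\ \max_{\lambda\in\sigma(A)} |p(\lambda)|,$$
so a well-conditioned eigenvector basis together with clustered eigenvalues gives fast convergence.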

A strange example is as follows:
$$
A=\left(\begin{array}{cccccc}
1 & c_{1} & & & & \\
 & 1 & c_{2} & & & \\
 & & \ddots & \ddots & & \\
 & & & 1 & c_{n-2} & \\
 & & & & 1 & c_{n-1} \\
 & & & & & 1
\end{array}\right)\in\mathbb{R}^{n\times n},
$$

with $c_i\neq 0$. We set $b=A[1,1,\dots,1]^T$ and solve the system $Ax=b$ using the MATLAB command gmres, and we compare this result with MATLAB's backslash solve A\b. gmres fails (my machine has 8 GB of memory), while A\b succeeds. I just cannot understand why this happens, because we often say that for large sparse matrices direct methods can be beaten by iterative methods. Can you give me some suggestions about GMRES convergence and some explanation of the example $Ax=b$ above? All the eigenvalues of the matrix $A$ equal 1, so theoretically GMRES should converge very fast. Thanks very much.

Below is my simple MATLAB test code:

%   just a simple test
clc;clear;
rng(0)% fix the random number
n=1e+5;
c = rand(n,1);% c_i
A = spdiags([ones(n,1)  c],[0 1],n,n);
b = A*ones(n,1);
%   direct method
xd = A\b;
%   iterative method
x=gmres(A,b,[],[],n);
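
For what it's worth, here is a variant of the gmres call (my sketch, with an assumed tolerance and a reduced maxit so the Krylov basis fits in memory) that keeps the diagnostic outputs instead of discarding them:

%   diagnostic variant: keep flag, relative residual, and residual history
%   (note: un-restarted gmres with maxit = n must store up to n basis
%   vectors of length n, i.e. an n-by-n dense array, about 80 GB of
%   doubles for n = 1e5, which is presumably why it dies on 8 GB)
[x, flag, relres, iter, resvec] = gmres(A, b, [], 1e-8, 100);
fprintf('flag = %d, relres = %.2e\n', flag, relres);
semilogy(resvec); xlabel('iteration'); ylabel('residual norm');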

4 Answers

Unfortunately, convergence of GMRES does not have a clear dependence on the distribution of eigenvalues.

It was proved by Greenbaum, Pták and Strakoš in 1996 that you can construct examples with an arbitrary spectrum and an arbitrary convergence history: that is, give me any $n$ nonzero complex numbers and any nonincreasing sequence of residual norms $\|r_k\|$, and I can construct a matrix $A$ with that spectrum and a vector $b$ so that gmres(A, b) achieves those residual norms at each step. Essentially everything you have heard about the relationship between the spectrum and GMRES convergence can fail spectacularly.
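A classic small illustration of this (my sketch, not the construction from that paper): the cyclic shift permutation has all its eigenvalues on the unit circle, yet GMRES makes no progress at all until the very last step.

%   complete stagnation despite nicely placed eigenvalues (n-th roots of unity)
n = 50;
A = circshift(eye(n), 1);        % cyclic permutation: A*e_k = e_{k+1 (mod n)}
b = [1; zeros(n-1, 1)];          % b = e_1
[x, flag, relres, iter, resvec] = gmres(A, b, [], 1e-12, n);
semilogy(0:numel(resvec)-1, resvec, 'o-');   % flat at 1 until step n
% in exact arithmetic the residual stays at ||b|| for the first n-1 steps,
% since A*x lies in span{e_2, ..., e_{k+1}} for any x in the k-th Krylov
% space (MATLAB may report stagnation, flag = 3, before reaching step n)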

Some more references here http://www.cs.cas.cz/duintjertebbens/Talks/Liblice12.pdf .

Correct answer by Federico Poloni on April 17, 2021

Estimates for GMRES convergence based on the eigenvalue distribution often implicitly assume that the matrix is normal. Sometimes the convergence rate is still provable in an asymptotic sense in the non-normal case, but if the matrix is severely non-normal, the "pre-asymptotic" behavior can make such convergence rates unreachable in practice.

Your matrix is not normal, and its non-normality gets worse as the size grows.
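
One way to see this (my sketch, with assumed constant couplings $c_i = 0.9$): for a normal matrix, $\|(A - zI)^{-1}\|_2 = 1/\mathrm{dist}(z, \sigma(A))$, which here would be 2; for this bidiagonal matrix the resolvent norm blows up with $n$, the signature of severe non-normality studied in the pseudospectra literature.

z = 1.5;                          % distance 0.5 from the only eigenvalue, 1
for n = [10 20 40]
    A = eye(n) + diag(0.9 * ones(n-1, 1), 1);   % assumed c_i = 0.9 for all i
    fprintf('n = %2d: ||inv(A - z*I)|| = %8.2e\n', n, norm(inv(A - z*eye(n))));
end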

Answered by Reid.Atcheson on April 17, 2021

Well, I can only propose a potential cause of why the GMRES method fails for the problem you showed. I don't have enough reputation, so I can't comment.

Since GMRES uses the Arnoldi process to generate a set of orthonormal vectors spanning the Krylov subspace, if the matrix itself is very large, the Q and H matrices it builds can become very large too. So if you are solving a very large problem, you need the restarted variant, GMRES(m). The key point is that the solver discards the orthonormal basis and restarts the entire process to keep the memory requirement bounded. This is probably the reason why your GMRES failed. Check the restart option of the MATLAB command, as sketched below.
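A minimal sketch of that call (the restart length, tolerance, and iteration counts are assumed values for illustration; whether GMRES(m) actually converges well for this particular $A$ is a separate question, the point here is only the bounded memory):

m     = 50;      % restart length: at most m basis vectors of length n are stored
tol   = 1e-8;
maxit = 200;     % outer (restart) cycles, i.e. up to m*maxit total inner steps
[x, flag, relres] = gmres(A, b, m, tol, maxit);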

A\b works straight away since this is a very sparse, upper bidiagonal system, which is simply perfect for back substitution.
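
For concreteness, a sketch of that back substitution (my illustration, reusing A, b, n from the question's script; the diagonal entries are all 1):

u = full(diag(A, 1));       % superdiagonal entries, u(i) = A(i, i+1)
x = zeros(n, 1);
x(n) = b(n);                % last equation: 1 * x(n) = b(n)
for i = n-1:-1:1
    x(i) = b(i) - u(i) * x(i+1);    % row i: x(i) + u(i)*x(i+1) = b(i)
end
norm(x - ones(n, 1))        % should be ~0: the exact solution is all ones

This is O(n) time and memory, which is essentially what the sparse direct solve does here.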

Hope this helps. Cheers!

Answered by Sen on April 17, 2021

First, some comments on why such matrices are hard for solvers. Notice that if $U$ is strictly upper triangular like

[[. 2 . . .]
 [. . 2 . .]
 [. . . 2 .]
 [. . . . 2]
 [. . . . .]]

then $(I - U)^{-1} = I + U + U^2 + \dots + U^{n-1}$ (since $U^n = 0$). For this example, $(I - U)^{-1} =$

[[ 1  2  4  8 16]
 [ .  1  2  4  8]
 [ .  .  1  2  4]
 [ .  .  .  1  2]
 [ .  .  .  .  1]]

has corner entry $2^{n-1}$. Huge entries will cause trouble for any solver: either floating-point overflow or underflow, or extreme ill-conditioning. With $U$ random-Cauchy instead of random-normal (to amplify the problem), the singular values of $I - U$ for $n = 300$ are

[6.7e-18 1.2e-06 2.1e-05 0.0025 0.0026 0.005 0.0063 0.0068 0.0083 0.011 ... 18 22 22 25 30 32 35 37 45 610]

and the pseudo-inverse is not triangular. (This is with Python NumPy and SciPy; I don't have MATLAB.)
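
A tiny sketch of the entry growth (my illustration, written in MATLAB to match the rest of the thread rather than the NumPy used above):

n = 5;
U = diag(2 * ones(n-1, 1), 1);   % strictly upper triangular, so U^n = 0
M = inv(eye(n) - U);             % equals I + U + U^2 + ... + U^(n-1)
disp(M(1, n))                    % corner entry: 2^(n-1) = 16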


Do Krylov methods work at all for matrices which cannot be diagonalized, such as these $I - U$ with all eigenvalues equal to 1? The question "What is the principle behind the convergence of Krylov subspace methods for solving linear systems of equations?" clearly says "IF A is diagonalizable ...". But if not, how can one go about finding a polynomial for which $p(A)b \approx A^{-1}b$? No idea -- over to experts.
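One concrete observation for this particular family (a sketch of the exact-arithmetic picture, not a practical answer): write $A = I + N$ with $N$ strictly upper triangular, so $N^n = 0$. Then
$$A^{-1} = (I+N)^{-1} = \sum_{k=0}^{n-1}(-1)^k N^k = \sum_{k=0}^{n-1}(-1)^k (A-I)^k,$$
a polynomial in $A$ of degree $n-1$, so GMRES terminates in at most $n$ steps in exact arithmetic. But when all the off-diagonal entries are nonzero, the minimal polynomial of $A$ is $(z-1)^n$, of full degree $n$, so no lower-degree polynomial works for a generic $b$; that is why clustering of the eigenvalues alone buys nothing here.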

See also the many papers by Trefethen et al. and the book Spectra and Pseudospectra: The Behavior of Nonnormal Matrices and Operators 2005, 606 pages.

Answered by denis on April 17, 2021
