1 Motivation : Pseudo time iterations

Author

Mélanie Fournier

Let \(A\) be a non singular square matrix of dimension \(n\geq 1\) and let \(b\in\mathbb{R}^n\). We consider the linear system \(Ay = b\), where \(b \in \mathbb{R}^n\). The system has the unique solution \(y^* = A^{-1}b\). As directly inverting the matrix is a terrible idea, a fundamental problem in numerical analysis is to find numerical methods to solve this. This can be done with the use of direct methods or iterative methods. In this thesis, we consider an iterative method. Consider now the initial value problem(IVP),

\[ y'(t) = Ay(t)- b, \; \; y(0) = y_0, \]

where \(\mathbfit{y}_0\in \mathbb{R}^n\) and \(t\in \mathbb{R}\). We adapt the result below from [1, Ch. 9.5].

Multiplying the equation by \(e^{-At}\), where \(e^{-At}\) is the usual matrix exponential, and rearranging the terms yields

\[ e^{-At}y'(t) - Ae^{-At}y(t) = -e^{-At}b. \]

We recognize on the left hand side the derivative of the product \(e^{-At}y(t)\), and thus, by the fundamental theorem of calculus,

\[ \left[ e^{-Au}y(u)\right]_0^t = \int_0^t -e^{-Au}b \; du. \]

Multiplying by \(I = A^{-1}A\) inside the integral above, we get

\[ e^{-At}y(t) - y(0) = A^{-1} \int_0^t -Ae^{-Au}b \; du, \]

which can be integrated to get

\[ e^{-At}y(t) - y_0 = A^{-1}\left[e^{-At} - b\right]. \]

Multiplying each side by \(e^{At}\) on the left, and rearranging the terms we get an expression for \(y(t)\):

\[ y(t) = e^{At}(y_0 - A^{-1}b) + A^{-1}b. \tag{1.1}\]

Here, we also used the fact that \(e^{At}A^{-1} = A^{-1}e^{At}\). This gives an expression for the solution of the IVP. Since each of those steps can be taken backward , the solution we get is unique. We have thus proved:

Theorem 1.1 Let \(A\) be a non singular, square matrix of dimension \(n\geq 1\), \(b\in\mathbb{R}^n\) a vector, and consider the initial value problem

\[ y'(t) = Ay(t) - b, \; y(0) = y_0, \tag{1.2}\]

where \(t \rightarrow y(t)\) is a function from \(\mathbb{R}\) to \(\mathbb{R}^n\). Then the problem has a unique solution in the form of

\[ y(t) = e^{At}(y_0 - y^*) + y^*, \]

where \(y^* = A^{-1}b\), and \(e^{At}\) is defined using the usual matrix exponential.

Let \(\lambda_1 , \lambda_2 , \dots , \lambda_n\) be the (not necessarily distinct) eigenvalues of \(A\). We write \(\lambda_i = a_i + ib_i\), where \(a_i,b_i \in \mathbb{R}\) and are respectively the real part and the imaginary parts of the \(i^{th}\) eigenvalue. Then, the following results holds[2, Ch. 1]:

Theorem 1.2 \(y(t) \to y^*\) as \(t \to +\infty\) for any initial value \(y_0\) if and only if, for all \(i = 1 , \dots , n\), \(a_i <0\), that is, all the eigenvalues of \(A\) have a strictly negative real part.

We call such matrices stable in the rest of this thesis.

Proof. We restrict ourselves to the diagonalizable case. Assume that \(A\in\mathbb{R}^{n\times n}\) is diagonalizable and let \(\lambda_1,\dots,\lambda_n\) be the eigenvalues of \(A\). Then we can write \(A = PD P^{-1}\) where \(D\) is the diagonal matrix with the eigenvalues of \(A\), and \(P\) is the associated eigenvectors matrix:

\[ D = \begin{pmatrix} \lambda_1 & & &\\ &\lambda_2 & & \\ && \ddots & \\ &&& \lambda_n \end{pmatrix}. \]

Then \(e^{At} = \sum_{i=0}^\infty \frac{(PD P^{-1}t)^i}{i!} = \sum_{i=0}^\infty P \frac{(D t)^i}{i!}P^{-1}\). The \(P\) can be moved outside of the sum to get

\[ e^{At} = P e^{D t}P^{-1}. \]

Since the matrix exponential of a diagonal matrix is simply the matrix of the exponentiated elements, we have \[ e^{D t} = \begin{pmatrix} e^{\lambda_1 t} & & &\\ &e^{\lambda_2 t} & & \\ && \ddots & \\ &&& e^{\lambda_n t} \end{pmatrix}. \]

Let \(z(t) = P^{-1}(y(t)-y^*)\), where \(y(t)\) is the unique solution to Equation eq-IVPth for some arbitrary initial value \(y_0\).

Since \(P\) is non singular, we can use a continuity argument to state that \(y(t) \to y^*\) if and only if \(z(t) \to 0\). We have

\[ z(t) = P^{-1} e^{At}(y_0-y^*). \]

We note that \(P^{-1} e^{At} = e^{\Delta t} P^{-1}\), thus

\[ z(t) = e^{D t} P^{-1}(y_0-y^*). \]

Looking at the \(i^{\text{th}}\) element \(z(t)_i\), we have

\[ |z(t)_i| = |e^{\lambda_i t}|\left[ P^{-1}(y_0-y^*)\right]_i. \]

The only time dependent term is \(|e^{\lambda_i t}| = e^{a_it}\), with \(a_i\) being the real part of \(\lambda_i\), and \(z(t)_i \to 0\) as \(t \to +\infty\) if and only if \(a_i<0\).

If this holds for any \(i = 1, \dots , n\), then \(z(t) \to 0\) as \(t\to +\infty\). This proves the sufficient condition.

This is also a necessary condition. Indeed, since \(y_0\) is arbitrary, we can chose it so that \(P^{-1}(y_0-y^*) = (1, \dots , 1)^T\). Then \(z(t) = (e^{\lambda_1 t}, e^{\lambda_2 t}, \dots , e^{\lambda_n t})^T\) which converges to \(0\) only if all the eigenvalues have a strictly negative real part.

We now go back to the original problem of solving the linear system \(Ay = b\). If all the eigenvalues of \(A\) have a strictly negative real part, then, any numerical solver for the initial value problem \(y'(t) = Ay(t) - b\) with \(y(0) = y_0\), where \(t\) is a pseudo-time variable also becomes an iterative solver for the linear system \(Ay = b\), as \(y(t) \to y^*\).

Remark. The eigenvalues of \(A\) are \(\lambda_1, \dots , \lambda_n\). If all these eigenvalues have a strictly positive real part, then the eigenvalues of \(-A\), which are \(-\lambda_1, \dots , -\lambda_n\), have a strictly negative real part. Therefore, \(-A\) is stable and to solve the linear problem \(Ay = b\), we can simply consider the IVP \(y' = (-A)y - (-b) = -Ay+b\) instead, with our best guess of \(y^*\) as the initial value.

[1]

W. A. Adkins, M. G. Davidson, and S. (Online service), Ordinary differential equations. in Undergraduate texts in mathematics,. New York, NY : Springer New York :, 2012. Available: http://dx.doi.org/10.1007/978-1-4614-3618-8

[2]

Bellman Richard Ernest, Stability theory of differential equations /. New York : McGraw-Hill, 1953.