10 Numerics II: Multistep methods

The Euler and trapezoidal rule methods of the previous chapter use one previous approximation \(y_{n}\) to compute the next approximation \(y_{n+1}.\) Multistep methods aim to increase accuracy by using several previous approximations to compute the next approximation. As we will see, some multistep methods achieve this aim. But multistep methods also exhibit a very important phenomenon that we have not seen so far: there are multistep methods whose local error is always small for sufficiently small step-size \(h,\) but whose global error is typically catastrophically large, even for very small step-size. In the last chapter we introduced the concept of the order of a method to capture the behavior of the error locally, whereby order \(p\ge1\) means that the local error converges to zero as \(h\downarrow0.\) A method that is of order at least \(p=1\) is called consistent (this useful terminology was not yet used in the last chapter, though it probably should have been). We also introduced convergence as the formal property of a method expressing that the global error is guaranteed to be well-behaved, tending to zero as \(h\downarrow0.\) Both the Euler and trapezoidal rule methods were proved to be consistent, i.e. locally “good” (Lemma 9.6 and Theorem 9.11), and also convergent, i.e. globally “good” (Theorem 9.8 and Theorem 9.13). But it is unfortunately not true that every “locally good” method is also “globally good”. Indeed, in this chapter we will see some multistep methods that are consistent but not convergent, i.e. “locally good” but “globally (very) bad”.

Luckily, there is a simple condition which does guarantee convergence of a consistent multistep method. It will be introduced in Section 10.4.

\(\text{ }\)Notation for steps. A one-step method uses the previous approximation \(y_{n}\) to produce the next approximation \(y_{n+1},\) and then \(y_{n+1}\) to produce \(y_{n+2},\) and so on, i.e. it has the form \[y_{n+1}=Y(F,t_{n},y_{n}),\quad n\ge0.\] A multistep method with two steps uses the two previous approximations \(y_{n},y_{n+1}\) to produce the next approximation \(y_{n+2},\) and then \(y_{n+1},y_{n+2}\) to produce \(y_{n+3},\) and so on, i.e. it has the form \[y_{n+2}=Y(F,t_{n},y_{n},y_{n+1}),\quad n\ge0.\] A multistep method with \(s\ge1\) steps uses the \(s\) previous approximations \(y_{n},y_{n+1},\ldots,y_{n+s-1}\) to produce the next approximation \(y_{n+s},\) and then the \(s\) approximations \(y_{n+1},\ldots,y_{n+s}\) to produce \(y_{n+s+1},\) and so on, i.e. it has the form \[y_{n+s}=Y(F,t_{n},y_{n},y_{n+1},\ldots,y_{n+s-1}),\quad n\ge0.\]

\(\text{ }\)Starting multistep methods from only \(y_{0}.\) The rule of an \(s\)-step method can only be used if one already has the \(s\) approximations \(y_{0},\ldots,y_{s-1}.\) When numerically solving a standard IVP \[\dot{y}=F(t,y),\quad\quad y(t_{0})=y_{0},\] one only has the one initial \(y_{0}\) to begin with. The solution is simply to use one or more methods with fewer steps to obtain \(y_{1},\ldots,y_{s-1},\) and after that point switch to the chosen \(s\)-step method.

10.1 Adams-Bashforth methods

We first study the Adams-Bashforth multistep methods, which can be thought of as exploiting polynomial interpolation to extract more accuracy from multiple steps.

\(\text{ }\)Starting point. As for the Euler and trapezoidal rule methods, the starting point is the identity \[y(t_{n+s})-y(t_{n+s-1})=\int_{t_{n+s-1}}^{t_{n+s}}F(t,y(t))dt,\tag{10.2}\] which any solution of the ODE \(\dot{y}=F(t,y)\) must satisfy. This is combined with an approximation of the definite integral on the r.h.s. of ([eq:ch9_a_d_step_integral]). The Euler method arises from applying the approximation \(\int_{t_{n}}^{t_{n+1}}f(t)dt\approx hf(t_{n})\) to the r.h.s. of ([eq:ch9_a_d_step_integral]) (in the case \(s=1\)), while the trapezoidal rule method arises from applying \(\int_{t_{n}}^{t_{n+1}}f(t)dt\approx\frac{h}{2}(f(t_{n})+f(t_{n+1})).\) Similarly a two-step method can be obtained from an approximation of \(\int_{t_{n+1}}^{t_{n+2}}f(t)dt\) in terms of \(f(t),t\in\{t_{n},t_{n+1}\},\) a three-step method from an approximation of \(\int_{t_{n+2}}^{t_{n+3}}f(t)dt\) in terms of \(f(t),t\in\{t_{n},t_{n+1},t_{n+2}\},\) and so on for any number \(s\ge1\) of steps. To keep notation light, we present the three-step case before giving the general \(s\)-step formulas.

\(\text{ }\)Polynomial interpolation of \(f.\) The values of a function \(f(t)\) in the interval \([t_{n+2},t_{n+3}]\) can be approximated by constructing an interpolating polynomial \(p(t)\) such that \(p(t)=f(t)\) for \(t\in\left\{ t_{n},t_{n+1},t_{n+2}\right\} .\) The integral can then be approximated as \[\int_{t_{n+2}}^{t_{n+3}}f(t)dt\approx\int_{t_{n+2}}^{t_{n+3}}p(t)dt.\tag{10.3}\] To construct the interpolation for three points we use the polynomials \[q_{0}(t)=(t-t_{n+1})(t-t_{n+2}),\,\,\,q_{1}(t)=(t-t_{n})(t-t_{n+2}),\,\,\,q_{2}(t)=(t-t_{n})(t-t_{n+1})\] and \[p_{0}(t)=\frac{q_{0}(t)}{q_{0}(t_{n})},\quad p_{1}(t)=\frac{q_{1}(t)}{q_{1}(t_{n+1})},\quad p_{2}(t)=\frac{q_{2}(t)}{q_{2}(t_{n+2})},\] for which \[p_{m}(t_{n+m'})=\begin{cases} 1 & \text{ if }m'=m,\\ 0 & \text{ if }m'\ne m, \end{cases}\quad\quad\text{for}\quad\quad m,m'\in\left\{ 0,1,2\right\} .\] Then the polynomial \[p(t)=p_{0}(t)f(t_{n})+p_{1}(t)f(t_{n+1})+p_{2}(t)f(t_{n+2})\] satisfies \(p(t)=f(t)\) for \(t\in\left\{ t_{n},t_{n+1},t_{n+2}\right\} ,\) so it interpolates the three data-points \((t,f(t)),t\in\{t_{n},t_{n+1},t_{n+2}\},\) and \(p(t)\) is an approximation of \(f(t)\) for other \(t.\) It is in fact the Lagrange interpolating polynomial for these data-points, and the \(p_{i}(t)\) are the Lagrange basis polynomials for the three “nodes” \(t_{n},t_{n+1},t_{n+2}.\)

\(\text{ }\)Deriving the Adams-Bashforth rule. For this \(p(t)\) the integral on the r.h.s. of ([eq:ch9_multistep_ab_three_steps_int_approx]) equals \[\int_{t_{n+2}}^{t_{n+3}}p(t)dt=\tilde{b}_{0}f(t_{n})+\tilde{b}_{1}f(t_{n+1})+\tilde{b}_{2}f(t_{n+2}),\] for \[\tilde{b}_{m}=\int_{t_{n+2}}^{t_{n+3}}p_{m}(t)dt,\quad\quad m=0,1,2.\] The change of variables \(t=t_{n}+r\) maps \(\{t_{n},t_{n+1},t_{n+2}\}\) to \(\{0,h,2h\}\) and yields the identities \[\tilde{b}_{0}=\int_{2h}^{3h}\frac{(r-h)(r-2h)}{(0-h)(0-2h)}dr,\quad\tilde{b}_{1}=\int_{2h}^{3h}\frac{(r-0)(r-2h)}{(h-0)(h-2h)}dr,\quad\tilde{b}_{2}=\int_{2h}^{3h}\frac{(r-0)(r-h)}{(2h-0)(2h-h)}dr.\] These show that the \(\tilde{b}_{m}\) don’t depend on \(n.\) The dependence on \(h\) can also be “factored out” by applying another change of variables \(r=hu,\) yielding \(\tilde{b}_{m}=hb_{m}\) for \[b_{0}=\int_{2}^{3}\frac{(u-1)(u-2)}{(0-1)(0-2)}du,\quad b_{1}=\int_{2}^{3}\frac{(u-0)(u-2)}{(1-0)(1-2)}du,\quad b_{2}=\int_{2}^{3}\frac{(u-0)(u-1)}{(2-0)(2-1)}du,\] which do not depend on \(h.\) Explicitly \[b_{0}=\frac{1}{(-1)(-2)}\int_{2}^{3}(u-1)(u-2)du=\frac{1}{2}\int_{2}^{3}\left(u^{2}-3u+2\right)du=\frac{1}{2}\left(\frac{u^{3}}{3}-\frac{3u^{2}}{2}+2u\right)\Big|_{2}^{3}=\frac{5}{12}.\] Similarly, one can verify that \(b_{1}=-\frac{4}{3}\) and \(b_{2}=\frac{23}{12}.\) Thus we have arrived at the approximation \[\int_{t_{n+2}}^{t_{n+3}}f(t)dt\approx h\left(b_{2}f(t_{n+2})+b_{1}f(t_{n+1})+b_{0}f(t_{n})\right)=h\left(\frac{23}{12}f(t_{n+2})-\frac{4}{3}f(t_{n+1})+\frac{5}{12}f(t_{n})\right).\] Applying it to the r.h.s. of ([eq:ch9_a_d_step_integral]) we arrive at the rule \[y_{n+3}=y_{n+2}+h\left(\frac{23}{12}F(t_{n+2},y_{n+2})-\frac{4}{3}F(t_{n+1},y_{n+1})+\frac{5}{12}F(t_{n},y_{n})\right).\] This is the three-step \((s=3)\) Adams-Bashforth method.

For a general number \(s\ge1\) of steps one uses the Lagrange basis polynomials for the \(s\) nodes \(t_{n},\ldots,t_{n+s-1},\) \[p_{m}(t)=\frac{q_{m}(t)}{q_{m}(t_{n+m})},\quad\text{for}\quad q_{m}(t)=\prod_{m'\in\{0,\ldots,s-1\}\setminus\{m\}}(t-t_{n+m'}),\quad m=0,\ldots,s-1,\] yielding coefficients \[b_{m}=\frac{\int_{s-1}^{s}\prod_{m'\in\{0,\ldots,s-1\}\setminus\{m\}}\left(u-m'\right)du}{\prod_{m'\in\{0,\ldots,s-1\}\setminus\{m\}}\left(m-m'\right)},\quad m=0,\ldots,s-1,\tag{10.4}\] and the \(s\)-step Adams-Bashforth rule \[y_{n+s}=y_{n+s-1}+h\sum_{m=0}^{s-1}b_{m}F(t_{n+m},y_{n+m}).\] For \(s=1\) the only coefficient is \(b_{0}=1,\) so one recovers the Euler method. For \(s=2\) the coefficients are \(b_{0}=-\frac{1}{2}\) and \(b_{1}=\frac{3}{2},\) so the rule is \[y_{n+2}=y_{n+1}+h\left(\frac{3}{2}F(t_{n+1},y_{n+1})-\frac{1}{2}F(t_{n},y_{n})\right).\]
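
The coefficients ([eq:ch9_a_d_s_step_b_m_def_as_int]) only need to be computed once for each \(s.\) The following is a minimal sketch of such a computation, assuming the sympy library and the standard-library math module are available; the function name is ours, not part of the text.

```python
# Sketch: evaluate the Adams-Bashforth coefficients b_m from the integral
# formula above (symbolic integration via sympy, assumed available).
import math
import sympy as sp

def adams_bashforth_coefficients(s):
    u = sp.symbols('u')
    coefficients = []
    for m in range(s):
        others = [mp for mp in range(s) if mp != m]
        numerator = sp.integrate(sp.Mul(*[u - mp for mp in others]), (u, s - 1, s))
        denominator = math.prod(m - mp for mp in others)
        coefficients.append(sp.nsimplify(numerator / denominator))
    return coefficients

print(adams_bashforth_coefficients(1))  # [1]
print(adams_bashforth_coefficients(2))  # [-1/2, 3/2]
print(adams_bashforth_coefficients(3))  # [5/12, -4/3, 23/12]
```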

\(\text{ }\)Local error/order of Adams-Bashforth rule. The local error of the method is the error in the approximation of \(y(t_{n+s})\) if one uses the exact values for the previous steps \(y(t_{n}),\ldots,y(t_{n+s-1}).\) That approximation is \[y(t_{n+s-1})+\int_{t_{n+s-1}}^{t_{n+s}}p(t)dt\] for \[p(t)=p_{0}(t)f(t_{n})+\ldots+p_{s-1}(t)f(t_{n+s-1}),\quad\text{where}\quad f(t)=F(t,y(t)).\] Thus the error is \[\int_{t_{n+s-1}}^{t_{n+s}}p(t)dt-\int_{t_{n+s-1}}^{t_{n+s}}f(t)dt.\tag{10.5}\] The Lagrange remainder formula bounds the error in the approximation \(f(t)\approx p(t).\)

Lemma 10.1

Let \(I\) be a closed interval, and \(f:I\to\mathbb{R}\) an \(s\)-times continuously differentiable function. Let \(r_{1},\ldots,r_{s}\in I\) be distinct, and let \(p(t)\) be the unique degree \(s-1\) polynomial such that \(p(r_{i})=f(r_{i})\) for all \(i.\) Then for every \(t\in I,\) there is a \(\xi\in I\) such that \[f(t)-p(t)=\frac{f^{(s)}(\xi)}{s!}\prod_{i=1}^{s}(t-r_{i}).\tag{10.6}\]

With \(t\in[t_{n+s-1},t_{n+s}]\) and \(r_{1},\ldots,r_{s}=t_{n},\ldots,t_{n+s-1}\) the r.h.s. of ([eq:ch10_lagrange_remainder_formula]) is \(O(h^{s}),\) showing that \(f(t)=p(t)+O(h^{s})\) for \(t\in[t_{n+s-1},t_{n+s}].\) Using the latter in ([eq:ch10_a_d_loc_err]) yields \[\left|\int_{t_{n+s-1}}^{t_{n+s}}p(t)dt-\int_{t_{n+s-1}}^{t_{n+s}}f(t)dt\right|\le\int_{t_{n+s-1}}^{t_{n+s}}\left|p(t)-f(t)\right|dt=O\left(h^{s}\int_{t_{n+s-1}}^{t_{n+s}}1dt\right)=O(h^{s+1}).\] Thus ([eq:ch10_a_d_loc_err]) is \(O(h^{s+1}),\) which means that the \(s\)-step Adams-Bashforth method is of order \(s\)! We have proved the following lemma.

Lemma 10.2

The Adams-Bashforth method with \(s=1,2,\ldots\) steps has order \(p=s.\)

Note that the Adams-Bashforth methods are all explicit. Thus the two-step Adams-Bashforth matches the order of the trapezoidal rule method, without being implicit, and the three-step Adams-Bashforth is of higher order than both.
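
As an illustration, here is a minimal sketch (assuming numpy and a scalar IVP) of the two-step Adams-Bashforth method; a single Euler step supplies \(y_{1},\) as discussed at the start of the chapter. The function name and the test problem \(\dot{y}=y,\) \(y(0)=1\) are ours, chosen only for illustration.

```python
# Sketch of the two-step Adams-Bashforth method with an Euler step for startup.
import numpy as np

def adams_bashforth_2(F, t0, y0, h, n_steps):
    t = t0 + h * np.arange(n_steps + 1)
    y = np.empty(n_steps + 1)
    y[0] = y0
    y[1] = y[0] + h * F(t[0], y[0])            # one Euler step supplies y_1
    for n in range(n_steps - 1):               # AB2 rule for the remaining steps
        y[n + 2] = y[n + 1] + h * (1.5 * F(t[n + 1], y[n + 1])
                                   - 0.5 * F(t[n], y[n]))
    return t, y

# Test problem y' = y, y(0) = 1 on [0, 1]; exact solution y(t) = e^t.
for h in (0.1, 0.05, 0.025):
    t, y = adams_bashforth_2(lambda t, y: y, 0.0, 1.0, h, int(round(1.0 / h)))
    print(h, abs(y[-1] - np.exp(1.0)))         # error shrinks roughly like h^2
```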

10.2 General linear multistep methods

In this section we introduce a general class of multistep methods, of which Adams-Bashforth is a special case. We also present a simple criterion for checking whether a specific method in the class is consistent (i.e. of order \(p\) for some \(p\ge1\)).

\(\text{ }\)General class of linear multistep methods. The formula \[\sum_{m=0}^{s}a_{m}y_{n+m}=h\sum_{m=0}^{s}b_{m}F(t_{n+m},y_{n+m})\tag{10.8}\] for arbitrary coefficients \(a_{m},b_{m}\) defines a general class of multistep methods. Adams-Bashforth corresponds to \(a_{s}=1,a_{s-1}=-1\) and \(a_{i}=0\) for all other \(i,\) and \(b_{s}=0\) with \(b_{0},\ldots,b_{s-1}\) given by ([eq:ch9_a_d_s_step_b_m_def_as_int]). If \(b_{s}\ne0\) then a multiple of \(hF(t_{n+s},y_{n+s})\) appears on the r.h.s., which makes the method implicit, like the trapezoidal rule method. The rule ([eq:ch9_general_multistep]) is then understood as an implicit equation for \(y_{n+s}\) for given \(y_{n},\ldots,y_{n+s-1}.\) As already mentioned the Adams-Bashforth methods are explicit, and indeed they have \(b_{s}=0.\)

One can “encode” the coefficients \(a_{m},b_{m}\) in the generating polynomials \[\rho(w)=\sum_{m=0}^{s}a_{m}w^{m}\quad\quad\text{and}\quad\quad\sigma(w)=\sum_{m=0}^{s}b_{m}w^{m},\tag{10.9}\] so that every method of the type ([eq:ch9_general_multistep]) can be specified either by the formula ([eq:ch9_general_multistep]) or by writing down \(\rho(w),\sigma(w).\) The \(s\)-step Adams-Bashforth method corresponds to \[\rho(w)=w^{s}-w^{s-1}\quad\quad\text{and}\quad\quad\sigma(w)=\begin{cases} 1 & \text{ for }s=1,\\ \frac{3}{2}w-\frac{1}{2} & \text{ for }s=2,\\ \frac{23}{12}w^{2}-\frac{4}{3}w+\frac{5}{12} & \text{ for }s=3,\\ \ldots & \ldots \end{cases}\tag{10.10}\] (Some authors call them characteristic polynomials, but we avoid this term to not cause confusion with the other kinds of characteristic polynomials already introduced in this course). The generating polynomials ([eq:ch10_general_multistep_rho_sigma]) are also extremely useful tools to study the properties of a linear multistep method. Our first such result is the following theorem, which gives a criterion for determining the order of a general multistep method ([eq:ch9_general_multistep]) from the generating polynomials.

Theorem 10.3

The multistep method ([eq:ch9_general_multistep]) is of order exactly \(p\ge1\) iff \[\begin{array}{c} \text{there exist constants }c\ne0,r>0\text{ such that}\\ \rho(w)-\sigma(w)\ln w=c(w-1)^{p+1}+O(|w-1|^{p+2})\quad\text{for }\quad|w-1|\le r. \end{array}\tag{10.11}\]

Remark 10.4

By exactly of order \(p\) we mean that \(p\) is the highest \(p'\) such that the method is of order \(p'\) - or equivalently that it is of order \(p\) but not of order \(p+1.\)

For the proof we use some intermediate lemmas. Let \[A_{k}=\sum_{m=0}^{s}a_{m}m^{k},\quad\quad B_{k}=k\sum_{m=0}^{s}b_{m}m^{k-1}.\tag{10.12}\]

Lemma 10.5

It holds for all \(z\in\mathbb{R}\) that \[\rho(e^{z})-\sigma(e^{z})z=\sum_{k=0}^{\infty}\frac{z^{k}}{k!}(A_{k}-B_{k}).\tag{10.13}\]

Proof. Since the power series of \(z\to e^{zm}\) has infinite radius of convergence for each \(m\) we obtain \[\rho(e^{z})=\sum_{m=0}^{s}a_{m}\sum_{k=0}^{\infty}\frac{m^{k}z^{k}}{k!}=\sum_{k=0}^{\infty}\frac{z^{k}}{k!}A_{k},\] and \[\sigma(e^{z})z=z\sum_{m=0}^{s}b_{m}\sum_{k=0}^{\infty}\frac{m^{k}z^{k}}{k!}=\sum_{k=0}^{\infty}\frac{z^{k}}{k!}B_{k}.\] This proves the claim.

Corollary 10.6

The condition ([eq:ch10_multistep_order_cond]) holds iff \[A_{k}=B_{k}\text{ for }k=0,\ldots,p\text{ and }A_{p+1}\ne B_{p+1}.\tag{10.14}\]

Proof. The change of variables \(w=e^{z}\) and the fact that \(e^{z}-1=z+O(z^{2})\) show that ([eq:ch10_multistep_order_cond]) is equivalent to \[\begin{array}{c} \text{the existence of constants }c'\ne0,r'>0\text{ such that}\\ \rho(e^{z})-\sigma(e^{z})z=c'z^{p+1}+O(z^{p+2})\quad\quad\text{for}\quad\quad\left|z\right|\le r'. \end{array}\tag{10.15}\] The equivalence of ([eq:ch10_multistep_order_cond_Ak_Bk]) and ([eq:ch10_multistep_order_cond_ez]) then follows from the identity ([eq:ch10_multistep_order_method_rho_sigma_ident]).

Lemma 10.7

For any analytic \(F(t,y),\) analytic solution \(y(t)\) of \(\dot{y}=F(t,y),\) and \(t_{0}\) there is a constant \(r>0\) such that if \(0<h\le r\) then \[\sum_{m=0}^{s}a_{m}y(t_{m})-h\sum_{m=0}^{s}b_{m}F(t_{m},y(t_{m}))=\sum_{k=0}^{\infty}\frac{h^{k}}{k!}y^{(k)}(t_{0})\left(A_{k}-B_{k}\right).\tag{10.16}\]

Proof. Since \(y\) is analytic its power series has a positive radius of convergence \(r'''>0\) at \(t_{0},\) and if \(h\) is such that \(t_{s}-t_{0}<r''',\) then it holds that \(y(t)\) can be expanded as \[y(t_{m})=\sum_{k=0}^{\infty}\frac{(t_{m}-t_{0})^{k}}{k!}y^{(k)}(t_{0})=\sum_{k=0}^{\infty}\frac{m^{k}h^{k}}{k!}y^{(k)}(t_{0}),\tag{10.17}\] and \(y'(t_{m})\) can be expanded as \[y'(t_{m})=\sum_{k=0}^{\infty}\frac{(t_{m}-t_{0})^{k}}{k!}y^{(k+1)}(t_{0})=\sum_{k=1}^{\infty}\frac{m^{k-1}h^{k-1}}{(k-1)!}y^{(k)}(t_{0}).\tag{10.18}\] Using ([eq:ch10_multistep_order_cond_y_taylor_exp]) gives \[\sum_{m=0}^{s}a_{m}y(t_{m})=\sum_{m=0}^{s}a_{m}\sum_{k=0}^{\infty}\frac{m^{k}h^{k}}{k!}y^{(k)}(t_{0})=\sum_{k=0}^{\infty}\frac{h^{k}}{k!}y^{(k)}(t_{0})A_{k},\] and using ([eq:ch10_multistep_order_cond_y_prime_taylor_exp]) gives \[h\sum_{m=0}^{s}b_{m}y'(t_{m})=h\sum_{m=0}^{s}b_{m}\sum_{k=0}^{\infty}\frac{m^{k}h^{k}}{k!}y^{(k+1)}(t_{0})=\sum_{k=0}^{\infty}\frac{h^{k}}{k!}y^{(k)}(t_{0})B_{k}.\] Since \(F(t_{m},y(t_{m}))=y'(t_{m}),\) subtracting these two identities yields \[\sum_{m=0}^{s}a_{m}y(t_{m})-h\sum_{m=0}^{s}b_{m}F(t_{m},y(t_{m}))=\sum_{k=0}^{\infty}\frac{h^{k}}{k!}y^{(k)}(t_{0})\left(A_{k}-B_{k}\right).\]

Proof of Theorem 10.3. We prove that a multistep method ([eq:ch9_general_multistep]) being of order \(p\) but not of order \(p+1\) is equivalent to ([eq:ch10_multistep_order_cond_Ak_Bk]). The claim then follows from the intermediate results above.

\(\implies\)Assume that a multistep method ([eq:ch9_general_multistep]) is of order \(p\) but not of order \(p+1.\) Since it is of order \(p\) it must hold for all analytic \(F\) with analytic solution \(y\) to \(y'(t)=F(t,y)\) and \(t_{0}\) that \[\sum_{m=0}^{s}a_{m}y(t_{m})-h\sum_{m=0}^{s}b_{m}F(t_{m},y(t_{m}))=O(h^{p+1}).\] By ([eq:ch10_A_k_B_k_identity_y]) it therefore holds for all such \(y\) that \[\sum_{k=0}^{\infty}\frac{h^{k}}{k!}y^{(k)}(t_{0})\left(A_{k}-B_{k}\right)=O(h^{p+1}).\tag{10.19}\] Assume \(A_{k}\ne B_{k}\) for some \(k\in\{0,1,\ldots,p\}.\) Consider the particular case \(y'(t)=y(t),t\ge0,y(t_{0})=1\) (i.e. \(F(t,y)=y\)), which has the unique exact solution \(y(t)=e^{t-t_{0}}.\) Then \(y^{(k)}(t_{0})=1\) for all \(k,\) so for the smallest \(k\le p\) with \(A_{k}\ne B_{k}\) the l.h.s. of ([eq:ch10_multistep_order_proof_AkBk]) behaves like \(\frac{A_{k}-B_{k}}{k!}h^{k}\) as \(h\downarrow0.\) But then the l.h.s. can not be \(O(h^{p+1}).\) Thus if the method ([eq:ch9_general_multistep]) is of order \(p,\) then \(A_{k}=B_{k}\) for \(k\in\{0,\ldots,p\}.\)

Assume now that the method is not of order \(p+1.\) This means that there exist \(F,y,t_{0},y_{0}\) (with \(F,y\) analytic etc.) and constants \(c'\ne0,r>0\) such that \[\left|\sum_{k=0}^{\infty}\frac{h^{k}}{k!}y^{(k)}(t_{0})\left(A_{k}-B_{k}\right)\right|\ge c'h^{p+2}\quad\quad\text{ for }\quad\quad0<h\le r.\tag{10.20}\] Assume for contradiction that \(A_{k}=B_{k}\) for \(k\le p+1.\) Note that then \[\begin{array}{l} \left|\sum_{k=0}^{\infty}\frac{h^{k}}{k!}y^{(k)}(t_{0})\left(A_{k}-B_{k}\right)\right|\\ \le\max_{m=0}^{s}\left|a_{m}\right|\sum_{k=p+2}^{\infty}\left|\frac{h^{k}}{k!}m^{k}y^{(k)}(t_{0})\right|+\max_{m=0}^{s}\left|b_{m}\right|h\sum_{k=p+2}^{\infty}\left|\frac{h^{k-1}}{(k-1)!}m^{k-1}y^{(k)}(t_{0})\right|. \end{array}\] Since \(y\) is analytic, the two infinite sums on the r.h.s. are absolutely convergent. But this implies that they are \(O(h^{p+2}),\) which contradicts ([eq:order_thm_for_contradiction]). Thus if a method is of order \(p\) but not of order \(p+1,\) then \(A_{p+1}\ne B_{p+1}.\)

This implies that if a multistep method is of order \(p\) but not of order \(p+1,\) then ([eq:ch10_multistep_order_cond_Ak_Bk]) holds.

\(\impliedby\) Assume now that ([eq:ch10_multistep_order_cond_Ak_Bk]) holds. Similarly to above, for any \(F,y,t_{0},y_{0}\) (with \(F,y\) analytic etc.) it holds that \[\begin{array}{l} \left|\sum_{m=0}^{s}a_{m}y(t_{m})-h\sum_{m=0}^{s}b_{m}F(t_{m},y(t_{m}))\right|\\ \le\max_{m=0}^{s}\left|a_{m}\right|\sum_{k=p+1}^{\infty}\left|\frac{h^{k}}{k!}m^{k}y^{(k)}(t_{0})\right|+\max_{m=0}^{s}\left|b_{m}\right|h\sum_{k=p+1}^{\infty}\left|\frac{h^{k-1}}{(k-1)!}m^{k-1}y^{(k)}(t_{0})\right|\\ =O(h^{p+1}), \end{array}\] showing that the method is of order (at least) \(p.\)

Also since \(A_{k}=B_{k}\) for \(k=0,\ldots,p\) but \(A_{p+1}\ne B_{p+1}\) it holds for the particular \(y'(t)=y(t),t\ge0,y(t_{0})=1\) (i.e. \(F(t,y)=y\)) with exact solution \(y(t)=e^{t-t_{0}},\) that \[\begin{array}{l} \left|\sum_{m=0}^{s}a_{m}y(t_{m})-h\sum_{m=0}^{s}b_{m}F(t_{m},y(t_{m}))\right|\\ \ge|A_{p+1}-B_{p+1}|\frac{h^{p+1}}{(p+1)!}\left|y^{(p+1)}(t_{0})\right|\\ -\max_{m=0}^{s}\left|a_{m}\right|\sum_{k=p+2}^{\infty}\left|\frac{h^{k}}{k!}m^{k}y^{(k)}(t_{0})\right|-\max_{m=0}^{s}\left|b_{m}\right|h\sum_{k=p+2}^{\infty}\left|\frac{h^{k-1}}{(k-1)!}m^{k-1}y^{(k)}(t_{0})\right|\\ \ge ch^{p+1} \end{array}\] for a constant \(c>0\) and all small enough \(h.\) This implies that the method is not of order \(p+1.\)

We have proved that ([eq:ch10_multistep_order_cond_Ak_Bk]) implies that the multistep method is of order \(p\) and not order \(p+1.\)

To check if the criterion ([eq:ch10_multistep_order_cond_ez]) holds in practice one expands \(\rho(w)-\sigma(w)\ln w\) as a power series around \(w=1\) and determines its first non-vanishing term. It is convenient to use the change of variables \(w=1+\xi\) and expand \(\rho(1+\xi)-\sigma(1+\xi)\ln(1+\xi)\) in \(\xi\) around \(\xi=0.\) Since \(\xi\to\ln(1+\xi)\) has a power series with radius of convergence \(1\) and \(\rho,\sigma\) are polynomials, the function \(\xi\to\rho(1+\xi)-\sigma(1+\xi)\ln(1+\xi)\) also has a power series with radius of convergence at least \(1,\) i.e. there are \(c_{0},c_{1},\ldots\) such that \[\rho(1+\xi)-\sigma(1+\xi)\ln(1+\xi)=\sum_{k\ge0}c_{k}\xi^{k}\quad\text{for}\quad\left|\xi\right|<1.\tag{10.21}\] Note that ([eq:ch10_multistep_order_cond_ez]) holds iff \(c_{0}=\ldots=c_{p}=0,c_{p+1}\ne0.\) In other words if the lowest order non-vanishing term in ([eq:order_cond_taylor_in_xi]) is a multiple of \(\xi^{p+1},\) then the method has order exactly \(p.\) Note that one only needs to compute the coefficients \(c_{k}\) up to the first non-vanishing one; the complete power series is not needed. For the computation one uses the well-known power series for \(\ln(1+\xi)\) around \(\xi=0\): \[\ln(1+\xi)=\xi-\frac{1}{2}\xi^{2}+\frac{1}{3}\xi^{3}-\frac{1}{4}\xi^{4}+\ldots.\]
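
This procedure is easy to mechanize. The following sketch (assuming sympy) expands \(\rho(1+\xi)-\sigma(1+\xi)\ln(1+\xi)\) and reports the order as the index of the first non-vanishing coefficient minus one; the function name is ours, not part of the text.

```python
# Sketch: determine the order of a linear multistep method from its generating
# polynomials by expanding rho(1+xi) - sigma(1+xi)*log(1+xi) around xi = 0.
import sympy as sp

def multistep_order(rho, sigma, w, max_order=10):
    xi = sp.symbols('xi')
    expr = rho.subs(w, 1 + xi) - sigma.subs(w, 1 + xi) * sp.log(1 + xi)
    poly = sp.series(expr, xi, 0, max_order + 2).removeO()
    for k in range(max_order + 2):
        if sp.simplify(poly.coeff(xi, k)) != 0:
            return k - 1          # first non-zero term is a multiple of xi^(p+1)
    return None

w = sp.symbols('w')
print(multistep_order(w - 1, sp.Integer(1), w))                                # Euler: 1
print(multistep_order(w**2 - w, sp.Rational(3, 2) * w - sp.Rational(1, 2), w)) # AB2: 2
```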

Example 10.8 • Order of Euler’s method Euler’s method corresponds to \(s=1\) and \(\rho(w)=w-1\) and \(\sigma(w)=1.\) Written in terms of \(\xi\) these are \(\rho(1+\xi)=\xi\) and \(\sigma(1+\xi)=1.\) Thus \[\rho(1+\xi)-\sigma(1+\xi)\ln(1+\xi)=\xi-\ln(1+\xi)=\frac{1}{2}\xi^{2}-\frac{1}{3}\xi^{3}+\ldots.\] Hence Theorem 10.3 implies that the Euler method is of order \(1\) (as we already know).
Example 10.9 • Order of two-step Adams-Bashforth The two-step Adams-Bashforth method corresponds to \(s=2\) and \(\rho(w)=w^{2}-w\) and \(\sigma(w)=\frac{3}{2}w-\frac{1}{2}.\) In terms of \(\xi\) this gives \[\rho(1+\xi)=(1+\xi)^{2}-1-\xi=\xi+\xi^{2}\quad\quad\text{and}\quad\quad\sigma(1+\xi)=\frac{3}{2}(1+\xi)-\frac{1}{2}=1+\frac{3}{2}\xi.\] Thus \[\begin{array}{ll} \rho(1+\xi)-\sigma(1+\xi)\ln(1+\xi) & =\xi+\xi^{2}-\left(1+\frac{3}{2}\xi\right)\left(\xi-\frac{\xi^{2}}{2}+\frac{\xi^{3}}{3}-\ldots\right)\\ & =\xi+\xi^{2}-\xi+\frac{\xi^{2}}{2}-\frac{\xi^{3}}{3}-\frac{3}{2}\xi^{2}+\frac{3}{4}\xi^{3}+\ldots\\ & =\frac{5}{12}\xi^{3}+\ldots. \end{array}\] Therefore Theorem 10.3 implies that the method is of order \(p=2\) (as we already know).
Example 10.10 • Order of theta method

In the last chapter we briefly mentioned the family of theta methods. Recall from ([eq:ch9_theta_method_def]) that the method corresponding to a fixed \(\theta\in[0,1]\) is \[y_{n+1}=y_{n}+h(\theta F(t_{n},y_{n})+(1-\theta)F(t_{n+1},y_{n+1})).\] For this method \(\rho(w)=w-1\) and \(\sigma(w)=(1-\theta)w+\theta,\) i.e. \(\rho(1+\xi)=\xi\) and \(\sigma(1+\xi)=1+(1-\theta)\xi.\) Therefore \[\begin{array}{lcl} \rho(1+\xi)-\sigma(1+\xi)\ln(1+\xi) & = & \xi-(1+(1-\theta)\xi)\ln(1+\xi)\\ & = & \xi-(1+(1-\theta)\xi)(\xi-\frac{1}{2}\xi^{2}+\frac{1}{3}\xi^{3}-\ldots)\\ & = & \xi-\xi+\frac{1}{2}\xi^{2}-\frac{1}{3}\xi^{3}-(1-\theta)\xi^{2}+\frac{1-\theta}{2}\xi^{3}+\ldots\\ & = & \left(\theta-\frac{1}{2}\right)\xi^{2}+\left(\frac{1}{6}-\frac{\theta}{2}\right)\xi^{3}+\ldots. \end{array}\] Thus the theta method has order \(p=2\) for \(\theta=\frac{1}{2},\) and otherwise order \(p=1.\)

We have already determined these orders for \(\theta=1,\) a.k.a. Euler’s method, and \(\theta=\frac{1}{2},\) a.k.a. the trapezoidal rule method. For \(\theta\ne\frac{1}{2},1\) these are new insights. In particular this proves that the backward Euler method, for which \(\theta=0,\) \[y_{n+1}=y_{n}+hF(t_{n+1},y_{n+1}),\] is of order \(p=1.\)

10.3 Interlude - difference equations

When we study global convergence of multistep methods in the next section we will investigate the behavior of difference equations. These are discrete analogues of differential equations. For instance \[y_{n+1}=y_{n}+ay_{n}+b\] is the difference equation which has some similarity with the ODE \(\dot{y}=ay+b.\) Since difference equations are interesting and useful in their own right, we dedicate this short section to them.
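
As a tiny illustration (the values of \(a,b,y_{0}\) below are ours, chosen arbitrarily), the following sketch iterates this first-order difference equation and compares it with the closed-form solution \(y_{n}=(1+a)^{n}y_{0}+b\frac{(1+a)^{n}-1}{a},\) the discrete analogue of the solution formula for \(\dot{y}=ay+b\):

```python
# Sketch: iterate y_{n+1} = y_n + a*y_n + b and compare with its closed form.
a, b, y0 = 0.5, 1.0, 2.0
y = y0
for n in range(1, 11):
    y = y + a * y + b                                      # the difference equation
    closed_form = (1 + a) ** n * y0 + b * ((1 + a) ** n - 1) / a
    print(n, y, closed_form)                               # the two values agree
```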

\(\text{ }\)General form. The general, possibly implicit, \(k\)-th order difference equation is \[G(n+k,y_{n},y_{n+1},\ldots,y_{n+k-1},y_{n+k})=0,\quad n\ge0,\tag{10.24}\] for a general function \(G(t,x_{0},\ldots,x_{k})\) (cf. the general possibly implicit \(k\)-th order ODE ([eq:ch3_implicit_eq])), and the general \(k\)-th order explicit difference equation is \[y_{n+k}=F(n+k,y_{n},y_{n+1},\ldots,y_{n+k-1}),\quad n\ge0,\tag{10.25}\] for a function \(F(t,x_{0},x_{1},\ldots,x_{k-1})\) (cf. the general \(k\)-th order explicit ODE ([eq:ch3_explicit_eq])). Note that the update rule of an \(s\)-step multistep method from the previous sections is a difference equation of order \(s,\) which is why the latter pop up in the next section about global convergence.

\(\text{ }\)Solutions. Any sequence \(y_{n},n\ge0,\) satisfying ([eq:ch10_difference_equations_implicit]) or ([eq:ch10_difference_equations_explicit]) is called a solution of the difference equation. Any initial values \(y_{0},\ldots,y_{k-1}\) in \(\mathbb{R}\) or \(\mathbb{C}\) together with ([eq:ch10_difference_equations_implicit]) or ([eq:ch10_difference_equations_explicit]) uniquely define the entire solution \(y_{n},n\ge0\) (note that existence and uniqueness of the solution holds trivially here, unlike for the corresponding ODEs). The equations ([eq:ch10_difference_equations_implicit]) resp. ([eq:ch10_difference_equations_explicit]) are autonomous if the functions \(G,F\) do not depend on \(t.\) We don’t have much to say about difference equations at this level of generality. But for linear autonomous difference equations we can construct a complete theory that essentially mirrors the theory for linear autonomous ODEs.

\(\text{ }\)\(k\)-th order linear autonomous difference equation. The general \(k\)-th order such equation is \[a_{k}y_{n+k}+a_{k-1}y_{n+k-1}+\ldots+a_{1}y_{n+1}+a_{0}y_{n}=b,n\ge0,\tag{10.26}\] for constant coefficients \(a_{0},\ldots,a_{k-1},a_{k},b\) in \(\mathbb{R}\) or \(\mathbb{C}\) with \(a_{k}\ne0\) (cf. the general \(k\)-th order linear autonomous ODE ([eq:ch5_ho_lin_aut_ODE_homo_def])). A general \(k\)-th order homogeneous linear autonomous difference equation is \[a_{k}y_{n+k}+a_{k-1}y_{n+k-1}+\ldots+a_{1}y_{n+1}+a_{0}y_{n}=0,n\ge0,\tag{10.27}\] (cf. the homogeneous ODE ([eq:ch5_ho_lin_aut_ODE_homo_def])). Linear combinations of solutions of ([eq:ch10_difference_equations_linear_aut_homo]) are again solutions (i.e. if \(y_{n}^{1},y_{n}^{2}\) are solutions and \(\alpha_{1},\alpha_{2}\in\mathbb{C}\) then the sequence \(z_{n}=\alpha_{1}y_{n}^{1}+\alpha_{2}y_{n}^{2},n\ge0,\) is also a solution), just as Lemma 5.25 states for the corresponding homogeneous ODEs.

\(\text{ }\)Characteristic polynomial. The characteristic polynomial of ([eq:ch10_difference_equations_linear_aut_homo]) is \[p(r)=\sum_{l=0}^{k}a_{l}r^{l},\tag{10.28}\] (cf. the characteristic polynomial for \(k\)-th order linear autonomous ODEs from ([eq:ch5_ho_lin_aut_char_poly_with_ak])). For any \(q\in\mathbb{C}\) the ansatz \(y_{n}=q^{n}\) plugged into ([eq:ch10_difference_equations_linear_aut_homo]) yields \[\sum_{l=0}^{k}a_{l}y_{n+l}=\sum_{l=0}^{k}a_{l}q^{n+l}=q^{n}p(q).\] Thus for any root \(r\) of \(p(r)\) the sequence \(y_{n}=r^{n}\) is a solution (similarly to how \(y(t)=e^{rt}\) solves a linear autonomous homogeneous ODE if \(r\) is a root of its characteristic polynomial). If \(p(r)\) has \(v\) distinct roots \(r_{1},\ldots,r_{v}\) (possibly complex), then for any coefficients \(\alpha_{1},\ldots,\alpha_{v}\in\mathbb{C}\) \[y_{n}=\alpha_{1}r_{1}^{n}+\ldots+\alpha_{v}r_{v}^{n},n\ge0,\] is a \(v\)-parameter family of solutions. If the polynomial \(p(r)\) has the full set of \(k\) distinct roots, then this \(k\)-parameter family covers all solutions.

\(\text{ }\)Repeated roots. For the ansatz \(y_{n}=nq^{n}\) we have \[\sum_{l=0}^{k}a_{l}y_{n+l}=\sum_{l=0}^{k}a_{l}(n+l)q^{n+l}=n\sum_{l=0}^{k}a_{l}q^{n+l}+\sum_{l=0}^{k}a_{l}lq^{n+l}=nq^{n}p(q)+q^{n+1}p'(q),\] since \[\sum_{l=0}^{k}a_{l}lq^{l}=q\sum_{l=0}^{k}a_{l}lq^{l-1}=qp'(q).\] Thus \(y_{n}=nq^{n}\) is a solution if \(p(q)=p'(q)=0,\) which is the case for any root \(q\) of multiplicity at least two (recall Lemma 5.33). For roots of higher multiplicity we use the identities \[\sum_{l=0}^{k}a_{l}l^{2}q^{l}=\sum_{l=0}^{k}a_{l}l(l-1)q^{l}+\sum_{l=0}^{k}a_{l}lq^{l}=p''(q)q^{2}+p'(q)q,\] \[\sum_{l=0}^{k}a_{l}l^{3}q^{l}=\sum_{l=0}^{k}a_{l}l(l-1)(l-2)q^{l}+3\sum_{l=0}^{k}a_{l}l(l-1)q^{l}+\sum_{l=0}^{k}a_{l}lq^{l}=p'''(q)q^{3}+3p''(q)q^{2}+p'(q)q,\] or generally \[\sum_{l=0}^{k}a_{l}l^{u}q^{l}=\sum_{i=1}^{u}c_{u,i}\,q^{i}p^{(i)}(q)\quad\quad\text{for}\quad\quad u=1,\ldots,k,\tag{10.29}\] for certain positive integer coefficients \(c_{u,i}\) with \(c_{u,u}=1\) (they arise from writing \(l^{u}\) as a combination of the falling factorials \(l(l-1)\cdots(l-i+1),\) \(i=1,\ldots,u\)). Making the ansatz \(y_{n}=n^{w}r^{n}\) we can now compute \[\sum_{l=0}^{k}a_{l}y_{n+l}=r^{n}\sum_{l=0}^{k}a_{l}(n+l)^{w}r^{l}=r^{n}\sum_{u=0}^{w}{w \choose u}n^{w-u}\sum_{l=0}^{k}a_{l}l^{u}r^{l}.\] The r.h.s. is a linear combination of \(r^{i}p^{(i)}(r)\) for \(i=0,\ldots,w\) by ([eq:ch10_difference_eqs_l_power_identity]). If \(r\) is a root with multiplicity \(m\) then \(p^{(i)}(r)=0,i=0,\ldots,m-1,\) implying that \(y_{n}=n^{w}r^{n}\) is a solution of ([eq:ch10_difference_equations_linear_aut_homo]) for \(w=0,\ldots,m-1.\) Thus if the characteristic polynomial has roots \(r_{1},\ldots,r_{v}\) with multiplicities \(m_{1},\ldots,m_{v}\ge1\) then the sequences \[y_{n}=p_{1}(n)r_{1}^{n}+\ldots+p_{v}(n)r_{v}^{n}\] for arbitrary polynomials \(p_{i}\) of degree \(m_{i}-1\) are a \(k\)-parameter family of solutions. It is in fact the exhaustive family of all solutions.
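
As a quick numerical sanity check (numpy assumed; the example equation is ours), the following sketch verifies that \(y_{n}=n2^{n}\) solves \(y_{n+2}-4y_{n+1}+4y_{n}=0,\) whose characteristic polynomial \(p(r)=(r-2)^{2}\) has the double root \(r=2\):

```python
# Sketch: check that n * 2^n solves a difference equation whose characteristic
# polynomial has 2 as a double root.
import numpy as np

a = [4.0, -4.0, 1.0]                                   # a_0, a_1, a_2
y = [n * 2.0 ** n for n in range(12)]
residuals = [a[0] * y[n] + a[1] * y[n + 1] + a[2] * y[n + 2] for n in range(10)]
print(np.allclose(residuals, 0.0))                     # True
```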

10.4 Global convergence

We now examine the global error of linear multistep methods. That is, we study \[\max_{n\ge0:t_{n}\le t_{\max}}|e_{n}^{h}|\tag{10.31}\] where \(e_{k}=e_{k}^{h}=y_{k}^{h}-y(t_{k}^{h})\) and the \(y_{n}\) are computed by the general multistep method ([eq:ch9_general_multistep]). Recall that an auxiliary method needs to provide \(y_{1}^{h},\ldots,y_{s-1}^{h}\) before the \(s\)-step method can get started, and these are typically subject to an “initial” error \[\max_{n=0}^{s-1}|e_{n}^{h}|\ne0.\tag{10.32}\] Suppose therefore that we have a sequence \(h=h_{N},N\ge1,\) that converges to zero as \(N\to\infty,\) and for each \(h=h_{N}\) starting values \(y_{0}^{h},\ldots,y_{s-1}^{h},\) and subsequent approximations \(y_{s}^{h},y_{s+1}^{h},\ldots\) derived from the multistep method. We call the multistep method convergent if for any analytic \(F\) and analytic solution \(y\) of \(\dot{y}=F(t,y)\) the global error ([eq:ch10_glob_conv_glob_err]) converges to zero as \(h\downarrow0,\) whenever the error in the starting values ([eq:ch10_glob_conv_init_err]) converges to zero. That is, if \[\lim_{h\downarrow0}\max_{n=0}^{s-1}|e_{n}^{h}|=0\quad\quad\implies\quad\quad\lim_{h\downarrow0}\max_{n\ge0:t_{n}\le t_{\max}}|e_{n}^{h}|=0.\]

\(\text{ }\)Failure of global convergence. We first give an example of a method whose error is well-behaved locally, but not globally - i.e. consistent but not convergent. The method is \[y_{n+2}-3y_{n+1}+2y_{n}=h\left(\frac{13}{12}F(t_{n+2},y_{n+2})-\frac{5}{3}F(t_{n+1},y_{n+1})-\frac{5}{12}F(t_{n},y_{n})\right),\tag{10.33}\] which is a two-step implicit multistep method, corresponding to \[\rho(w)=w^{2}-3w+2\quad\quad\text{and}\quad\quad\sigma(w)=\frac{13}{12}w^{2}-\frac{5}{3}w-\frac{5}{12}.\] As in Section 10.2 one can show that this method is of order \(p=2\) using power series and Theorem 10.3, so it is consistent. To show that it is not convergent, it suffices to exhibit just one IVP with analytic \(F\) and analytic solution \(y,\) together with a sequence of starting values \(y_{0}^{h},y_{1}^{h}\) such that \(\max(|e_{0}^{h}|,|e_{1}^{h}|)\downarrow0,\) but ([eq:ch10_glob_conv_glob_err]) remains large even as \(h\downarrow0.\) The very simple IVP \(y'(t)=0,y(0)=0\) with the solution \(y(t)=0\) for all \(t\) does the job, together with the initial values \(y_{0}=0\) and \(y_{1}=h.\) Note that for these initial values indeed \(\max(|e_{0}^{h}|,|e_{1}^{h}|)\downarrow0.\) Since this ODE has \(F=0,\) the approximations \(y_{2},y_{3},\ldots\) generated by the two-step method ([eq:ch10_global_non_convergent_method_example]) satisfy \[y_{n+2}-3y_{n+1}+2y_{n}=0,n\ge0,\quad y_{0}=0,\quad y_{1}=h.\tag{10.34}\] This is a linear autonomous difference equation of order \(k=2,\) whose characteristic polynomial (recall Section 10.3) is the generating polynomial \(\rho(w).\) Since \(\rho(w)=w^{2}-3w+2=(w-1)(w-2)\) the sequence \(y_{n}=u+v2^{n}\) is a solution of ([eq:ch10_global_non_convergent_method_example_F_zero]) for any constants \(u,v.\) With \(u=-h\) and \(v=h\) the solution is \(y_{n}=2^{n}h-h,\) which satisfies both the difference equation and the initial conditions \(y_{0}=0,y_{1}=h.\) It is therefore the sequence generated by this multistep method starting with \(y_{0}=0,y_{1}=h.\) The error for later times \(t_{n}\) is thus \[|e_{n}^{h}|=|y_{n}^{h}|=h(2^{n}-1),\] which does not converge to zero for, say, \(10\log(h^{-1})\le n\le h^{-1}.\) Thus certainly ([eq:ch10_glob_conv_glob_err]) does not converge to zero as \(h\downarrow0.\)
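
The blow-up computed above is easy to reproduce numerically. The following sketch (plain Python; the helper name is ours) runs the recursion \(y_{n+2}=3y_{n+1}-2y_{n}\) that the method produces for this IVP, starting from \(y_{0}=0,\) \(y_{1}=h,\) and prints the largest error on \([0,1]\):

```python
# Sketch: the non-convergent two-step method applied to y' = 0, y(0) = 0,
# started with the slightly perturbed value y_1 = h; the exact solution is 0,
# so max |y_n| is the global error, and it grows as h shrinks.
def max_error(h, t_max=1.0):
    n_steps = int(round(t_max / h))
    y = [0.0] * (n_steps + 1)
    y[1] = h                                  # starting error of size h
    for n in range(n_steps - 1):
        y[n + 2] = 3.0 * y[n + 1] - 2.0 * y[n]
    return max(abs(v) for v in y)

for h in (0.1, 0.05, 0.025):
    print(h, max_error(h))                    # the error explodes as h decreases
```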

If one starts the two-step method ([eq:ch10_global_non_convergent_method_example]) with \(y_{0}=y_{1}=0\) instead (i.e. with the correct values from the true solution \(y=0\)) then the solution of the difference equation is \(y_{n}=0,\) which is a perfect approximation. But any small error, e.g. from rounding, that causes \(y_{n}\ne y_{n+1}\) at some point will cause the solution to start diverging.

\(\text{ }\)Condition for convergence. Remarkably, the failure mode seen above (a root of \(\rho(w)\) outside the unit disk) is the only way a consistent method can fail to be convergent. Indeed there is a simple criterion on the roots of \(\rho\) that together with consistency implies convergence. It is called the root condition, and a multistep method with generating polynomial \(\rho(w)\) satisfies it if \[\begin{array}{c} \text{all roots of }\rho(w)\text{ lie in the closed complex unit disk, and}\\ \rho(w)\text{ has no repeated roots on the boundary of the unit disk}. \end{array}\tag{10.35}\] Some authors use the terminology stable linear multistep method to mean any linear multistep method that satisfies the condition ([eq:ch10_multistep_gllobal_err_root_cond]). The famous Dahlquist Equivalence Theorem, which we now present, says that if a multistep method is consistent (i.e. of order \(p\) for some \(p\ge1\)) and satisfies the root condition, then it is convergent! (In the other terminology: a stable consistent linear multistep method is convergent).

Theorem 10.11 • Dahlquist Equivalence Theorem

The multistep method ([eq:ch9_general_multistep]) with generating polynomials \(\rho(w),\sigma(w)\) is convergent iff it is consistent and satisfies the root condition ([eq:ch10_multistep_gllobal_err_root_cond]).

The full proof of this theorem is outside the scope of this course. We content ourselves with proving the easier of the two directions of the “iff”, namely that convergence implies that the method is consistent and the root condition is satisfied. This direction thus proves that the root condition and consistency are necessary conditions for convergence.

Proof of “Convergence \(\implies\) Root condition” of Theorem 10.11. Assume that the root condition is not satisfied, i.e. that \(\rho(w)\) has a root \(w_{1}\) with \(|w_{1}|>1\) or a repeated root \(w_{2}\) with \(\left|w_{2}\right|=1.\) We prove that then the method is not convergent. As was the case when showing the non-convergence of ([eq:ch10_global_non_convergent_method_example]), it suffices to exhibit just one IVP and starting values of small error for which convergence fails. In fact the same trivial IVP \(y'(t)=0,t\ge0,y(0)=0,\) works. Since \(F(t,y)=0\) the approximations produced by the method satisfy the difference equation \[a_{s}y_{n+s}+a_{s-1}y_{n+s-1}+\ldots+a_{1}y_{n+1}+a_{0}y_{n}=0.\tag{10.36}\] If the root \(w_{1}\) with \(|w_{1}|>1\) exists, then consider the solution \(y_{n}=hw_{1}^{n}\) of the difference equation. The first \(s\) values \(y_{0},\ldots,y_{s-1}\) satisfy \[\max_{0\le n\le s-1}|e_{n}^{h}|=|h||w_{1}|^{s-1}\to0\] as \(h\downarrow0,\) while, say for \(t_{0}=0\) and \(t_{\max}=1,\) we have \[\max_{n\ge0:t_{n}\le t_{\max}}|e_{n}^{h}|\ge h|w_{1}|^{\lfloor\frac{1}{h}\rfloor}\ge|w_{1}|^{\frac{1}{2}\frac{1}{h}}\to\infty,\] as \(h\to0\) (the second inequality holds for all small enough \(h\)). Thus the method is not convergent. If instead there is a \(w_{2}\) with \(\left|w_{2}\right|=1\) which is a root of multiplicity two or higher, then \[y_{n}=\sqrt{h}nw_{2}^{n}\] is a solution of the difference equation ([eq:ch10_dahlquist_equivalence_root_cond_necc_diff_eq_from_meth]), for which \[\max_{0\le n\le s-1}|e_{n}^{h}|=\sqrt{h}(s-1)\to0,\] while \[\max_{n\ge0:t_{n}\le t_{\max}}|e_{n}^{h}|\ge\sqrt{h}\frac{t_{\max}-t_{0}}{2h}=\frac{1}{2\sqrt{h}}.\] Thus also in this case the method is not convergent.
Proof of “Convergence \(\implies\) Consistency” of Theorem 10.11. The lowest order terms of the polynomials \(\rho(1+\xi)\) and \(\sigma(1+\xi)\) in \(\xi\) can be written in terms of \(\rho(1),\rho'(1)\) and \(\sigma(1)\) as \[\rho(1+\xi)=\rho(1)+\rho'(1)\xi+O(\xi^{2})\quad\quad\text{and}\quad\quad\sigma(1+\xi)=\sigma(1)+O(\xi).\tag{10.37}\] We will show that \[\rho(1)=0\quad\quad\text{and}\quad\quad\rho'(1)=\sigma(1).\tag{10.38}\] Together ([eq:ch10_conv_implies_consistent_proof_poly_exp]) and ([eq:ch10_conv_implies_consistent_proof_WTS]) imply that the constant and linear terms in \(\xi\) in the Taylor expansion of \(\rho(1+\xi)-\sigma(1+\xi)\ln(1+\xi)\) vanish, since \[\rho(1+\xi)-\sigma(1+\xi)\ln(1+\xi)=\sigma(1)\xi+O(\xi^{2})-(\sigma(1)+O(\xi))(\xi+O(\xi^{2}))=O(\xi^{2}).\] By Theorem 10.3 this implies that the method is of order (at least) \(p=1.\)

Our task is thus to show that a method being convergent implies that its generating polynomials satisfy ([eq:ch10_conv_implies_consistent_proof_WTS]).

Again we consider the very simple ODE \(y'(t)=0,t\ge0,\) this time with initial condition \(y(0)=1,\) whose exact solution is \(y(t)=1,t\ge0.\) As before, applying a convergent multistep method to this ODE amounts to using the rule \[a_{s}y_{n+s}^{h}+\ldots+a_{0}y_{n}^{h}=0.\tag{10.39}\] Starting the method with initial values \(y_{0}^{h}=\ldots=y_{s-1}^{h}=1,\) which are the exact values, the method being convergent certainly implies that the error \(e_{s}^{h}\) converges to zero as \(h\downarrow0,\) from which we deduce that \[\lim_{h\downarrow0}y_{s}^{h}=1.\tag{10.40}\] With these initial values the case \(n=0\) of ([eq:ch10_conv_implies_consistent_proof_p(1)=00003D1_proof_diff_eq]) reads \[a_{s}y_{s}^{h}+a_{s-1}+\ldots+a_{0}=0.\] By taking the limit \(h\downarrow0\) and using ([eq:ch10_conv_implies_consistent_proof_p(1)=00003D1_proof_first_approx_conv]) this implies \[a_{s}+a_{s-1}+\ldots+a_{0}=0.\] The l.h.s. is \(\rho(1),\) so we have proved the first part of ([eq:ch10_conv_implies_consistent_proof_WTS]).

To prove the other part of ([eq:ch10_conv_implies_consistent_proof_WTS]) we consider the IVP \(y'(t)=1,y(0)=0\) with exact solution \(y(t)=t.\) Since this corresponds to \(F\equiv1\) identically, the r.h.s. of the rule ([eq:ch9_general_multistep]) for this method equals \(h(b_{s}+\ldots+b_{0})=h\sigma(1),\) so it can be written as \[\sum_{l=0}^{s}a_{l}y_{n+l}^{h}=h\sigma(1).\tag{10.41}\] If \(\rho'(1)=0\) in addition to the already proved \(\rho(1)=0,\) then \(w=1\) is a repeated root of \(\rho.\) But then \(\rho\) does not satisfy the root condition ([eq:ch10_multistep_gllobal_err_root_cond]), which is a contradiction since we already proved that every convergent method satisfies it. Thus \(\rho'(1)\ne0,\) and we can define the sequence \(z_{n}^{h}=\frac{\sigma(1)}{\rho'(1)}t_{n}^{h},n\ge0.\) This sequence is a solution to the difference equation ([eq:ch10_conv_implies_consistent_proof_p'(1)=00003Dsigma(1)_proof_diff_eq]), since substituting \(z_{n+l}^{h}\) for \(y_{n+l}^{h}\) in the l.h.s. yields \[\sum_{l=0}^{s}a_{l}z_{n+l}^{h}=\sum_{l=0}^{s}a_{l}\frac{\sigma(1)}{\rho'(1)}t_{n+l}^{h}=\frac{\sigma(1)}{\rho'(1)}\sum_{l=0}^{s}a_{l}(n+l)h=\frac{\sigma(1)}{\rho'(1)}\left(n\rho(1)+\rho'(1)\right)h=h\sigma(1),\] where the last step uses \(\rho(1)=0.\) Thus when started with the initial values \(y_{n}^{h}=z_{n}^{h}\) for \(n=0,\ldots,s-1,\) applying the present multistep method produces the approximations \[y_{n}^{h}=z_{n}^{h}=\frac{\sigma(1)}{\rho'(1)}t_{n}^{h},n\ge s.\] With this choice of the initial values \(y_{0}^{h},\ldots,y_{s-1}^{h},\) the “initial error” is \[\max_{0\le n\le s-1}|e_{n}^{h}|=\max_{0\le n\le s-1}\left|\frac{\sigma(1)}{\rho'(1)}t_{n}^{h}-t_{n}^{h}\right|=\left|\frac{\sigma(1)}{\rho'(1)}-1\right|\left|t_{s-1}^{h}\right|=\left|\frac{\sigma(1)}{\rho'(1)}-1\right|(s-1)h\downarrow0,\] so the assumption that the method is convergent implies that the global error also tends to zero. In particular, along the sequence \(h_{N}=\frac{1}{N}\) it must hold that \(\left|e_{N}^{h_{N}}\right|\to0\) as \(N\to\infty.\) But \[e_{N}^{h_{N}}=y(t_{N}^{h_{N}})-\frac{\sigma(1)}{\rho'(1)}t_{N}^{h_{N}}=y(1)-\frac{\sigma(1)}{\rho'(1)}=1-\frac{\sigma(1)}{\rho'(1)}.\] Thus we deduce that the method being convergent implies that \(\sigma(1)=\rho'(1),\) proving the second part of ([eq:ch10_conv_implies_consistent_proof_WTS]).

This completes the proof of the implication.

Example 10.12 • Euler - Trapezoidal - Backward Euler - Theta

The theta methods from ([eq:ch9_theta_method_def]), which include Euler’s method, the trapezoidal rule method, and the backward Euler method as special cases, all have \(\rho(w)=w-1.\) This \(\rho\) has just one root, \(w=1,\) which lies on the boundary of the unit disk but is not repeated, so the root condition is satisfied. The method is also consistent for any \(\theta\) (indeed it is of order \(p=2\) if \(\theta=\frac{1}{2}\) and order \(p=1\) otherwise). Thus the Dahlquist Equivalence Theorem implies that all of these methods are convergent.

Example 10.13 • Adams-Bashforth methods The \(s\)-step Adams-Bashforth method has \(\rho(w)=w^{s}-w^{s-1}=w^{s-1}(w-1),\) with the simple (i.e. not repeated) root \(w=1\) and the root \(w=0\) of multiplicity \(s-1.\) Heuristically one may think of the global convergence behavior of a method as being better the further its roots are from the boundary of the unit disk. It follows from Theorem 10.3 that a consistent method must have \(\rho(1)=0,\) so the root \(w=1\) on the boundary is always present. From this point of view the Adams-Bashforth methods are in some sense optimal, since the remaining roots (all at \(w=0\)) are as far away from the boundary of the unit disk as possible.
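
A root-condition check is also easy to automate. The following sketch (numpy assumed; the helper name and tolerance are ours) computes the roots of \(\rho\) with numpy.roots and tests the two requirements of ([eq:ch10_multistep_gllobal_err_root_cond]):

```python
# Sketch: numerical check of the root condition for a polynomial rho given by
# its coefficients [a_s, ..., a_1, a_0] (highest degree first, numpy convention).
import numpy as np

def satisfies_root_condition(rho_coeffs, tol=1e-9):
    roots = np.roots(rho_coeffs)
    for i, r in enumerate(roots):
        if abs(r) > 1.0 + tol:                        # root outside the unit disk
            return False
        on_boundary = abs(abs(r) - 1.0) <= tol
        repeated = any(abs(r - roots[j]) <= tol for j in range(len(roots)) if j != i)
        if on_boundary and repeated:                  # repeated root on the boundary
            return False
    return True

print(satisfies_root_condition([1.0, -1.0, 0.0]))     # AB2, rho = w^2 - w: True
print(satisfies_root_condition([1.0, -3.0, 2.0]))     # rho = w^2 - 3w + 2: False
```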

One should never use non-convergent methods. Even if such a method happens to behave acceptably for a specific IVP and step-size \(h,\) the error tends, paradoxically, to become larger as one shrinks \(h,\) quickly reaching catastrophic levels. Thus multistep methods that do not satisfy the root condition ([eq:ch10_multistep_gllobal_err_root_cond]) are absolutely useless.

\(\text{ }\)Convergence of order \(p.\) Theorem 10.11 only guarantees that the global error converges to zero; it doesn’t provide an upper bound on how fast. The following theorem, whose proof we also omit, states that if a linear multistep method is of order \(p\ge1\) and satisfies the root condition, then the global error is bounded by \(O(h^{p}).\) Terminology: if the global error of a method applied with analytic \(F,y,\) etc. is always bounded by \(O(h^{p}),\) we call the method convergent of order \(p\).
Theorem 10.14 • Order of convergence of linear multistep method

If the linear multistep method ([eq:ch9_general_multistep]) with generating polynomials \(\rho(w),\sigma(w)\) from ([eq:ch10_general_multistep_rho_sigma]) has order \(p\ge1\) and satisfies the root condition ([eq:ch10_multistep_gllobal_err_root_cond]), then it is convergent of order \(p.\)

10.5 Deriving methods algebraically

In Section 10.1 we derived the Adams-Bashforth methods by starting with a “clever idea” for reducing the error and working out its consequences - in that specific case the clever idea was polynomial interpolation. This is one common approach for deriving or presenting “good” numerical methods. However the criteria ([eq:ch10_multistep_order_cond]) and ([eq:ch10_multistep_gllobal_err_root_cond]) enable an alternative purely algebraic approach.

For instance, the \(s\)-step Adams-Bashforth method can be equivalently constructed as the unique explicit method of order \(s\) with generating polynomial \(\rho(w)=w^{s}-w^{s-1}=w^{s-1}(w-1),\) which always satisfies the root condition ([eq:ch10_multistep_gllobal_err_root_cond]) (and satisfies it by a particularly large margin, as mentioned in Example 10.13). Take for instance \(s=3,\) so that \(\rho(w)=w^{2}(w-1).\) For a method to be of order \(3\) the other generating polynomial \(\sigma(w)\) must satisfy \[(1+\xi)^{2}\xi-\sigma(1+\xi)\ln(1+\xi)=O(\xi^{4}),\tag{10.43}\] by Theorem 10.3. To achieve ([eq:ch10_deriving_algebraically]) we must pick \(\sigma\) so as to cancel out the terms of order up to three in \(\xi\) on the l.h.s. For the method to be explicit it must hold that \[\sigma(w)=\sum_{m=0}^{2}b_{m}w^{m},\tag{10.44}\] without a third order term. Since ([eq:ch10_sigma_ansatz]) is a polynomial of degree at most two, we can write \(\sigma(1+\xi)=c_{0}+c_{1}\xi+c_{2}\xi^{2}\) and determine the coefficients from the expansion \[\begin{array}{l} (1+\xi)^{2}\xi-\sigma(1+\xi)\ln(1+\xi)\\ =\xi+2\xi^{2}+\xi^{3}-\left(c_{0}+c_{1}\xi+c_{2}\xi^{2}\right)\left(\xi-\frac{\xi^{2}}{2}+\frac{\xi^{3}}{3}+O(\xi^{4})\right). \end{array}\] For the term \(\xi\) to cancel we must have \(c_{0}=1,\) and then the above equals \[\begin{array}{l} \xi+2\xi^{2}+\xi^{3}-\left(1+c_{1}\xi+c_{2}\xi^{2}\right)\left(\xi-\frac{\xi^{2}}{2}+\frac{\xi^{3}}{3}+O(\xi^{4})\right)\\ =\frac{5}{2}\xi^{2}+\frac{2}{3}\xi^{3}-\left(c_{1}\xi+c_{2}\xi^{2}\right)\left(\xi-\frac{\xi^{2}}{2}+\frac{\xi^{3}}{3}+O(\xi^{4})\right)+O(\xi^{4}). \end{array}\] For the term \(\xi^{2}\) to cancel we need \(c_{1}=\frac{5}{2},\) for which the above turns into \[\begin{array}{l} \frac{5}{2}\xi^{2}+\frac{2}{3}\xi^{3}-\left(\frac{5}{2}\xi+c_{2}\xi^{2}\right)\left(\xi-\frac{\xi^{2}}{2}+\frac{\xi^{3}}{3}+O(\xi^{4})\right)+O(\xi^{4})\\ =\frac{23}{12}\xi^{3}-c_{2}\xi^{3}+O(\xi^{4}). \end{array}\] For the term \(\xi^{3}\) to cancel we need \(c_{2}=\frac{23}{12}.\) Thus to achieve an explicit method of order \(3\) the only choice is \[\sigma(1+\xi)=1+\frac{5}{2}\xi+\frac{23}{12}\xi^{2}.\] This means that \[\sigma(w)=1+\frac{5}{2}(w-1)+\frac{23}{12}(w-1)^{2}=\frac{5}{12}-\frac{4}{3}w+\frac{23}{12}w^{2}.\] These coefficients and those of \(\rho(w)=w^{3}-w^{2}\) yield the method \[y_{n+3}=y_{n+2}+h\left(\frac{5}{12}F(t_{n},y_{n})-\frac{4}{3}F(t_{n+1},y_{n+1})+\frac{23}{12}F(t_{n+2},y_{n+2})\right).\] This is precisely the \(3\)-step Adams-Bashforth method.
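
The cancellation of series terms carried out by hand above can also be delegated to a computer algebra system. The following sketch (assuming sympy) fixes \(\rho(w)=w^{3}-w^{2},\) makes the explicit ansatz \(\sigma(w)=b_{0}+b_{1}w+b_{2}w^{2},\) and solves for the coefficients that cancel the terms up to \(\xi^{3}\):

```python
# Sketch: recover the 3-step Adams-Bashforth sigma(w) algebraically with sympy.
import sympy as sp

xi, b0, b1, b2 = sp.symbols('xi b0 b1 b2')
w = 1 + xi
rho = w**3 - w**2
sigma = b0 + b1 * w + b2 * w**2
expr = sp.series(rho - sigma * sp.log(w), xi, 0, 4).removeO()
equations = [sp.expand(expr).coeff(xi, k) for k in range(1, 4)]   # xi, xi^2, xi^3 terms
print(sp.solve(equations, [b0, b1, b2]))   # {b0: 5/12, b1: -4/3, b2: 23/12}
```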
