10.6 Adams-Moulton methods
In the previous section we saw that the three-step Adams-Bashforth method is the unique convergent three-step, third-order explicit linear multistep method with characteristic polynomial \(\rho\) given by \(\rho(w)=w^{3}-w^{2}.\) This characterization is general: for any \(s\ge1,\) the unique convergent \(s\)-step explicit linear multistep method of order \(s\) with characteristic polynomial \(\rho\) given by \(\rho(w)=w^{s}-w^{s-1}\) is the \(s\)-step Adams-Bashforth method.
The multistep method ([eq:ch9_general_multistep]) with generating polynomials \(\rho(w),\sigma(w)\) is convergent iff it is consistent and satisfies the root condition ([eq:ch10_multistep_gllobal_err_root_cond]).
(One step \(s=1\)). For \(s=1\) steps the Adams-Bashforth method has \(\rho(w)=w-1\) and \(\sigma(w)=1\) (it coincides with Euler’s method). It is of order one, i.e. \[\xi-\sigma(1+\xi)\log(1+\xi)=O(\xi^{2}).\tag{10.45}\] If we allow for an implicit method we can add a term to \(\sigma(1+\xi)\) which is linear in \(\xi.\) We make the ansatz \(\sigma(1+\xi)=1+a\xi\) for \(a\) to be determined. Then \[\begin{array}{ccl} \sigma(1+\xi)\ln(1+\xi) & = & (1+a\xi)\ln(1+\xi)\\ & = & \ln(1+\xi)+a\xi\ln(1+\xi)\\ & = & (\xi-\frac{\xi^{2}}{2}+O(\xi^{3}))+a(\xi^{2}+O(\xi^{3}))\\ & = & \xi+\xi^{2}(a-\frac{1}{2})+O(\xi^{3}). \end{array}\] Thus if \(a=\frac{1}{2}\) then the l.h.s. of ([eq:ch10_am_1_ps]) is \(O(\xi^{3}),\) and the corresponding method is second order.
To write down the method concretely we substitute \(\xi=w-1\) in \(\sigma\) to obtain \[\sigma(w)=1+\frac{w-1}{2}=\frac{1+w}{2}.\] The characteristic polynomials \(\rho(w)=w-1\) and \(\sigma(w)=\frac{1+w}{2}\) correspond to the method \[y_{n+1}-y_{n}=h\frac{F(t_{n},y_{n})+F(t_{n+1},y_{n+1})}{2}.\tag{10.46}\] This coincides with the trapezoidal rule method from ([eq:ch9_trap_intro_rule]). It is an implicit method of order \(2,\) one more order than the corresponding explicit method.
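For concreteness, here is a minimal Python sketch of a single step of the trapezoidal rule method (10.46), with the implicit equation solved by fixed-point iteration; the helper name `trapezoidal_step` and the fixed iteration count are our own choices, not part of the method.

```python
import numpy as np

def trapezoidal_step(F, t_n, y_n, h, n_iter=50):
    """One step of the implicit trapezoidal rule (10.46); the implicit
    equation is solved by fixed-point iteration."""
    y_next = y_n + h * F(t_n, y_n)  # explicit Euler predictor as initial guess
    for _ in range(n_iter):
        y_next = y_n + 0.5 * h * (F(t_n, y_n) + F(t_n + h, y_next))
    return y_next

# Example: y' = -y, y(0) = 1, whose exact solution is exp(-t).
F = lambda t, y: -y
t, y, h = 0.0, 1.0, 0.1
for n in range(10):
    y = trapezoidal_step(F, t, y, h)
    t += h
print(y, np.exp(-1.0))  # the two values agree to roughly h^2
```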
(Two step \(s=2\)). For \(s=2\) we have \(\rho(w)=w^{2}-w,\) corresponding to \(\rho(1+\xi)=(1+\xi)^{2}-(1+\xi)=\xi+\xi^{2}.\) We seek \(\sigma\) of degree \(2\) such that \[\xi+\xi^{2}-\sigma(1+\xi)\log(1+\xi)=O(\xi^{4}).\] With the ansatz \(\sigma(1+\xi)=a+b\xi+c\xi^{2}\) the l.h.s. equals \[\begin{array}{l} \xi+\xi^{2}-\sigma(1+\xi)\log(1+\xi)\\ =\xi+\xi^{2}-\sigma(1+\xi)\xi+\sigma(1+\xi)\frac{\xi^{2}}{2}-\sigma(1+\xi)\frac{\xi^{3}}{3}+O(\xi^{4})\\ =\xi+\xi^{2}-(a+b\xi+c\xi^{2})\xi+(a+b\xi+O(\xi^{2}))\frac{\xi^{2}}{2}-(a+O(\xi))\frac{\xi^{3}}{3}+O(\xi^{4})\\ =\xi(1-a)+\xi^{2}(1-b+\frac{a}{2})+\xi^{3}\left(-c+\frac{b}{2}-\frac{a}{3}\right)+O(\xi^{4}). \end{array}\] The first three terms vanish if \(a=1\) and \(b=\frac{3}{2}\) and \(c=\frac{3}{4}-\frac{1}{3}=\frac{5}{12},\) corresponding to \[\sigma(1+\xi)=1+\frac{3}{2}\xi+\frac{5}{12}\xi^{2}.\] To write down the corresponding method, substitute \(\xi=w-1\) in \(\sigma\) to obtain \[\sigma(w)=1+\frac{3}{2}(w-1)+\frac{5}{12}(w-1)^{2}=-\frac{1}{12}+\frac{2}{3}w+\frac{5}{12}w^{2}.\] This is the method \[y_{n+2}-y_{n+1}=h\left(\frac{5}{12}F(t_{n+2},y_{n+2})+\frac{2}{3}F(t_{n+1},y_{n+1})-\frac{1}{12}F(t_{n},y_{n})\right).\tag{10.47}\] It is an implicit method of order \(3,\) one more order than the \(2\)-step Adams-Bashforth method.
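The elimination above is easy to check symbolically. The following sketch (assuming sympy is available) matches the coefficients of \(\xi,\xi^{2},\xi^{3}\) and solves for \(a,b,c\):

```python
import sympy as sp

xi, a, b, c = sp.symbols('xi a b c')
sigma = a + b * xi + c * xi**2
expr = xi + xi**2 - sigma * sp.log(1 + xi)
# Expand to order xi^4 and require the coefficients of xi, xi^2, xi^3 to vanish.
series = sp.expand(sp.series(expr, xi, 0, 4).removeO())
eqs = [series.coeff(xi, m) for m in (1, 2, 3)]
print(sp.solve(eqs, (a, b, c)))  # expect {a: 1, b: 3/2, c: 5/12}
```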
In general, for every \(s\) there is a unique \(\sigma\) of degree \(s\) such that \[(1+\xi)^{s-1}\xi-\sigma(1+\xi)\log(1+\xi)=O(\xi^{s+2}).\] These methods are called the Adams-Moulton methods. They are in some sense the implicit version of the Adams-Bashforth methods. The one-step Adams-Moulton method is ([eq:AM_1]) of order \(2,\) and the two-step Adams-Moulton method is ([eq:AM_2]) of order \(3.\)
10.7 Backward differentiation formulas
Another class of multistep methods is obtained by deciding that one wants \(\sigma(w)=w^{s},\) which corresponds to a method of the form \[\sum_{m=0}^{s}a_{m}y_{n+m}=hF(t_{n+s},y_{n+s}).\] These are in some sense “maximally implicit”, since only the term that makes a method implicit appears on the r.h.s.
The assumption \(\sigma(w)=w^{s}\) corresponds to \(\sigma(1+\xi)=(1+\xi)^{s}.\) The quantity ([eq:order_cond_taylor_in_xi]) then equals \[\rho(1+\xi)-(1+\xi)^{s}\log(1+\xi).\]
(One step \(s=1\)). For \(s=1\) we have \[(1+\xi)\ln(1+\xi)=\xi+\frac{\xi^{2}}{2}-\frac{\xi^{3}}{6}+\ldots,\] so the maximal order is obtained by setting \(\rho(1+\xi)=\xi.\) This corresponds to \(\rho(w)=w-1,\) i.e. to the method \[y_{n+1}-y_{n}=hF(t_{n+1},y_{n+1}).\tag{10.48}\] This is the backward Euler method, of order \(1.\)
(Two step \(s=2\)). For \(s=2\) we have \[(1+\xi)^{2}\ln(1+\xi)=\xi+\frac{3}{2}\xi^{2}+\frac{\xi^{3}}{3}+\ldots,\] so the maximal order is obtained when \(\rho(1+\xi)=\xi+\frac{3}{2}\xi^{2}.\) This corresponds to \[\rho(w)=w-1+\frac{3}{2}(w-1)^{2}=\frac{3}{2}w^{2}-2w+\frac{1}{2},\] i.e. \[\frac{3}{2}y_{n+2}-2y_{n+1}+\frac{1}{2}y_{n}=hF(t_{n+2},y_{n+2}).\] Normalizing so that the coefficient of \(y_{n+2}\) is \(1\) this is equivalently written as \[y_{n+2}-\frac{4}{3}y_{n+1}+\frac{1}{3}y_{n}=\frac{2}{3}hF(t_{n+2},y_{n+2}).\tag{10.49}\] This method is of order \(2.\) To determine if the method is convergent we must verify the root condition. Since \[w^{2}-\frac{4}{3}w+\frac{1}{3}=\left(w-1\right)\left(w-\frac{1}{3}\right)\] the roots of \(\rho\) are \(w=1\) and \(w=\frac{1}{3},\) so the root condition is satisfied!
The methods that arise like this from the choice \(\sigma(w)=w^{s}\) are called backward differentiation formulas (BDF). The one-step BDF is ([eq:BFD_1]) and the two-step BDF ([eq:BFD_2]). The \(s\)-step method is of order \(s.\) It turns out that for \(s=1,2,3,4,5,6\) the \(\rho\) of the \(s\)-step BDF satisfies the root condition, while for \(s\ge7\) the root condition is unfortunately not satisfied. Thus the method is convergent only for \(s\in\{1,\ldots,6\}\) steps.
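The claim about the root condition can be explored numerically. The sketch below (assuming numpy and sympy) constructs \(\rho\) for the \(s\)-step BDF by truncating the expansion of \((1+\xi)^{s}\log(1+\xi)\) at degree \(s,\) and checks whether all roots lie in the closed unit disk; the simplicity of roots on the unit circle is not checked here.

```python
import numpy as np
import sympy as sp

xi, w = sp.symbols('xi w')

def bdf_rho(s):
    """rho of the s-step BDF: the degree-s truncation of the Taylor
    expansion of (1+xi)^s * log(1+xi), rewritten in terms of w = 1 + xi."""
    truncated = sp.series((1 + xi)**s * sp.log(1 + xi), xi, 0, s + 1).removeO()
    return sp.expand(truncated.subs(xi, w - 1))

for s in range(1, 8):
    coeffs = [float(c) for c in sp.Poly(bdf_rho(s), w).all_coeffs()]
    max_root = np.max(np.abs(np.roots(coeffs)))
    print(s, max_root <= 1 + 1e-9)  # True for s = 1,...,6, False for s = 7
```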
10.8 Dahlquist barriers
The \(3\)-step Adams-Bashforth method is of order \(3,\) while its implicit counterpart the \(3\)-step Adams-Moulton method is of order \(4.\) The \(3\)-step BDF is of order \(3.\) There exist \(3\)-step methods of higher order. For instance \[\begin{array}{l} y_{n+3}+\frac{27}{11}y_{n+2}-\frac{27}{11}y_{n+1}-y_{n}\\ =h\left(\frac{3}{11}F(t_{n+3},y_{n+3})+\frac{27}{11}F(t_{n+2},y_{n+2})+\frac{27}{11}F(t_{n+1},y_{n+1})+\frac{3}{11}F(t_{n},y_{n})\right) \end{array}\] is a \(3\)-step method of order \(6\)! Unfortunately, it does not satisfy the root condition and is not convergent. Even more unfortunately, this is the case in general, as proved by the next “barrier” theorem, which characterizes the highest order of a linear \(s\)-step multistep method.
The order \(p\) of a convergent \(s\)-step linear multistep method satisfies
\(p\le s+2\) if \(s\) is even,
\(p\le s+1\) if \(s\) is odd,
\(p\le s\) if the method is explicit.
The proof of this remarkable theorem is outside the scope of the course. The Adams-Bashforth \(s\)-step method “achieves the barrier” for explicit methods, since it has order \(s.\) By the theorem no other explicit method achieves a higher order, so we don’t have to waste effort trying to find clever \(\sigma,\rho\) that give an explicit method of order \(s+1\) or above!
10.9 Summary
The following table summarizes the multistep methods we have considered in this chapter.
| Method (\(s\) steps) | Order | Explicit or implicit | Convergent for |
|---|---|---|---|
| Adams-Bashforth | \(s\) | Explicit | \(1\le s<\infty\) |
| Adams-Moulton | \(s+1\) | Implicit | \(1\le s<\infty\) |
| Backward differentiation formula | \(s\) | Implicit | \(1\le s\le6\) |
11 Numerics III: Runge-Kutta methods
In this chapter we study a very important class of numerical schemes called Runge-Kutta (RK) methods. Instead of using multiple previous approximations like the multistep methods of the previous chapter, they use multiple intermediate approximations. These intermediate approximations approximate the solution \(y(t)\) for several \(t\in[t_{n},t_{n+1}].\) Like for several other methods we have seen, the starting point of the derivation of the Runge-Kutta methods is an approximation scheme for a definite integral \(\int_{t_{n}}^{t_{n+1}}f(t)dt.\) Applying the scheme to approximate \(\int_{t_{n}}^{t_{n+1}}F(t,y(t))dt\) yields a time-stepping method (where of course \(y\) is the solution to an ODE \(y'=F(t,y)\)). Runge-Kutta methods are somewhat similar in spirit to quadrature rules for approximating definite integrals. As a warm-up, the first section introduces quadrature and derives the especially accurate Gaussian quadrature rules.
11.1 Gaussian quadrature
Quadrature rules are the building blocks of methods for numerically approximating definite integrals. A quadrature rule for an interval \([a,b]\) approximates a definite integral \(\int_{a}^{b}f(t)dt\) by a linear combination, i.e. \[\int_{a}^{b}f(t)dt\approx\sum_{j=1}^{k}b_{j}f(c_{j})\tag{11.1}\] with a fixed and small number \(k\) of summands and fixed parameters \(a\le c_{1}<\ldots<c_{k}\le b,b_{1},\ldots,b_{k}.\) The parameters \(c_{j}\) are called nodes and the parameters \(b_{j}\) weights.
\(\text{ }\)Small intervals. Clearly for say \([a,b]=[0,1]\) and \(k=2\) one can’t expect the error in this approximation to be particularly small in general. Arbitrarily small error is achieved by applying the quadrature rule in small intervals of length \(h,\) with \(h\downarrow0.\) Applied to a small interval \([t_{n},t_{n+1}]=[t_{n},t_{n}+h]\) a quadrature for \([0,1]\) turns into \[\int_{t_{n}}^{t_{n+1}}f(t)dt\approx h\sum_{j=1}^{k}b_{j}f(t_{n}+c_{j}h),\] and setting \(h=\frac{1}{N}\) and \(t_{n}=nh,n\ge0\) for a large integer \(N\) and summing over \(n\) yields the approximation \[\int_{0}^{1}f(t)dt\approx\sum_{n=0}^{N-1}h\sum_{j=1}^{k}b_{j}f(t_{n}+c_{j}h).\tag{11.2}\] Note that the left Riemann sum approximation arises from the quadrature rule for \([0,1]\) given by \(k=1,c_{1}=0,b_{1}=1,\) and the trapezoidal rule approximation ([eq:ch9_trapezoidal_rule_integral]) arises from the quadrature rule \(k=2,c_{1}=0,c_{2}=1,b_{1}=b_{2}=\frac{1}{2}.\)
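As an illustration, the composite approximation (11.2) in a minimal Python sketch (the helper name `composite_quadrature` is ours):

```python
import numpy as np

def composite_quadrature(f, c, b, N):
    """Approximate the integral of f over [0,1] as in (11.2): apply the
    quadrature rule with nodes c and weights b on N subintervals of length h."""
    h = 1.0 / N
    total = 0.0
    for n in range(N):
        t_n = n * h
        total += h * sum(b_j * f(t_n + c_j * h) for c_j, b_j in zip(c, b))
    return total

# Trapezoidal rule on [0,1]: nodes (0, 1), weights (1/2, 1/2).
print(composite_quadrature(np.exp, [0.0, 1.0], [0.5, 0.5], 100))
print(np.e - 1.0)  # exact value of the integral of exp over [0,1]
```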
\(\text{ }\)From local to global error bounds. Error bounds for \[\left|\int_{0}^{1}f(t)dt-\sum_{j=1}^{k}b_{j}f(c_{j})\right|\tag{11.3}\] translate to error bounds for the approximation ([eq:ch11_gaus_quad_global_approx_intro]). For the left Riemann sum quadrature rule the error ([eq:ch11_guas_quad_error_0_1]) satisfies \[\left|\int_{0}^{1}f(t)dt-f(0)\right|\le\sup_{t\in[0,1]}|f'(t)|,\tag{11.4}\] and for the trapezoidal quadrature rule it satisfies \[\left|\int_{0}^{1}f(t)dt-\frac{f(0)+f(1)}{2}\right|\le c\sup_{t\in[0,1]}|f''(t)|,\tag{11.5}\] both for say analytic \(f.\) The next lemma deduces an error bound for the approximation ([eq:ch11_gaus_quad_global_approx_intro]) from a bound for ([eq:ch11_guas_quad_error_0_1]) of the type of ([eq:ch11_gaus_quad_rieman_quad_error])-([eq:ch11_gaus_quad_trapezoidal_quad_error]).
Consider a quadrature rule \(k\ge1,\) \(0\le c_{1}<\ldots<c_{k}\le1\) and \(b_{1},\ldots,b_{k}\in\mathbb{R}\) for the interval \([0,1].\) Assume that there is a constant \(\kappa\in[0,\infty)\) and a \(p\ge1\) such that for any analytic \(f:[0,1]\to\mathbb{R}\) \[\left|\int_{0}^{1}f(t)dt-\sum_{j=1}^{k}b_{j}f(c_{j})\right|\le\kappa\sup_{t\in[0,1]}|f^{(p)}(t)|.\tag{11.6}\] Then for any analytic \(g:[0,1]\to\mathbb{R}\) and \(h=\frac{1}{N}\) for an integer \(N\ge1\) \[\left|\int_{0}^{1}g(t)dt-\sum_{n=0}^{N-1}h\sum_{j=1}^{k}b_{j}g(t_{n}+hc_{j})\right|\le h^{p}\kappa\sup_{t\in[0,1]}|g^{(p)}(t)|.\tag{11.7}\]
Proof. The l.h.s. of ([eq:ch11_guas_quad_global_error]) is at most \[\sum_{n=0}^{N-1}e_{n}\tag{11.8}\] where \[e_{n}=\left|\int_{t_{n}}^{t_{n+1}}g(t)dt-h\sum_{j=1}^{k}b_{j}g(t_{n}+hc_{j})\right|.\] With the change of variables \(t=t_{n}+sh\) and \(r_{n}(s)=g(t_{n}+sh)\) it follows that \[\int_{t_{n}}^{t_{n+1}}g(t)dt=h\int_{0}^{1}g(t_{n}+sh)ds=h\int_{0}^{1}r_{n}(s)ds.\] Thus \[e_{n}=h\left|\int_{0}^{1}r_{n}(s)ds-\sum_{j=1}^{k}b_{j}r_{n}(c_{j})\right|.\] Thus by ([eq:ch11_gaus_quad_quad_error]) \[e_{n}\le\kappa h\sup_{t\in[0,1]}\left|r_{n}^{(p)}(t)\right|\tag{11.9}\] Furthermore \(r_{n}^{(p)}(s)=h^{p}g^{(p)}(t_{n}+hs)\) so that \[\max_{n=0}^{N-1}\sup_{s\in[0,1]}\left|r_{n}^{(p)}(s)\right|\le h^{p}\sup_{t\in[0,1]}|g^{(p)}(t)|.\] This implies that \(e_{n}\le h^{p+1}\kappa\sup_{t\in[0,1]}|g^{(p)}(t)|,\) and plugging this into ([eq:ch11_guas_quad_global_error_first_step]) yields ([eq:ch11_guas_quad_global_error]).
The fully general definition of quadrature allows for a weight function \(w:[a,b]\to[0,\infty)\) and approximates the weighted integral \(\int_{a}^{b}f(t)w(t)dt\) by a sum \(\sum_{j=1}^{k}b_{j}f(c_{j}).\) We limit ourselves to the case of a constant weight function \(w(t)=1.\)
The goal of this section is to construct optimal quadrature rules for all \(k\ge1,\) in the sense that they satisfy ([eq:ch11_gaus_quad_quad_error]) for the largest possible \(p.\)
First we introduce the concept of order of a quadrature rule. For each \(p\) let \(\mathbb{P}_{p}\) denote the set of all polynomials of degree at most \(p,\) which is a real finite-dimensional vector space.
Let \([a,b]\) be a non-empty interval, and let \(k\ge1,c_{1},\ldots,c_{k},b_{1},\ldots,b_{k}\) be the parameters of a quadrature rule for \([a,b].\) If \(p\ge1\) and \[\int_{a}^{b}f(t)dt=\sum_{j=1}^{k}b_{j}f(c_{j})\tag{11.11}\] (i.e. the quadrature rule approximates \(\int_{a}^{b}f(t)dt\) with zero error) for every \(f\in\mathbb{P}_{p-1},\) then the quadrature is said to be of order (at least) \(p\).
If the quadrature rule is of order \(p\) but not of order \(p+1\) then it is of order exactly \(p.\)
The next lemma shows that the bound ([eq:ch11_gaus_quad_quad_error]) holds if the order of quadrature is (at least) \(p.\) What’s more, a bound like ([eq:ch11_gaus_quad_quad_error]) holding for \(p=m\) but not for \(p=m+1\) is equivalent to the method being of order exactly \(m.\)
Fix a quadrature rule \(k,c_{1},\ldots,c_{k},b_{1},\ldots,b_{k}.\) Let \(p\ge1.\)
a) If the quadrature rule is of order \(p\) then \[\left|\int_{a}^{b}f(t)dt-\sum_{j=1}^{k}b_{j}f(c_{j})\right|\le c\sup_{t\in[a,b]}|f^{(p)}(t)|\:\:\text{for all analytic }f:[a,b]\to\mathbb{R},\tag{11.12}\] where \(c=(k+1)\frac{(b-a)^{p}}{p!}.\)
b) Conversely, assume that ([eq:ch11_quad_order_p_error_rate_error_bound_c]) holds for some constant \(c\in[0,\infty).\) Then the quadrature rule is of order (at least) \(p.\)
c) Let \(m\ge1.\) The quadrature rule is of order exactly \(m\) iff ([eq:ch11_quad_order_p_error_rate_error_bound_c]) holds when \(p=m\) for some constant \(c\in[0,\infty),\) but does not hold for any constant \(c\in[0,\infty)\) when \(p=m+1.\)
Proof. b) Polynomials are analytic, and if \(f\) is a polynomial of degree at most \(p-1\) then \(f^{(p)}(t)=0\) for all \(t.\) Thus if ([eq:ch11_quad_order_p_error_rate_error_bound_c]) holds, then the identity ([eq:ch11_gauss_quad_order_def_eq]) holds for all polynomials of degree at most \(p-1,\) so by Definition 11.3 the quadrature rule is of order at least \(p.\)
a) By Taylor’s theorem with remainder it holds for any analytic \(f\) and \(p\) that \[f(t)=q(t)+\frac{(t-a)^{p}}{p!}f^{(p)}(s)\] for some \(s=s(t)\in[a,b],\) where \[q(t)=\sum_{j=0}^{p-1}\frac{(t-a)^{j}}{j!}f^{(j)}(a)\] is a polynomial of degree at most \(p-1.\) In particular for \(t=c_{j}\) \[f(c_{j})=q(c_{j})+\frac{(c_{j}-a)^{p}}{p!}f^{(p)}(s_{j})\] for some \(s_{j}\in[a,b].\) The assumption that the quadrature is of order at least \(p\) implies that \[\int_{a}^{b}q(t)dt=\sum_{j=1}^{k}b_{j}q(c_{j}),\] so \[\left|\int_{a}^{b}f(t)dt-\sum_{j=1}^{k}b_{j}f(c_{j})\right|\le\int_{a}^{b}\frac{(t-a)^{p}}{p!}|f^{(p)}(s(t))|dt+\sum_{j=1}^{k}|b_{j}|\frac{(c_{j}-a)^{p}}{p!}|f^{(p)}(s_{j})|,\] which implies ([eq:ch11_quad_order_p_error_rate_error_bound_c]) with the claimed constant \(c.\) Part c) follows by combining parts a) and b).
The next lemma shows that for any \(k\ge1\) and distinct nodes \(c_{1},\ldots,c_{k}\) there exist \(b_{1},\ldots,b_{k}\) such that the quadrature rule corresponding to these \(b_{i},c_{i}\) has order at least \(k.\) Recall that the Vandermonde matrix of a sequence \(x_{1},\ldots,x_{k}\) is the \(k\times k\) matrix \[V(x_{1},\ldots,x_{k}):=\left(\begin{matrix}1 & x_{1} & x_{1}^{2} & \dots & x_{1}^{k-1}\\ 1 & x_{2} & x_{2}^{2} & \dots & x_{2}^{k-1}\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ 1 & x_{k} & x_{k}^{2} & \dots & x_{k}^{k-1} \end{matrix}\right).\] Recall furthermore that \(\det(V(x_{1},\ldots,x_{k}))=\prod_{1\le i<j\le k}(x_{j}-x_{i}),\) so a Vandermonde matrix is invertible whenever the \(x_{1},\ldots,x_{k}\) are distinct.
For any non-empty \([a,b],k\ge1\) and distinct nodes \(a\le c_{1}<\ldots<c_{k}\le b\) the system of equations \[\sum_{j=1}^{k}b_{j}c_{j}^{m}=\int_{a}^{b}t^{m}dt,m=0,\ldots,k-1\tag{11.14}\] uniquely determines weights \(b_{1},\ldots,b_{k}.\) The quadrature combining these \(b_{i}\) and \(c_{i}\) is of order at least \(k.\)
Proof. The system of equations ([eq:ch11_quad_order_quad_of_order_at_lesat_k]) is equivalent to \(V(c_{1},\ldots,c_{k})^{T}b=v,\) for the column vector \(v\in\mathbb{R}^{k}\) with \(v_{m+1}=\int_{a}^{b}t^{m}dt,m=0,\ldots,k-1.\) Since the \(c_{i}\) are distinct the Vandermonde matrix is non-degenerate. Thus a unique solution \(b=(b_{1},\ldots,b_{k})\in\mathbb{R}^{k}\) exists. Next let \(f(t)=t^{m}\) and note that by ([eq:ch11_quad_order_quad_of_order_at_lesat_k]) \[\sum_{j=1}^{k}b_{j}f(c_{j})-\int_{a}^{b}t^{m}dt=\sum_{j=1}^{k}b_{j}c_{j}^{m}-\int_{a}^{b}t^{m}dt=0\quad\quad m=0,\ldots,k-1.\] If \(f\) is a polynomial of degree at most \(k-1\) then it is a linear combination of \(t^{m},m=0,\ldots,k-1,\) so it follows that for all such \(f\) \[\sum_{j=1}^{k}b_{j}f(c_{j})-\int_{a}^{b}f(t)dt=0.\] This implies that the method is of order at least \(k.\)
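Numerically the weights are obtained by solving the (transposed) Vandermonde system, e.g. as in this sketch (assuming numpy; the helper name is ours):

```python
import numpy as np

def quadrature_weights(c, a, b):
    """Solve the system (11.14), i.e. V(c)^T w = v with
    v_{m+1} = integral of t^m over [a,b] for m = 0,...,k-1."""
    k = len(c)
    V = np.vander(c, k, increasing=True)  # row j is (1, c_j, ..., c_j^{k-1})
    v = np.array([(b**(m + 1) - a**(m + 1)) / (m + 1) for m in range(k)])
    return np.linalg.solve(V.T, v)

# Nodes 0, 1/2, 1 on [0,1] recover Simpson's rule weights 1/6, 2/3, 1/6.
print(quadrature_weights(np.array([0.0, 0.5, 1.0]), 0.0, 1.0))
```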
This means that we can construct a quadrature with a better error bound than the trapezoidal rule method (for which \(k=2\)) by simply picking any \(k=3\) nodes \(c_{1},c_{2},c_{3}\) and solving for \(b_{1},b_{2},b_{3}\) - this will yield a quadrature of order \(p=3.\) It turns out however, that one can do significantly better! Indeed for any \(k\ge1\) one can obtain a quadrature of order \(2k\) by choosing \(c_{1},\ldots,c_{k}\) in a particular way! Furthermore, this is an optimal quadrature: no other rule has higher order.
These special \(c_{1},\ldots,c_{k}\) arise as the roots of certain polynomials. More specifically, they are the roots of orthogonal polynomials with respect to the interval \([a,b].\) To define them we introduce the inner-product \[\left\langle f,g\right\rangle _{a,b}=\int_{a}^{b}f(t)g(t)dt\] on the finite-dimensional vector space \(\mathbb{P}_{l},\) for any \(l.\) The orthogonal polynomials w.r.t. \([a,b]\) are defined as \[\begin{array}{c} \text{the }k\text{-th orthogonal polynomial }P_{k}=P_{k,[a,b]}\text{ is the unique}\\ \text{monic }P_{k}\in\mathbb{P}_{k}\text{ such that }\left\langle P_{k},f\right\rangle _{a,b}=0\text{ \,for all }f\in\mathbb{P}_{k-1}. \end{array}\tag{11.15}\] To see that ([eq:ch11_guass_quad_legendre_poly_def]) indeed defines a sequence \(P_{0},P_{1},P_{2},\ldots\) of polynomials uniquely, recall that on any finite dimensional inner-product space one can define the linear projections onto any subspace. Thus we can define the linear projection \(\Pi^{\mathbb{P}_{k-1}^{\bot}}\) from the finite-dimensional \(\mathbb{P}_{k}\) onto the subspace \(\mathbb{P}_{k-1}^{\bot}\) orthogonal to \(\mathbb{P}_{k-1}.\) Then if \(q_{k}\) is any polynomial of degree exactly \(k\) - e.g. \(q_{k}(t)=t^{k}\) - then the projection \(P_{k}=\Pi^{\mathbb{P}_{k-1}^{\bot}}q_{k}\) is a non-zero polynomial in \(\mathbb{P}_{k}\backslash\mathbb{P}_{k-1}\) that satisfies \(\left\langle P_{k},p\right\rangle =0\) for all \(p\in\mathbb{P}_{k-1}.\) Normalizing it to be monic shows that for every \(k\) a \(P_{k}\) satisfying ([eq:ch11_guass_quad_legendre_poly_def]) exists. To show uniqueness, note that if \(P_{k}\) and \(Q_{k}\) are two monic polynomials of degree \(k\) satisfying ([eq:ch11_guass_quad_legendre_poly_def]), then \(r=P_{k}-Q_{k}\) is a polynomial of degree at most \(k-1\) and so \(\left\langle r,r\right\rangle =\left\langle P_{k},r\right\rangle -\left\langle Q_{k},r\right\rangle =0,\) implying that \(r=0\) and thus \(P_{k}=Q_{k}.\)
The orthogonal polynomials for \([-1,1]\) are called the Legendre polynomials. Let us \[\text{denote the }k\text{-th Legendre polynomial by }Q_{k}.\tag{11.16}\] The polynomials \(Q_{k}\) can be derived by the Gram-Schmidt process applied to the basis \(1,t,t^{2},\ldots,t^{k}\) of \(\mathbb{P}_{k}.\) This yields \(Q_{0}(t)=1.\) Next \[Q_{1}(t)=t-\frac{\left\langle t,Q_{0}\right\rangle }{\left\langle Q_{0},Q_{0}\right\rangle }Q_{0}=t,\] (where we drop the subscript on \(\left\langle \cdot,\cdot\right\rangle _{-1,1}\)) and then \[Q_{2}(t)=t^{2}-\frac{\left\langle t^{2},Q_{1}\right\rangle }{\left\langle Q_{1},Q_{1}\right\rangle }Q_{1}-\frac{\left\langle t^{2},Q_{0}\right\rangle }{\left\langle Q_{0},Q_{0}\right\rangle }Q_{0}=t^{2}-\frac{1}{2}\int_{-1}^{1}t^{2}dt=t^{2}-\frac{1}{3},\] and so on. The first four Legendre polynomials are \[\begin{array}{ccl} Q_{0}(t) & = & 1,\\ Q_{1}(t) & = & t,\\ Q_{2}(t) & = & t^{2}-\frac{1}{3},\\ Q_{3}(t) & = & t^{3}-\frac{3}{5}t. \end{array}\tag{11.17}\] For other \([a,b]\) the orthogonal polynomials can either be derived by Gram-Schmidt, or written in terms of the Legendre polynomials \(Q_{k}\) via \[P_{k}(t)=P_{k,[a,b]}(t)=\left(\frac{b-a}{2}\right)^{k}Q_{k}\left(\frac{2t-(a+b)}{b-a}\right),\tag{11.18}\] (one can check that these \(P_{k}\) are monic, and for any \(f\in\mathbb{P}_{k-1}\) it follows that \(\left\langle P_{k},f\right\rangle _{a,b}=0\) from the fact that \(\left\langle Q_{k},f\right\rangle _{-1,1}=0\)).
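The Gram-Schmidt computation is easy to automate; the sketch below (assuming sympy) reproduces (11.17):

```python
import sympy as sp

t = sp.symbols('t')

def monic_orthogonal(k, a=-1, b=1):
    """Gram-Schmidt applied to 1, t, ..., t^k with the inner product
    <f,g> = integral of f*g over [a,b]; the result is monic by construction."""
    inner = lambda f, g: sp.integrate(f * g, (t, a, b))
    P = []
    for m in range(k + 1):
        p = t**m
        for q in P:
            p -= inner(t**m, q) / inner(q, q) * q
        P.append(sp.expand(p))
    return P

print(monic_orthogonal(3))  # [1, t, t**2 - 1/3, t**3 - 3*t/5]
```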
For every \(k\ge1\) and non-empty interval \([a,b]\) the corresponding orthogonal polynomial \(P_{k}\) has exactly \(k\) distinct roots, which all lie in \([a,b].\)
Proof. Let \(x_{1},\ldots,x_{m}\) be the distinct roots of \(P_{k}\) in \([a,b]\) at which \(P_{k}\) changes sign. Define the polynomial \[q(t)=\prod_{i=1}^{m}(t-x_{i}).\] The function \(P_{k}(t)\) changes sign exactly \(m\) times in \([a,b],\) and the changes happen at the roots \(x=x_{i}.\) The same applies to \(q(t).\) Therefore \(P_{k}(t)q(t)\ge0\) for all \(t\in[a,b]\) or \(P_{k}(t)q(t)\le0\) for all \(t\in[a,b].\) Therefore \(\left\langle P_{k},q\right\rangle _{a,b}=\int_{a}^{b}P_{k}(t)q(t)dt\ne0.\) The polynomial \(q\) is of degree \(m.\) If \(m\le k-1\) then it has degree at most \(k-1,\) so by definition of \(P_{k}\) it follows that \(\left\langle P_{k},q\right\rangle _{a,b}=0.\) This is a contradiction, so it must hold that \(m\ge k.\)
Lastly, \(P_{k}\) is of degree \(k\) so by the fundamental theorem of algebra \(P_{k}\) can have at most \(k\) roots - thus \(m=k\) and \(x_{1},\ldots,x_{k}\) are all the roots of \(P_{k}.\)
We now show that a quadrature rule where the nodes are the roots of an orthogonal polynomial achieves the highest possible order.
Fix a non-empty interval \([a,b]\) and let \(k\ge1.\) If \(c_{1},\ldots,c_{k}\) are the roots of \(P_{k}\) and \(b_{1},\ldots,b_{k}\) the unique solution of ([eq:ch11_quad_order_quad_of_order_at_lesat_k]) then:
The corresponding quadrature is of order \(2k.\)
No other quadrature has higher order.
Proof. The roots \(c_{1},\ldots,c_{k}\) are distinct and lie in \([a,b]\) by the previous lemma, so by the lemma containing ([eq:ch11_quad_order_quad_of_order_at_lesat_k]) the weights \(b_{1},\ldots,b_{k}\) exist, are unique, and the resulting quadrature rule is exact for all \(f\in\mathbb{P}_{k-1}.\) To show that it is in fact of order \(2k,\) let \(f\in\mathbb{P}_{2k-1}\) and write \(f=qP_{k}+r\) using polynomial division, where \(q,r\in\mathbb{P}_{k-1}.\) Then \[\int_{a}^{b}f(t)dt=\left\langle q,P_{k}\right\rangle _{a,b}+\int_{a}^{b}r(t)dt=\int_{a}^{b}r(t)dt\] by the orthogonality ([eq:ch11_guass_quad_legendre_poly_def]), while \[\sum_{j=1}^{k}b_{j}f(c_{j})=\sum_{j=1}^{k}b_{j}q(c_{j})P_{k}(c_{j})+\sum_{j=1}^{k}b_{j}r(c_{j})=\sum_{j=1}^{k}b_{j}r(c_{j}),\] since the \(c_{j}\) are roots of \(P_{k}.\) As the rule is exact on \(\mathbb{P}_{k-1}\) the two right-hand sides agree, so the quadrature is exact for every \(f\in\mathbb{P}_{2k-1},\) i.e. of order at least \(2k.\)
It remains to show that no rule has higher order. Fix an arbitrary quadrature with parameters \(b_{i},c_{i}.\) Let \[f(t)=\prod_{i=1}^{k}(t-c_{i})^{2},\] which is a polynomial of degree at most \(2k.\) We have \[\sum_{i=1}^{k}b_{i}f(c_{i})=0,\] since \(c_{1},\ldots,c_{k}\) are roots of \(f(t),\) while \[\int_{a}^{b}f(t)dt>0,\] since \(f(t)\) is non-negative and not identically zero. Thus the quadrature is not exact for this \(f(t),\) so it is not of order \(2k+1.\)
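The order \(2k\) is easy to observe numerically. Using numpy's built-in Gauss-Legendre nodes and weights for \([-1,1],\) the rule integrates \(t^{m}\) exactly for \(m\le2k-1\) but not for \(m=2k\):

```python
import numpy as np

k = 3
c, b = np.polynomial.legendre.leggauss(k)  # k-point Gauss rule on [-1, 1]
for m in range(2 * k + 1):
    exact = (1 - (-1)**(m + 1)) / (m + 1)  # integral of t^m over [-1, 1]
    approx = np.sum(b * c**m)
    # Exact (up to rounding) for m = 0,...,2k-1; fails at m = 2k.
    print(m, abs(exact - approx) < 1e-12)
```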
11.2 Explicit Runge-Kutta (ERK) schemes
As we have done a few times before, let us take the task of approximating the integral on the r.h.s. of the identity \[y(t_{n+1})-y(t_{n})=\int_{t_{n}}^{t_{n+1}}F(t,y(t))dt\tag{11.25}\] as a starting point for devising a numerical time-stepping method (where of course \(y(t)\) is the solution of a generic ODE \(y'=F(t,y)\)). In particular, inspired by the quadrature rules of the previous section we fix \(k\ge1,\) distinct nodes \(c_{1},\ldots,c_{k}\) and weights \(b_{1},\ldots,b_{k}\) and try to approximate the r.h.s. by \[h\sum_{j=1}^{k}b_{j}F(t_{n}+c_{j}h,y(t_{n}+c_{j}h)).\tag{11.26}\] This does not immediately yield a numerical time-stepping method of the type that produces a next approximation \(y_{n+1}\) from the previous approximation \(y_{n},\) since in such a setting we do not have approximations for the \(y(t_{n}+c_{j}h)\) (except possibly for \(j=1,\) in the special case \(c_{1}=0\)). The idea of Runge-Kutta (RK) methods is to first compute intermediate approximations \[\xi_{j}\approx y(t_{n}+c_{j}h)\tag{11.27}\] and then use them in the final approximation \[y(t_{n+1})\approx y_{n+1}=y_{n}+h\sum_{j=1}^{k}b_{j}F(t_{n}+c_{j}h,\xi_{j}).\tag{11.28}\]
\(\text{ }\)Form of intermediate stages. Note that \(\xi_{k}\) is supposed to approximate \(y(t_{n}+c_{k}h),\) and similarly to ([eq:ch11_ERK_starting_point]) \[y(t_{n}+c_{k}h)-y(t_{n})=\int_{t_{n}}^{t_{n}+c_{k}h}F(t,y(t))dt.\tag{11.29}\] Applying the philosophy of ([eq:ch11_ERK_starting_point])-([eq:ch11_ERK_intrp_final_step]) to the r.h.s. would entail approximating it as a weighted sum of approximations of the solution at points corresponding to some nodes \(\tilde{c}_{1},\tilde{c}_{2},\ldots\) in the interval \([0,c_{k}].\) Since we are anyway aiming to generate the approximations ([eq:ch11_ERK_penultimate_step]) corresponding to the nodes \(c_{i}\) it seems natural to reuse these and to use \(c_{1},\ldots,c_{k-1}\) as nodes when approximating the r.h.s of ([eq:ch11_ERK_penultimate_step]). That is, approximate it as \[y(t_{n}+c_{k}h)-y(t_{n})=\int_{t_{n}}^{t_{n}+c_{k}h}F(t,y(t))dt\approx h\sum_{j=1}^{k-1}a_{k,j}F(t_{n}+c_{j}h,\xi_{j}).\] This suggests to compute \(\xi_{k}\) based on \(y_{n},\xi_{1},\ldots,\xi_{k-1}\) as \[\xi_{k}=y_{n}+h\sum_{j=1}^{k-1}a_{k,j}F(t_{n}+c_{j}h,\xi_{j}).\] Note that while sticking to the same nodes, we do introduce new weights \(a_{k,j}\) that are specific to computing \(\xi_{k}\) and can be optimized for overall accuracy. We can compute the approximation \(\xi_{k-1}\) from the previous \(y_{n},\xi_{1},\ldots,\xi_{k-2}\) in the same way using \[\xi_{k-1}=y_{n}+h\sum_{j=1}^{k-2}a_{k-1,j}F(t_{n}+c_{j}h,\xi_{j}),\] or for general \(1\le l\le k\) \[\xi_{l}=y_{n}+h\sum_{j=1}^{l-1}a_{l,j}F(t_{n}+c_{j}h,\xi_{j}),l=1,\ldots,k.\tag{11.30}\] If the first intermediate approximation is of \(y(t_{n}+ch)\) for \(c>0,\) then since there are no previous intermediate values to exploit the approximation must simply be based only on \(y_{n}.\) It is customary to set \(c_{1}=0\) so that \(\xi_{1}=y_{n}\) and \(\xi_{2}=y_{n}+ha_{2,1}F(t_{n},\xi_{1})\) is this first intermediate approximation. This is consistent with the case \(l=1\) of ([eq:ch11_ERK_intro_gen_xi_step]).
An explicit Runge-Kutta method is thus parameterized by an integer \(k\ge1\) and numbers \(c_{2},\ldots,c_{k},b_{1},\ldots,b_{k},a_{l,j},1\le j<l\le k.\) Given the parameters it takes the form \[\begin{array}{ccl} \xi_{1} & = & y_{n},\\ \xi_{l} & = & y_{n}+h\sum_{j=1}^{l-1}a_{l,j}F(t_{n}+c_{j}h,\xi_{j}),2\le l\le k,\\ y_{n+1} & = & y_{n}+h\sum_{l=1}^{k}b_{l}F(t_{n}+c_{l}h,\xi_{l}), \end{array}\tag{11.31}\] where \(c_{1}=0.\) The methods corresponding to a given \(k\) are called \(k\)-stage methods. One picks “good” values for the parameters \(c_{i},b_{i},a_{l,j}\) by Taylor expanding \(y_{n+1}\) in \(h,\) and matching to the Taylor expansion of the true solution.
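The general form (11.31) translates directly into code. A minimal sketch for a scalar ODE (assuming numpy; the helper name `erk_step` is ours):

```python
import numpy as np

def erk_step(A, b, c, F, t_n, y_n, h):
    """One step of the explicit Runge-Kutta method (11.31) for a scalar ODE,
    given the strictly lower triangular RK matrix A, weights b and nodes c."""
    k = len(b)
    stages = np.zeros(k)  # xi_1, ..., xi_k
    slopes = np.zeros(k)  # F evaluated at the stages
    for l in range(k):
        stages[l] = y_n + h * np.dot(A[l, :l], slopes[:l])
        slopes[l] = F(t_n + c[l] * h, stages[l])
    return y_n + h * np.dot(b, slopes)

# k = 1 with b_1 = 1 is Euler's method.
print(erk_step(np.zeros((1, 1)), np.array([1.0]), np.array([0.0]),
               lambda t, y: -y, 0.0, 1.0, 0.1))  # 0.9
```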
\(\text{ }\)One stage \(k=1.\) If \(k=1\) the only intermediate stage is \(\xi_{1}=y_{n}\) and the ERK reads \(y_{n+1}=y_{n}+hb_{1}F(t_{n},y_{n}).\) This is only consistent (i.e. of some order \(p\ge1\)) if \(b_{1}=1,\) in which case it is Euler’s method.
\(\text{ }\)Two stage \(k=2.\) For \(k=2\) stages the general method takes the form \[\begin{array}{ccl} \xi_{1} & = & y_{n},\\ \xi_{2} & = & y_{n}+ha_{2,1}F(t_{n},\xi_{1})\\ y_{n+1} & = & y_{n}+hb_{1}F(t_{n},\xi_{1})+hb_{2}F(t_{n}+c_{2}h,\xi_{2}). \end{array}\] To compute the local error of this method, consider its result when \(y_{n}\) is replaced by the true value \(y(t_{n})\): \[\begin{array}{ccl} \tilde{\xi}_{1} & = & y(t_{n})\\ \tilde{\xi}_{2} & = & y(t_{n})+ha_{2,1}F(t_{n},\tilde{\xi}_{1})\\ \tilde{y}_{n+1} & = & y(t_{n})+hb_{1}F(t_{n},\tilde{\xi}_{1})+hb_{2}F(t_{n}+c_{2}h,\tilde{\xi}_{2}). \end{array}\] Let us Taylor expand all the quantities in \(h.\) For the second line we obtain simply \[\tilde{\xi}_{2}=y(t_{n})+ha_{2,1}F(t_{n},y(t_{n})).\tag{11.32}\] To lighten the notation let us omit the arguments of \(F\) or its derivatives, if those arguments are \(t=t_{n}\) and \(y=y(t_{n}).\) Let us also write \(\Delta_{2}=\tilde{\xi}_{2}-y(t_{n}).\) Writing ([eq:ch11_ERK_k=00003D2_xi2]) in this notation yields \[\Delta_{2}=ha_{2,1}F.\tag{11.33}\] From ([eq:ch11_ERK_k=00003D2_delta2_exp]) we see that \(\Delta_{2}=O(h).\) Using the same conventions, the last row reads \[\tilde{y}_{n+1}-y(t_{n})=hb_{1}F+hb_{2}F(t_{n}+c_{2}h,y(t_{n})+\Delta_{2}).\tag{11.34}\] Only the final term needs to be expanded. Its expansion is \[F(t_{n}+c_{2}h,y(t_{n})+\Delta_{2})=F+c_{2}h\partial_{t}F+\Delta_{2}\partial_{y}F+O(h^{2}).\] Substituting in the r.h.s. of ([eq:ch11_ERK_k=00003D2_delta2_exp]) for \(\Delta_{2}\) this turns into \[F(t_{n}+c_{2}h,y(t_{n})+\Delta_{2})=F+c_{2}h\partial_{t}F+ha_{2,1}F\partial_{y}F+O(h^{2}).\tag{11.35}\] Finally plugging into ([eq:ch11_ERK_k=00003D2_final_line_exp_step]) yields \[\tilde{y}_{n+1}-y(t_{n})=hb_{1}F+hb_{2}\left(F+c_{2}h\partial_{t}F+ha_{2,1}F\partial_{y}F\right)+O(h^{3}),\tag{11.36}\] which can be rearranged to \[\tilde{y}_{n+1}-y(t_{n})=h(b_{1}+b_{2})F+h^{2}b_{2}\left(c_{2}\partial_{t}F+a_{2,1}F\partial_{y}F\right)+O(h^{3}).\tag{11.37}\] We can now compare this to the Taylor expansion of the solution \(y(t)\) around \(t=t_{n}.\) It reads \[y(t_{n+1})-y(t_{n})=hy'(t_{n})+\frac{h^{2}}{2}y''(t_{n})+O(h^{3}).\] We have \[y'(t_{n})=F(t_{n},y(t_{n}))=F,\] and \[y''(t_{n})=\frac{d}{dt}F(t,y(t))|_{t=t_{n}}=\partial_{t}F+y'(t_{n})\partial_{y}F=\partial_{t}F+F\partial_{y}F.\] Thus the expansion is \[y(t_{n+1})-y(t_{n})=hF+\frac{h^{2}}{2}(\partial_{t}F+F\partial_{y}F)+O(h^{3}).\tag{11.38}\] Comparing ([eq:ch11_ERK_k=00003D2_set_expansion_final]) and ([eq:ch11_ERK_y_exp_up_to_h^2]), these match up to order \(h^{2}\) provided \[b_{1}+b_{2}=1\quad\quad b_{2}c_{2}=\frac{1}{2}\quad\quad b_{2}a_{2,1}=\frac{1}{2}.\tag{11.39}\] If these conditions are satisfied, then \[\tilde{y}_{n+1}-y(t_{n+1})=O(h^{3}).\]
\(\text{ }\)Order of ERK. As for multistep methods, we say that an ERK method that produces the approximation \(\tilde{y}_{n+1}\) from \(y_{n}=y(t_{n})\) (the true solution at time \(t=t_{n}\)) is of order (at least) \(p\) if for any IVP with analytic \(F\) it holds that \[\tilde{y}_{n+1}-y(t_{n+1})=O(h^{p+1}).\] Methods with parameters satisfying ([eq:ch11_ERK_k=00003D2_order_conditions]) are thus of order (at least) \(p=2.\)
\(\text{ }\)RK tableaux. The conditions ([eq:ch11_ERK_k=00003D2_order_conditions]) are satisfied for instance if \(b_{1}=0,b_{2}=1,a_{2,1}=c_{2}=\frac{1}{2}.\) But also if \(c_{2}=a_{2,1}=\frac{2}{3},b_{1}=\frac{1}{4},b_{2}=\frac{3}{4},\) or if \(b_{1}=b_{2}=\frac{1}{2},a_{2,1}=c_{2}=1.\) The parameters of a RK method are often recorded in a table called a tableau, which for \(k=2\)-stage explicit methods takes the form \[\begin{array}{c|cc} 0\\ c_{2} & a_{2,1}\\ \hline & b_{1} & b_{2} \end{array},\] so that the three methods mentioned above correspond to the tableaux \[\begin{array}{c|cc} 0\\ \frac{1}{2} & \frac{1}{2}\\ \hline & 0 & 1 \end{array},\quad\quad\text{and}\quad\quad\begin{array}{c|cc} 0\\ \frac{2}{3} & \frac{2}{3}\\ \hline & \frac{1}{4} & \frac{3}{4} \end{array},\quad\quad\text{and}\quad\quad\begin{array}{c|cc} 0\\ 1 & 1\\ \hline & \frac{1}{2} & \frac{1}{2} \end{array}\tag{11.40}\] respectively. The general \(k=3\)-stage explicit RK method takes the form \[\begin{array}{c|ccc} 0\\ c_{2} & a_{2,1}\\ c_{3} & a_{3,1} & a_{3,2}\\ \hline & b_{1} & b_{2} & b_{3} \end{array},\tag{11.41}\] and for any number of stages \(k\ge1\) the tableau takes the form \[\begin{array}{c|c} c & A\\ \hline & b \end{array},\] where we view \(c=(c_{1},\ldots,c_{k})\) as a column vector, \(b=(b_{1},\ldots,b_{k})\) as a row vector and the Runge-Kutta (RK) matrix \(A\) is the \(k\times k\) matrix \(A_{i,j}=a_{i,j}.\) An explicit RK method has \(c_{1}=0\) and \(A_{l,i}=0\) unless \(i<l,\) i.e. its RK matrix \(A\) is strictly lower triangular.
\(\text{ }\)Order conditions. The conditions ([eq:ch11_ERK_k=00003D2_order_conditions]) obtained by matching coefficients in the Taylor expansion to achieve a certain order are fittingly called order conditions. Since ([eq:ch11_ERK_k=00003D2_order_conditions]) ensure order \(p=2,\) they are called the second-order conditions. Deriving the order conditions by brute force quickly becomes very tedious. Already the computation of the third-order conditions corresponding to \(p=3\) is too involved to present here. Instead, let us consider a simplified setting where the ODE is autonomous, i.e. \(F(t,y)\) is a function only of \(y.\) For such an \(F\) all the partial derivatives with respect to \(t\) vanish, greatly reducing the number of terms in the expansion. As it happens, the order conditions one obtains are the same as for a general \(F.\) We won’t prove this fact, which also does not extend to \(p=4,5,\ldots\)
\(\text{ }\)Third-order conditions. Consider the general explicit method with \(k=3\) stages specified by the tableau ([eq:ch11_ERK_intro_general_k=00003D3_stage_tableau]). Applied with the exact value \(y(t_{n})\) in place of \(y_{n}\) when \(F\) is autonomous it yields \[\begin{array}{lcl} \tilde{\xi}_{1} & = & y(t_{n}),\\ \tilde{\xi}_{2} & = & y(t_{n})+ha_{2,1}F(\tilde{\xi}_{1}),\\ \tilde{\xi}_{3} & = & y(t_{n})+ha_{3,1}F(\tilde{\xi}_{1})+ha_{3,2}F(\tilde{\xi}_{2}),\\ \tilde{y}_{n+1} & = & y(t_{n})+hb_{1}F(\tilde{\xi}_{1})+hb_{2}F(\tilde{\xi}_{2})+hb_{3}F(\tilde{\xi}_{3}). \end{array}\tag{11.42}\] We extend the expansion of the solution ([eq:ch11_ERK_y_exp_up_to_h^2]) to third order by noting that \[y'''(t_{n})=\frac{d^{2}}{dt^{2}}F(y(t))|_{t=t_{n}}=\frac{d}{dt}F'(y(t))y'(t)|_{t=t_{n}}=\frac{d}{dt}F'(y(t))F(y(t))|_{t=t_{n}}=F''F^{2}+(F')^{2}F,\] so \[y(t_{n+1})=y(t_{n})+hF+\frac{h^{2}}{2}FF'+\frac{h^{3}}{3!}\left(F''F^{2}+(F')^{2}F\right)+\ldots.\tag{11.43}\] Writing \(\Delta_{i}=\tilde{\xi}_{i}-y(t_{n})\) and using the convention that \(F(y(t_{n}))\) is written simply as \(F,\) this turns into \[\begin{array}{lcl} \Delta_{2} & = & ha_{2,1}F,\\ \Delta_{3} & = & ha_{3,1}F+ha_{3,2}F(y(t_{n})+\Delta_{2}),\\ \tilde{y}_{n+1} & = & y(t_{n})+hb_{1}F+hb_{2}F(y(t_{n})+\Delta_{2})+hb_{3}F(y(t_{n})+\Delta_{3}). \end{array}\] The rest of the computation is an exercise on sheet 11. The result is the third-order conditions for \(k=3\) stage methods \[b_{1}+b_{2}+b_{3}=1,\quad b_{2}c_{2}+b_{3}c_{3}=\frac{1}{2},\quad b_{2}c_{2}^{2}+b_{3}c_{3}^{2}=\frac{1}{3},\quad b_{3}a_{3,2}c_{2}=\frac{1}{6}.\tag{11.44}\] Common \(k=3\)-stage methods (that satisfy ([eq:ch11_ERK_p=00003D3,k=00003D3_order_conditions])) include \[\begin{array}{c|ccc} 0\\ \frac{1}{2} & \frac{1}{2}\\ 1 & -1 & 2\\ \hline & \frac{1}{6} & \frac{2}{3} & \frac{1}{6} \end{array}\quad\quad\text{and}\quad\quad\begin{array}{c|ccc} 0\\ \frac{2}{3} & \frac{2}{3}\\ \frac{2}{3} & 0 & \frac{2}{3}\\ \hline & \frac{1}{4} & \frac{3}{8} & \frac{3}{8} \end{array}.\tag{11.45}\] \(\text{ }\)Four-stage method. A common \(k=4\)-stage method is \[\begin{array}{c|cccc} 0\\ \frac{1}{2} & \frac{1}{2}\\ \frac{1}{2} & 0 & \frac{1}{2}\\ 1 & 0 & 0 & 1\\ \hline & \frac{1}{6} & \frac{1}{3} & \frac{1}{3} & \frac{1}{6} \end{array},\tag{11.46}\] which is of order \(p=4.\)
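As an illustration, the classical method (11.46) can be run with the `erk_step` sketch from above; on \(y'=-y\) the error after ten steps is of size \(O(h^{4})\):

```python
import numpy as np  # erk_step as defined in the sketch above

# The classical fourth-order tableau (11.46), fed to erk_step.
A = np.array([[0.0, 0.0, 0.0, 0.0],
              [0.5, 0.0, 0.0, 0.0],
              [0.0, 0.5, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])
b = np.array([1/6, 1/3, 1/3, 1/6])
c = np.array([0.0, 0.5, 0.5, 1.0])

F = lambda t, y: -y  # exact solution exp(-t)
t, y, h = 0.0, 1.0, 0.1
for n in range(10):
    y = erk_step(A, b, c, F, t, y, h)
    t += h
print(y - np.exp(-1.0))  # error of order h^4
```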
\(\text{ }\)Order barriers. The \(k=2\)-stage methods ([eq:ch11_ERK_common_two_stage_methods]) are of order \(p=2,\) the \(k=3\)-stage methods ([eq:ch11_ERK_common_three_stage]) are of order \(p=3,\) and the \(k=4\)-stage method ([eq:ch11_ERK_common_four_stage]) is of order \(p=4.\) But the pattern does not continue: there is no explicit Runge-Kutta method with \(k=5\) stages which is of order \(p=5\)! Instead at least \(k=6\) stages are required to obtain a method of order \(p=5.\) The following table shows the highest achievable order for ERK methods with \(k=1,2,\ldots,9\) stages.
| # stages \(k\) | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|
| Highest order | 1 | 2 | 3 | 4 | 4 | 5 | 6 | 6 | 7 |
From p. 154 in Section II.2 of “Solving Ordinary Differential Equations I” by Hairer, Nørsett and Wanner.
| Order \(p\) to verify | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| # order conditions | 1 | 2 | 4 | 8 | 17 | 37 | 85 | 200 | 486 | 1205 |
Studying the order of general RK methods is challenging. There exists a combinatorial theory based on a correspondence between RK methods and trees (in the graph theoretic sense), which was used to compute the information in the tables. It is beyond the scope of this course. The last section of this chapter studies a special subclass of implicit Runge-Kutta methods whose order can be analyzed using the quadrature theory of the previous section.
11.3 Implicit Runge-Kutta (IRK) methods
An implicit Runge-Kutta (IRK) method with \(k\) stages takes the form \[\begin{array}{ccl} \xi_{j} & = & y_{n}+h\sum_{i=1}^{k}a_{j,i}F(t_{n}+c_{i}h,\xi_{i}),\quad j=1,\ldots,k,\\ y_{n+1} & = & y_{n}+h\sum_{i=1}^{k}b_{i}F(t_{n}+c_{i}h,\xi_{i}), \end{array}\tag{11.47}\] which generalizes ([eq:ch11_ERK_general_form]) by allowing the approximation \(\xi_{j}\) of each stage to depend on all other - also later - stages. Correspondingly the RK matrix \(A\) is arbitrary. We no longer require \(c_{1}=0.\)
An implementation of an IRK method must approximate a solution to the system of \(k\) non-linear equations in the top line of ([eq:ch11_IRK_general_form]), which are in addition \(\mathbb{R}^{d}\)-valued for an \(\mathbb{R}^{d}\)-valued ODE. This can be rather costly in terms of computation. But IRK methods can also be more stable, and achieve higher order for the same number of stages than an ERK, so sometimes the trade-off is worth it.
The Backward Euler method is an IRK with \(k=1\) stages and tableau \[\begin{array}{c|c} 1 & 1\\ \hline & 1 \end{array}.\]
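A minimal sketch of an IRK step for a scalar ODE (assuming numpy; the stage equations are solved by fixed-point iteration, which works for small \(h\) - for stiff problems one would use a Newton iteration instead; the helper name `irk_step` is ours):

```python
import numpy as np

def irk_step(A, b, c, F, t_n, y_n, h, n_iter=100):
    """One step of the implicit RK method (11.47) for a scalar ODE,
    solving the k coupled stage equations by fixed-point iteration."""
    k = len(b)
    xi = np.full(k, y_n, dtype=float)  # initial guess: all stages equal y_n
    for _ in range(n_iter):
        slopes = np.array([F(t_n + c[i] * h, xi[i]) for i in range(k)])
        xi = y_n + h * (A @ slopes)
    slopes = np.array([F(t_n + c[i] * h, xi[i]) for i in range(k)])
    return y_n + h * np.dot(b, slopes)

# Backward Euler, i.e. the one-stage tableau above, applied to y' = -y:
print(irk_step(np.array([[1.0]]), np.array([1.0]), np.array([1.0]),
               lambda t, y: -y, 0.0, 1.0, 0.1))  # 1/(1 + h) = 0.909...
```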
\(\text{ }\)Order of IRK methods. Let \(\tilde{y}_{n+1}\) denote the value of the last line of ([eq:ch11_IRK_general_form]) after replacing \(y_{n}\) with the true value \(y(t_{n})\) in the equations ([eq:ch11_IRK_general_form]), and assume that for each \(h>0\) small enough the intermediate values \(\xi_{1},\ldots,\xi_{k}\) are an exact solution of the (modified) top line of ([eq:ch11_IRK_general_form]). If the bound \[\tilde{y}_{n+1}-y(t_{n+1})=O(h^{p+1})\] holds for all analytic \(F\) we say that the IRK method is of order (at least) \(p\).
Consider the two stage IRK method whose tableau is \[\begin{array}{c|cc} 0 & \frac{1}{4} & -\frac{1}{4}\\ \frac{2}{3} & \frac{1}{4} & \frac{5}{12}\\ \hline & \frac{1}{4} & \frac{3}{4} \end{array}.\tag{11.48}\] Written out as equations it corresponds to \[\begin{array}{lcl} \xi_{1} & = & y_{n}+h\frac{1}{4}F(t_{n},\xi_{1})-h\frac{1}{4}F(t_{n}+\frac{2}{3}h,\xi_{2}),\\ \xi_{2} & = & y_{n}+h\frac{1}{4}F(t_{n},\xi_{1})+h\frac{5}{12}F(t_{n}+\frac{2}{3}h,\xi_{2}),\\ y_{n+1} & = & y_{n}+h\frac{1}{4}F(t_{n},\xi_{1})+h\frac{3}{4}F(t_{n}+\frac{2}{3}h,\xi_{2}). \end{array}\] Let us verify that it has order at least \(p=3,\) under the simplifying assumption that \(F\) is autonomous. We replace \(y_{n}\) with the exact value \(y(t_{n}),\) yielding \[\begin{array}{lll} \tilde{\xi}_{1} & = & y(t_{n})+h\frac{1}{4}F(\tilde{\xi}_{1})-h\frac{1}{4}F(\tilde{\xi}_{2}),\\ \tilde{\xi}_{2} & = & y(t_{n})+h\frac{1}{4}F(\tilde{\xi}_{1})+h\frac{5}{12}F(\tilde{\xi}_{2}),\\ \tilde{y}_{n+1} & = & y(t_{n})+h\frac{1}{4}F(\tilde{\xi}_{1})+h\frac{3}{4}F(\tilde{\xi}_{2}). \end{array}\] We now expand \(\tilde{y}_{n+1}\) in \(h\) up to order \(h^{3},\) and compare to the expansion of \(y(t_{n+1})\) up to order \(h^{3}\) for autonomous \(F,\) which was already carried out in ([eq:ch11_ERK_y_aut_exp_to_order_3]). Writing \(y=y(t_{n})\) and \(\Delta_{i}=\tilde{\xi}_{i}-y\) the equations turn into \[\begin{array}{lll} \Delta_{1} & = & \frac{h}{4}F(y+\Delta_{1})-h\frac{1}{4}F(y+\Delta_{2}),\\ \Delta_{2} & = & \frac{h}{4}F(y+\Delta_{1})+h\frac{5}{12}F(y+\Delta_{2}),\\ \tilde{y}_{n+1}-y & = & \frac{h}{4}F(y+\Delta_{1})+h\frac{3}{4}F(y+\Delta_{2}). \end{array}\] The top lines imply that \(\Delta_{1},\Delta_{2}=O(h).\) Using the expansions \[F(y+\Delta_{i})=F+\Delta_{i}F'+\frac{\Delta_{i}^{2}}{2}F''+O(h^{3})\] the bottom line turns into \[\tilde{y}_{n+1}-y=hF+h\left(\frac{1}{4}\Delta_{1}+\frac{3}{4}\Delta_{2}\right)F'+h\left(\frac{1}{8}\Delta_{1}^{2}+\frac{3}{8}\Delta_{2}^{2}\right)F''+O(h^{4}).\tag{11.49}\] We see that we need the expansion of \(\Delta_{1}\) and \(\Delta_{2}\) up to order \(h^{2}\) to be able to determine all terms up to order \(h^{3}\) in the bottom row. Using the expansions \[F(y+\Delta_{i})=F+\Delta_{i}F'+O(h^{2})\] in the top two lines we obtain \[\begin{array}{lll} \Delta_{1} & = & \frac{h}{4}(\Delta_{1}-\Delta_{2})F'+O(h^{3}),\\ \Delta_{2} & = & h\frac{2}{3}F+h\left(\frac{1}{4}\Delta_{1}+\frac{5}{12}\Delta_{2}\right)F'+O(h^{3}). \end{array}\] The top line implies that \(\Delta_{1}=O(h^{2}),\) which allows us to eliminate it from the r.h.s.s. of the top two lines, yielding \[\begin{array}{lll} \Delta_{1} & = & -\frac{h}{4}\Delta_{2}F'+O(h^{3}),\\ \Delta_{2} & = & h\frac{2}{3}F+h\frac{5}{12}\Delta_{2}F'+O(h^{3}). \end{array}\] Substituting \(\Delta_{2}=h\frac{2}{3}F+O(h^{2})\) to eliminate all \(\Delta_{2}\) on the r.h.s.s. yields \[\begin{array}{lll} \Delta_{1} & = & -\frac{h^{2}}{6}FF'+O(h^{3}),\\ \Delta_{2} & = & h\frac{2}{3}F+h^{2}\frac{5}{18}FF'+O(h^{3}). 
\end{array}\] Plugging these into the bottom line ([eq:ch11_IRK_example_bottom_line_after_sub]) gives \[\tilde{y}_{n+1}-y=hF+h\left(-\frac{h^{2}}{6}FF'\frac{1}{4}+\frac{3}{4}\left(h\frac{2}{3}F+h^{2}\frac{5}{18}FF'\right)\right)F'+h\frac{3}{8}\left(h\frac{2}{3}F\right)^{2}F''+O(h^{4}).\] Collecting terms we get \[\tilde{y}_{n+1}-y=hF+\frac{h^{2}}{2}FF'+h^{3}K+O(h^{4}),\] for \[K=-\frac{1}{6}\cdot\frac{1}{4}F(F')^{2}+\frac{3}{4}\cdot\frac{5}{18}F(F')^{2}+\frac{3}{8}\left(\frac{2}{3}\right)^{2}F^{2}F''.\] The last expression simplifies to \(K=\frac{F(F')^{2}+F^{2}F''}{6},\) so \[\tilde{y}_{n+1}-y=hF+\frac{h^{2}}{2}FF'+h^{3}\frac{F(F')^{2}+F^{2}F''}{6}+O(h^{4}).\] This matches ([eq:ch11_ERK_y_aut_exp_to_order_3]) up to order \(h^{3},\) so we have shown that \[\tilde{y}_{n+1}-y(t_{n+1})=O(h^{4})\] for every analytic autonomous \(F.\) The method is thus of order \(p=3\) “for autonomous \(F\)” - in reality it is of order \(p=3\) in the proper sense, though we refrain from proving it.
11.4 Collocation methods
Collocation methods are a subclass of IRK’s that have a more concrete quantitative connection to quadrature, and whose order can be analyzed using the quadrature results of Section 11.1.
Each \(k\ge1\) and choice of distinct \(c_{1},\ldots,c_{k}\) gives rise to a unique collocation method, i.e. uniquely specifies the RK matrix \(A\) and the RK weights \(b_{1},\ldots,b_{k}\) of a \(k\)-stage IRK. The method can be defined in two equivalent ways: either in terms of polynomial approximation, or as the IRK with certain parameters.
\(\text{ }\)Polynomial interpolation definition. The polynomial approximation description starts with \(k\ge1\) and distinct \(c_{1},\ldots,c_{k}.\) It then defines the next approximation \(y_{n+1}\) of \(y(t_{n+1})\) from an already computed approximation \(y_{n}\) of \(y(t_{n})\) in terms of a polynomial \(u\) of degree at most \(k\) such that \[\begin{array}{l} u(t_{n})=y_{n},\\ u'(t_{n}+c_{l}h)=F(t_{n}+c_{l}h,u(t_{n}+c_{l}h)),\quad1\le l\le k. \end{array}\tag{11.51}\] Note that the second line of ([eq:ch11_colloc_def]) requires that the polynomial \(u\) satisfies the ODE condition \(u'(t)=F(t,u(t))\) not at all \(t\in[t_{n},t_{n+1}],\) but rather only for the \(k\) collocation points \(t=t_{n}+c_{l}h.\) The collocation method then sets \[y_{n+1}=u(t_{n+1}).\tag{11.52}\] From its degree \(k\) we see that the number of degrees of freedom in picking \(u\) is \(k+1.\) On the other hand ([eq:ch11_colloc_def]) represents \(k+1\) constraints on \(u.\) Since the number of degrees of freedom and number of constraints match, one may expect that there is a unique \(u\) that satisfies ([eq:ch11_colloc_def]) (this heuristic argument is far from a proof however, since the constraints of the bottom row of ([eq:ch11_colloc_def]) are non-linear equations when \(F\) is non-linear).
\(\text{ }\)Definition as IRK. For any \(k\ge1,c_{1},\ldots,c_{k}\in\mathbb{R}\) consider the polynomials \[p_{i}(t)=\prod_{j\ne i}\frac{t-c_{j}}{c_{i}-c_{j}},\quad1\le i\le k,\tag{11.53}\] (which are the basis polynomials for Lagrange interpolating polynomials with nodes given by the \(c_{i}\)). The collocation method corresponding to \(c_{1},\ldots,c_{k}\) turns out to be equivalent to the IRK \[\begin{array}{l} \xi_{l}=y_{n}+h\sum_{i=1}^{k}a_{l,i}F(t_{n}+c_{i}h,\xi_{i}),\quad l=1,\ldots,k,\\ y_{n+1}=y_{n}+h\sum_{i=1}^{k}b_{i}F(t_{n}+c_{i}h,\xi_{i}), \end{array}\tag{11.54}\] for \[a_{l,i}=\int_{0}^{c_{l}}p_{i}(t)dt\quad\quad\text{and}\quad\quad b_{i}=\int_{0}^{1}p_{i}(t)dt\quad\quad\text{for}\quad\quad1\le i,l\le k,\tag{11.55}\] as proved by the following lemma.
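The integrals (11.55) are easy to evaluate symbolically. The sketch below (assuming sympy; the helper name is ours) computes \(A\) and \(b\) from the nodes; for the single node \(c_{1}=\frac{1}{2}\) it yields \(a_{1,1}=\frac{1}{2},b_{1}=1,\) the one-stage Gauss-Legendre method derived at the end of this chapter.

```python
import sympy as sp

def collocation_tableau(c):
    """RK matrix A and weights b of the collocation method with nodes c,
    obtained by integrating the Lagrange basis polynomials as in (11.55)."""
    t = sp.symbols('t')
    k = len(c)
    p = []
    for i in range(k):
        poly = sp.Integer(1)
        for j in range(k):
            if j != i:
                poly *= (t - c[j]) / (c[i] - c[j])
        p.append(poly)
    A = [[sp.simplify(sp.integrate(p[i], (t, 0, c[l]))) for i in range(k)]
         for l in range(k)]
    b = [sp.simplify(sp.integrate(p[i], (t, 0, 1))) for i in range(k)]
    return A, b

print(collocation_tableau([sp.Rational(1, 2)]))  # ([[1/2]], [1])
```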
Let \(k\ge1\) and fix distinct \(c_{1},\ldots,c_{k}\in\mathbb{R}.\) Let \(a_{l,i},b_{i}\) be as in ([eq:ch11_colloc_IRK_def_a_b]). Let \(n\ge0\) and \(y_{n}\in\mathbb{R}^{d}.\)
a) If \(\xi_{1},\ldots,\xi_{k}\in\mathbb{R}^{d}\) satisfy the first line of ([eq:ch11_colloc_IRK_def]) (i.e. if \(\xi_{l}\) are the result of the intermediate stages of the IRK corresponding ([eq:ch11_colloc_IRK_def_a_b]) going from \(t_{n}\) to \(t_{n+1}\)), then there exists a polynomial \(u(t)\) of degree at most \(k\) that satisfies ([eq:ch11_colloc_def]) and \[u(t_{n+1})=y_{n}+h\sum_{i=1}^{k}b_{i}F(t_{n}+c_{i}h,\xi_{i}),\tag{11.56}\] (i.e. the collocation method corresponding to ([eq:ch11_colloc_def])-([eq:ch11_colloc_def_y_n+1]) going from \(t_{n}\) to \(t_{n+1}\) produces the same \(y_{n+1}\) as the IRK ([eq:ch11_colloc_IRK_def])).
b) Conversely, if \(u(t)\) is a polynomial of degree at most \(k\) that satisfies ([eq:ch11_colloc_def]) then there exist \(\xi_{1},\ldots,\xi_{k}\in\mathbb{R}^{d}\) that satisfy the first line of ([eq:ch11_colloc_IRK_def]) and ([eq:ch11_colloc_IRK_equiv_lemma_y_n+1_same]) (i.e. if the collocation method ([eq:ch11_colloc_def])-([eq:ch11_colloc_def_y_n+1]) yields the approximation \(u(t_{n+1})\) of \(y(t_{n+1})\) when going from \(t_{n}\) to \(t_{n+1},\) then the IRK ([eq:ch11_colloc_IRK_def]) also approximates \(y(t_{n+1})\) by \(u(t_{n+1})\)).
Proof. Note that the \(a_{l,i},b_{i}\) of ([eq:ch11_colloc_IRK_def_a_b]) can equivalently be defined via \[ha_{l,i}=\int_{t_{n}}^{t_{n}+c_{l}h}p_{i}\left(\frac{t-t_{n}}{h}\right)dt\quad\quad\text{and}\quad\quad hb_{i}=\int_{t_{n}}^{t_{n+1}}p_{i}\left(\frac{t-t_{n}}{h}\right)dt,\tag{11.57}\] (the equivalence follows from the change of variables \(t\to\frac{t-t_{n}}{h}\)).
a) Using the characterization ([eq:ch11_colloc_equiv_lemma_ha_hb]) of \(a_{l,i}\) it follows from ([eq:ch11_colloc_IRK_def]) that \[\xi_{l}=y_{n}+\int_{t_{n}}^{t_{n}+c_{l}h}\sum_{i=1}^{k}p_{i}\left(\frac{t-t_{n}}{h}\right)F(t_{n}+c_{i}h,\xi_{i})dt,\quad l=1,\ldots,k.\] In other words \[r(t)=\sum_{i=1}^{k}p_{i}\left(\frac{t-t_{n}}{h}\right)F(t_{n}+c_{i}h,\xi_{i}),\tag{11.58}\] satisfies \[\xi_{l}=y_{n}+\int_{t_{n}}^{t_{n}+c_{l}h}r(t)dt,\quad l=1,\ldots,k.\tag{11.59}\] The function \(r(t)\) is a polynomial of degree at most \(k-1,\) and is the Lagrange interpolating polynomial satisfying \[r(t_{n}+c_{l}h)=F(t_{n}+c_{l}h,\xi_{l}),\quad l=1,\ldots,k.\tag{11.60}\] The integral \[u(t):=y_{n}+\int_{t_{n}}^{t}r(s)ds\tag{11.61}\] is a polynomial of degree at most \(k,\) trivially satisfies the first line of ([eq:ch11_colloc_def]), and also satisfies \[u(t_{n}+c_{l}h)=y_{n}+\int_{t_{n}}^{t_{n}+c_{l}h}r(s)ds=\xi_{l},\tag{11.62}\] (by ([eq:ch11_colloq_equiv_lem_xi_l_in_terms_of_r(t)]) to ([eq:ch11_colloq_equiv_lem_u_def_from_r])) and therefore \[u'(t_{n}+c_{l}h)=r(t_{n}+c_{l}h)=F(t_{n}+c_{l}h,\xi_{l})=F(t_{n}+c_{l}h,u(t_{n}+c_{l}h)),\tag{11.63}\] (from ([eq:ch11_colloq_equiv_lem_r_interpolating_vals])-([eq:ch11_colloq_equiv_lem_u_prime_from_F])). Thus \(u\) is a polynomial of degree at most \(k\) that satisfies ([eq:ch11_colloc_def]). Finally we have \[\begin{array}{ccccl} u(t_{n+1}) & = & y_{n}+\int_{t_{n}}^{t_{n+1}}r(t)dt & = & y_{n}+\sum_{i=1}^{k}\left(\int_{t_{n}}^{t_{n+1}}p_{i}(\frac{t-t_{n}}{h})dt\right)F(t_{n}+c_{i}h,\xi_{i})\\ & & & = & y_{n}+h\sum_{i=1}^{k}b_{i}F(t_{n}+c_{i}h,\xi_{i}) \end{array}\] (from ([eq:ch11_colloq_equiv_lem_u_def_from_r]), ([eq:ch11_colloq_equiv_lem_r(t)_def]) and ([eq:ch11_colloc_equiv_lemma_ha_hb])). Thus \(u\) also satisfies ([eq:ch11_colloc_IRK_equiv_lemma_y_n+1_same]).
b) Given a polynomial \(u\) of degree at most \(k\) that satisfies ([eq:ch11_colloc_def]), define \[\xi_{l}:=u(t_{n}+c_{l}h),\quad l=1,\ldots,k.\tag{11.64}\] The derivative \(u'(t)\) is a polynomial of degree at most \(k-1\) such that \[u'(t_{n}+c_{i}h)=F(t_{n}+c_{i}h,\xi_{i})\quad,i=1,\ldots,k,\] (from the second line of ([eq:ch11_colloc_def]) and ([eq:ch11_colloq_equiv_lem_xi_from_u])). But there is only one such polynomial (namely the Lagrange interpolating polynomial) so in fact \[u'(t)=\sum_{i=1}^{k}p_{i}\left(\frac{t-t_{n}}{h}\right)F(t_{n}+c_{i}h,\xi_{i}).\] This implies that \[u(t)-u(t_{n})=\sum_{i=1}^{k}\left(\int_{t_{n}}^{t}p_{i}\left(\frac{s-t_{n}}{h}\right)ds\right)F(t_{n}+c_{i}h,\xi_{i})\tag{11.65}\] for all \(t.\) The case \(t=t_{n}+c_{l}h\) of ([eq:ch11_colloq_equiv_lem_u_from_integral]) yields \[\xi_{l}-y_{n}=h\sum_{i=1}^{k}a_{l,i}F(t_{n}+c_{i}h,\xi_{i}),\quad l=1,\ldots,k,\] (using ([eq:ch11_colloq_equiv_lem_xi_from_u]), the first line of ([eq:ch11_colloc_def]) and ([eq:ch11_colloc_equiv_lemma_ha_hb])), which is a restatement of the first line of ([eq:ch11_colloc_IRK_def]). Similarly the case \(t=t_{n+1}\) of ([eq:ch11_colloq_equiv_lem_u_from_integral]) yields ([eq:ch11_colloc_IRK_equiv_lemma_y_n+1_same]).
We emphasize that only some IRK methods are also collocation methods - most IRK methods have no collocation interpretation. The only advantage of IRK methods that are also collocation methods is that it is easier to rigorously prove results about their order, which is the goal of this section.
\(\text{ }\)Order of collocation methods. To do so we will use the following remarkable formula concerning functions \(v(t)\) that are “almost” solutions of an IVP. One might say that the IVP \(y'(t)=F(t,y(t)),t\ge t_{0},y(t_{0})=y_{0},\) is “almost” solved by any \(v(t)\) such that \(v'(t)\approx F(t,v(t)),t\ge t_{0},v(t_{0})=y_{0},\) and one might hope that such a \(v(t)\) is a good approximation of \(y(t).\) The identity ([eq:ch11_colocation_alekseev_gr=0000F6bner]) expresses the error \(y(t)-v(t)\) in the latter approximation in terms of the error \(v'(t)-F(t,v(t))\) in the former.
Let \(d\ge1\) and let \(y'(t)=F(t,y(t)),t\ge t_{0},y(t_{0})=y_{0}\) be an \(\mathbb{R}^{d}\)-valued IVP for a smooth \(F\) with smooth solution \(y(t).\) Then for any smooth \(v(t)\) with \(v(t_{0})=y_{0},\) there exists a \(\mathbb{R}^{d\times d}\)-valued smooth function \(\Phi(t,y)\) such that \[v(t)-y(t)=\int_{t_{0}}^{t}\Phi(t-s,v(t-s))\left(v'(s)-F(s,v(s))\right)ds,\quad\quad t\ge t_{0}.\tag{11.67}\]
Let \(I\subset\mathbb{R}\) be an interval and let \(a,b:I\to\mathbb{R}\) be continuous and Lebesgue integrable on \(I.\) A function \(y:I\to\mathbb{R}\) is a solution to the IVP \[\dot{y}(t)=a(t)y(t)+b(t),\quad\quad t\in I,\quad\quad y(t_{0})=y_{0},\] iff \[y(t)=y_{0}e^{\int_{t_{0}}^{t}a(s)ds}+\int_{t_{0}}^{t}e^{\int_{r}^{t}a(s)ds}b(r)dr,\quad\quad t\in I.\]
The formula ([eq:ch11_colocation_alekseev_gr=0000F6bner]) can be interpreted in a similar way: the error in the approximation \(v'(s)\approx F(s,v(s))\) at time \(s\) is \(v'(s)-F(s,v(s)),\) and this error produces an “infinitesimal additive increment” to the error \(v-y\) of about \((v'(s)-F(s,v(s)))\varepsilon\) in the interval \([s,s+\varepsilon],\varepsilon\downarrow0,\) which is “projected forward” by the factor \(\Phi(t-s,v(t-s))\) so that the resulting contribution at time \(t>s\) is essentially \(\Phi(t-s,v(t-s))(v'(s)-F(s,v(s)))\varepsilon.\)
We omit the full proof of ([eq:ch11_colocation_alekseev_gr=0000F6bner]), and instead only make it plausible by verifying it in the special case of \(F(t,y)\) autonomous and linear in \(y.\) For such \(F\) the identity, with an explicit \(\Phi,\) can be derived by direct computation.
a) Let \(d\ge1\) and let \(A\) be a \(d\times d\) real- or complex-valued matrix, \(I\subset\mathbb{R}\) a non-empty interval and \(t\in I\) and \(x_{0}\in\mathbb{R}^{d}\) resp. \(\mathbb{C}^{d}.\) The unique solution of the constrained ODE \[\dot{x}(t)=Ax(t),\quad\quad t\in I,\quad\quad x(t_{0})=x_{0},\tag{7.38}\] is \[x(t)=\exp\left((t-t_{0})A\right)x_{0}.\] b) If furthermore \(A\) is invertible and \(b\in\mathbb{R}^{d}\) resp. \(\mathbb{C}^{d}\) then the unique solution of the constrained ODE \[\dot{y}(t)=Ay(t)+b,\quad\quad t\in I,\quad\quad y(t_{0})=y_{0},\tag{7.39}\] is \[y(t)=\exp\left((t-t_{0})A\right)(y_{0}+A^{-1}b)-A^{-1}b.\]
Let \(\mathbb{F}=\mathbb{R}\) or \(\mathbb{C}.\) Let \(d\ge1,\) and let \(I=[t_{0},t_{1}]\subset\mathbb{R}\) be a non-empty closed interval, and \[A:I\to\mathbb{F}^{d\times d}\quad\quad\quad\text{ and }\quad\quad\quad b:I\to\mathbb{F}^{d}\] be continuous functions. Then for any \(t_{*}\in I\) and \(y_{*}\in\mathbb{F}^{d}\) there exists a solution to \[\dot{y}(t)=A(t)y(t)+b(t),\quad\quad t\in I,\quad\quad y(t_{*})=y_{*}.\]
Suppose then that \(F(t,y)=Ay\) for a \(d\times d\) matrix \(A,\) so that \(y\) solves \(\dot{y}(t)=Ay(t),y(t_{0})=y_{0},\) and thus \[y(t)=e^{(t-t_{0})A}y_{0}.\] Given a smooth \(v\) with \(v(t_{0})=y_{0},\) set \(\Delta(s)=v'(s)-Av(s),\) so that \(v\) solves the linear IVP \(\dot{v}(t)=Av(t)+\Delta(t),v(t_{0})=y_{0},\) whose solution by the variation of constants formula is \[v(t)=e^{(t-t_{0})A}y_{0}+\int_{t_{0}}^{t}e^{(t-s)A}\Delta(s)ds.\] It follows directly from these two formulas that \[v(t)-y(t)=\int_{t_{0}}^{t}e^{(t-s)A}\Delta(s)ds.\] Defining \(\Phi(s,v)=e^{sA},\) which is smooth, the r.h.s. equals \[\int_{t_{0}}^{t}\Phi(t-s,v(t-s))\Delta(s)ds,\] which yields the identity ([eq:ch11_colocation_alekseev_gr=0000F6bner]) in this special case.
Let \(k\ge1\) and let \(c_{1},\ldots,c_{k}\) be the distinct roots of the \(k\)-th orthogonal polynomial \(P_{k}\) from ([eq:ch11_guass_quad_legendre_poly_def]) for the interval \([0,1].\) Then the collocation method ([eq:ch11_colloc_def]) is of order exactly \(2k.\)
We omit the proof that the method is of order at least \(2k;\) it combines the identity ([eq:ch11_colocation_alekseev_gr=0000F6bner]) with the error bound ([eq:ch11_quad_order_p_error_rate_error_bound_c]) for the quadrature rule with nodes \(c_{1},\ldots,c_{k}\) and weights \(b_{1},\ldots,b_{k},\) which by the theorem of Section 11.1 is of order \(2k.\)
The proof that the method is not of order \(2k+1\) or higher is an exercise on sheet 11.
The collocation methods with \(c_{i}\) given by the roots of the orthogonal polynomial \(P_{k}\) for \([0,1]\) are called Gauss-Legendre (Runge-Kutta) methods. Recalling ([eq:ch11_gauss_quad_legendre_polys_for_a_b_from_standard]) the roots \(c_{1},\ldots,c_{k}\) are related to the roots \(\tilde{c}_{1},\ldots,\tilde{c}_{k}\) of the Legendre polynomial \(Q_{k}\) for the interval \([-1,1]\) by \[c_{i}=\frac{\tilde{c}_{i}+1}{2}.\]
(\(k=1\)). For \(k=1\) the Legendre polynomial for \([-1,1]\) is \[Q_{1}(t)=t\] (recall ([eq:ch11_gauss_quad_legendre_polys_-1,1])) with root \(\tilde{c}_{1}=0,\) so \(P_{1}\) has root \(c_{1}=\frac{1}{2}.\) We have \(p_{1}(t)=1\) so \[a_{1,1}=\int_{0}^{c_{1}}dt=c_{1}=\frac{1}{2}\] and \[b_{1}=\int_{0}^{1}dt=1.\] Thus the Gauss-Legendre method for \(k=1\) has the tableau \[\begin{array}{c|c} \frac{1}{2} & \frac{1}{2}\\ \hline & 1 \end{array}.\] It is a method of order \(2.\)
(\(k=2\)). For \(k=2\) the computation yields the tableau \[\begin{array}{c|cc} \frac{1}{2}-\frac{1}{2\sqrt{3}} & \frac{1}{4} & \frac{1}{4}-\frac{1}{2\sqrt{3}}\\ \frac{1}{2}+\frac{1}{2\sqrt{3}} & \frac{1}{4}+\frac{1}{2\sqrt{3}} & \frac{1}{4}\\ \hline & \frac{1}{2} & \frac{1}{2} \end{array}.\tag{11.75}\] It is a method of order \(4.\)
(\(k=3\)). For \(k=3\) the computation yields the tableau \[\begin{array}{c|ccc} \frac{1}{2}-\frac{\sqrt{15}}{10} & \frac{5}{36} & \frac{2}{9}-\frac{\sqrt{15}}{15} & \frac{5}{36}-\frac{\sqrt{15}}{30}\\ \frac{1}{2} & \frac{5}{36}+\frac{\sqrt{15}}{24} & \frac{2}{9} & \frac{5}{36}-\frac{\sqrt{15}}{24}\\ \frac{1}{2}+\frac{\sqrt{15}}{10} & \frac{5}{36}+\frac{\sqrt{15}}{30} & \frac{2}{9}+\frac{\sqrt{15}}{15} & \frac{5}{36}\\ \hline & \frac{5}{18} & \frac{4}{9} & \frac{5}{18} \end{array}.\] The method is of order \(6.\)
11.5 Global convergence
In contrast to multistep methods, every Runge-Kutta method of order \(p\) is (globally) convergent of order \(p,\) meaning that it is convergent and the global error is of order \(h^{p}\) for IVPs with analytic \(F.\) We record this in the next theorem, whose proof we omit.
Consider a general \(k\)-stage Runge-Kutta method ([eq:ch11_IRK_general_form]). If the method is of order \(p,\) then for any analytic \(F\) and IVP \(y'(t)=F(t,y(t)),t\in[t_{0},t_{\max}],y(t_{0})=y_{0}\) with a unique analytic solution \(\tilde{y}(t)\) the approximations \(y_{n},n\ge0,\) satisfy the global error estimate \[\max_{n:t_{n}\le t_{\max}}\left|y_{n}-\tilde{y}(t_{n})\right|=O(h^{p}).\]
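The global order can be observed empirically: halving \(h\) should shrink the global error by a factor of about \(2^{p}.\) A minimal sketch for the order-\(2\) method given by the third tableau in (11.40) (scalar ODE, assuming numpy; helper names ours):

```python
import numpy as np

def heun_step(F, t, y, h):
    """One step of the two-stage method b = (1/2, 1/2), c_2 = a_{2,1} = 1
    (the third tableau in (11.40)), which is of order p = 2."""
    s1 = F(t, y)
    s2 = F(t + h, y + h * s1)
    return y + h * (s1 + s2) / 2

def solve(F, y0, t_max, h):
    t, y = 0.0, y0
    while t < t_max - 1e-12:
        y = heun_step(F, t, y, h)
        t += h
    return y

F = lambda t, y: -y
errors = [abs(solve(F, 1.0, 1.0, 1.0 / N) - np.exp(-1.0))
          for N in (10, 20, 40, 80)]
print([errors[i] / errors[i + 1] for i in range(3)])  # ratios approach 2^2 = 4
```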