Formal definition of differential equation, IVP and solution to an IVP

Tags: definition, initial-value-problems, ordinary-differential-equations

The lecture notes I am using first give formal definitions of a differential equation, an IVP and a solution to an IVP. The original notes are in German, so I give an English translation.

Definition (Differential Equation):

(a) Let $d,k,n \in \mathbb{N}$, $I \subset \mathbb{R}$ be an interval, $D \subset I \times (\mathbb{R}^{d})^{n+1}$ a set and $F:D \to \mathbb{R}^{k}$ a function. Then $F(t,y(t),y'(t),y''(t),\dots,y^{(n)}(t))=0$, $t \in I$, is called an ordinary differential equation (ODE), and we are looking for a function $y:I \to \mathbb{R}^{d}$ that solves this equation.

(b) The number $n$ is the order of the highest derivative of $y$ that appears and is called the order of the ODE.

(c) We say that an ODE is

• scalar if $d=1$,

• autonomous if $F$ does not depend on the first variable ($t$), and

• explicit if the equation can be solved for the highest order derivative of $y$, i.e. if it can be written in the form $y^{(n)}(t)=\tilde{F}(t,y(t),y'(t),y''(t),\dots,y^{(n-1)}(t))$ where $\tilde{F}:\tilde{D} \to \mathbb{R}^{d}$ and $\tilde{D} \subset I \times (\mathbb{R}^{d})^{n}$.
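To make the terminology concrete, consider the pendulum equation (a standard example of mine, not taken from the notes):

$y''(t)+\sin(y(t))=0, \quad t \in \mathbb{R}.$

Here $d=k=1$ and $n=2$: the ODE is scalar of order $2$, autonomous because $F(t,y,y',y'')=y''+\sin(y)$ does not depend on $t$, and explicit because it can be solved for the highest derivative, $y''(t)=-\sin(y(t))$, i.e. $\tilde{F}(t,y(t),y'(t))=-\sin(y(t))$.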

Definition (Initial value problem of an explicit ODE):

Let $d,n \in \mathbb{N}$, $I \subset \mathbb{R}$ be an interval, $D \subset I \times (\mathbb{R}^{d})^{n}$ a set and $\tilde{F}:D \to \mathbb{R}^{d}$ a function. Given $t_{0} \in I$ and $y_{0},y_{1},\dots,y_{n-1} \in \mathbb{R}^{d}$ we call

$\begin{cases}
y^{(n)}(t)=\tilde{F}(t,y(t),y'(t),y''(t),\dots,y^{(n-1)}(t)), & t \in I\\
y^{(j)}(t_{0})=y_{j}, & j=0,1,\dots,n-1
\end{cases}$

an initial value problem (IVP) and $y_{0},y_{1},\dots,y_{n-1}$ initial values.
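As a quick illustration (my own example, not from the notes), the IVP

$\begin{cases}
y''(t)=-y(t), & t \in \mathbb{R}\\
y(0)=0, \quad y'(0)=1
\end{cases}$

has order $n=2$ with initial values $y_{0}=0$, $y_{1}=1$, and it is solved by $y(t)=\sin(t)$.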

The formal definition of a solution of an ODE is only stated for first order equations, i.e.

$y'(t)=f(t,y(t)), t \in I$

and the corresponding IVP

$\begin{cases}
y'(t)=f(t,y(t)), & t \in I\\
y(t_{0})=y_{0}
\end{cases}$
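Restricting the definition of a solution to first order loses little generality: by the standard reduction (not spelled out in the notes), an explicit ODE of order $n$ becomes a first-order system for $Y=(y,y',\dots,y^{(n-1)})$:

$Y'(t)=\begin{pmatrix} Y_{2}(t)\\ \vdots\\ Y_{n}(t)\\ \tilde{F}(t,Y_{1}(t),\dots,Y_{n}(t)) \end{pmatrix}, \quad Y(t_{0})=(y_{0},y_{1},\dots,y_{n-1}),$

so an order-$n$ equation in $\mathbb{R}^{d}$ turns into a first-order equation in $\mathbb{R}^{d \cdot n}$.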

Definition (Solution to an ODE of first order):

Let $I \subset \mathbb{R}$ be an interval, $D \subset I \times \mathbb{R}^{d}$ a set, $f:D \to \mathbb{R}^{d}$ a function, $t_{0} \in I$ an initial time and $y_{0} \in \mathbb{R}^{d}$ an initial value with $(t_{0},y_{0}) \in D$. Then

(a) a function $u:J \to \mathbb{R}^{d}$ is a solution to the ODE if

• $J \subset I$ is an interval,

• $u$ is continuously differentiable on $J$,

• $(t,u(t)) \in D$ for all $t \in J$ and

• $u'(t)=f(t,u(t))$ for all $t \in J$;

(b) $u$ is a solution to the IVP if in addition

• $t_{0} \in J$ and

• $u(t_{0})=y_{0}$.
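To see the definition in action (my own example): take $f(t,y)=y$ on $D=\mathbb{R}\times\mathbb{R}$, $t_{0}=0$ and $y_{0}=1$. Then $u(t)=e^{t}$ with $J=\mathbb{R}$ is a solution to the IVP: $J \subset I$ is an interval, $u$ is continuously differentiable, $(t,u(t)) \in D$ and $u'(t)=e^{t}=f(t,u(t))$ for all $t \in J$, and in addition $t_{0} \in J$ and $u(0)=1=y_{0}$.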

I will omit the part about local and global solutions and extensions of solutions for brevity.

Now I have two questions:

1) Why do we have $F:D \to \mathbb{R}^{k}$ a function in the definition of an ODE? Of course, we can have derivatives up to order $n$ and we can have vector-valued ODEs, so it makes sense to let $D \subset I \times (\mathbb{R}^{d})^{n+1}$, since $F$ depends on $y$ and its first $n$ derivatives, each of which is a $d$-dimensional vector. This means that $F$ actually depends on $d \cdot (n+1)$ component functions and $t$. But shouldn't the range of the function still be $d$-dimensional? We also don't have $k$ in the definition of IVPs corresponding to explicit ODEs. Am I missing something, or is this a notational error?

2) Why do we require a solution to an ODE/IVP to be continuously differentiable? Of course we need differentiability, but why must the derivative be continuous? Is it simply because we want to be able to use integration when solving the ODE? But then we might as well impose that requirement in the definition of ODEs itself. The other conditions are pretty clear to me: the solution must stay in the domain of $F$ on $J$, its derivative must satisfy the differential equation, and the condition $J \subset I$ accounts for the fact that a solution can "blow up" in a finite amount of time.

Thanks very much!

Edit: Maybe it is useful to make question 2) more precise. Why do we assume the derivative $u'(t)$ to be continuous, which then implies that $f(t,u(t))$ is continuous, instead of assuming continuity of $f$ in the definition of the ODE in the first place?

Best Answer

@1

The author seems to make the main distinction

partial vs. ordinary,

that is, among the differential equations that only combine values and derivatives at one single point (so no delay differential equations), it distinguishes equations that contain derivatives in the directions of several independent variables from those that contain derivatives with respect to only one independent variable (there may be other independent variables; for the solution process those appear as constant parameters).
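For instance (my example, not the answerer's): the heat equation $\partial_{t}u(t,x)=\partial_{x}^{2}u(t,x)$ is partial because derivatives in two independent directions occur, whereas $y'(t)=f(t,y(t),x)$ is ordinary; the extra variable $x$ only enters as a constant parameter.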

As this definition of "ordinary" appears to be intended to be as general as possible, there is no restriction on the number of equations $k$, and no connection between $k$ and the number of dependent variables $d$. It is only when you consider the possibility of solving the equation system for the highest derivatives that you need these two numbers to be equal. Even then there remains the freedom that each dependent variable has a different highest derivative order. However, if a local solution exists, a first-order system can be established and the existence theory applies.
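As a sketch of that freedom (my own example), take $d=k=2$ with unknowns $y_{1},y_{2}$ of different highest orders:

$\begin{cases}
y_{1}''(t)=f_{1}(t,y_{1}(t),y_{2}(t),y_{1}'(t))\\
y_{2}'(t)=f_{2}(t,y_{1}(t),y_{2}(t),y_{1}'(t))
\end{cases}$

Setting $Y=(y_{1},y_{1}',y_{2})$ gives the first-order system $Y'=(Y_{2},\,f_{1}(t,Y_{1},Y_{3},Y_{2}),\,f_{2}(t,Y_{1},Y_{3},Y_{2}))$, to which the existence theory applies.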

There might still be no solution for the highest derivatives (in the sense of the implicit function theorem); in some cases this can be remedied by adding derivatives of the equations to the system until such a determination of the highest order derivatives is possible. That is the content of DAE theory (differential-algebraic equations).
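A minimal illustration of that differentiation step (my own example): the system

$x'(t)=y(t), \qquad 0=x(t)-\cos(t)$

cannot be solved for $(x',y')$, since $y'$ does not appear. Differentiating the algebraic equation gives $x'(t)=-\sin(t)$, hence $y(t)=-\sin(t)$, and differentiating once more yields $y'(t)=-\cos(t)$, so after adding derivatives of the equations all highest derivatives are determined.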


@2

All the basic theorems on the local existence of solutions to initial value problems require the continuity of the equation, and in consequence of the solution. This is mainly due to the transformation to the equivalent Volterra integral equation, or the corresponding Picard fixed-point iteration, which most easily works as a self-mapping on a space of continuous functions.
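For reference (standard material, not spelled out in the answer), the integral equation in question is

$y(t)=y_{0}+\int_{t_{0}}^{t}f(s,y(s))\,\mathrm{d}s,$

and the Picard iteration $u_{k+1}(t)=y_{0}+\int_{t_{0}}^{t}f(s,u_{k}(s))\,\mathrm{d}s$ maps continuous functions to continuous functions. For $f(t,y)=y$, $t_{0}=0$, $y_{0}=1$ and $u_{0}\equiv 1$, the iterates are the partial sums $u_{k}(t)=\sum_{j=0}^{k}t^{j}/j!$ of $e^{t}$.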

In an advanced setting one also considers piecewise continuous equations, that is, systems that combine several (in themselves continuous) ODE models or vector fields for different phases of the state space. The main characteristic is that you can get contradicting vector fields at phase boundaries, where a classical solution abruptly ends and a generalized solution enters a "sliding mode". At such points the generalized solution will become non-differentiable. An example is $x'(t)=-\operatorname{sign}(x(t))$, where for $x(0)=x_{0}>0$ one gets the generalized solution $x(t)=\max(x_{0}-t,0)$.
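Checking this against the definition above: for $0 \le t < x_{0}$ we have $x(t)=x_{0}-t>0$ and $x'(t)=-1=-\operatorname{sign}(x(t))$, so the classical equation holds; for $t>x_{0}$ the generalized solution stays at $x(t)=0$. At $t=x_{0}$ the left derivative is $-1$ while the right derivative is $0$, so $x$ is not continuously differentiable there, and hence not a solution in the sense of the definition above on any interval containing $t=x_{0}$ in its interior.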
