[Math] Due to numerical inaccuracy, the solution of a boundary value problem becomes negative

boundary value problem, MATLAB, numerical methods, ordinary differential equations, systems of equations

I use a toy example to get my point across; in reality I have to deal with a much more complex model.

Let us consider a one-dimensional boundary value problem solved with the bvp5c solver in Matlab. Two liquids enter at different points, move in opposite directions and react with each other. Liquid 1 enters at $x = 0$ and moves in the positive direction. Liquid 2 enters at $x = 1$ and moves in the negative direction. We are interested in the equilibrium distribution of the two concentrations over the domain $[0,1]$.

Let $c_1(x)$ be the concentration of liquid 1 at the point $x$, and similarly for liquid 2. The differential equations that need to be solved are the following

\begin{align}
\frac{dc_1(x)}{dx} &= -(c_1(x) + c_2(x)), \\
\frac{dc_2(x)}{dx} &= K c_1(x) c_2(x),
\end{align}

with $K$ a constant that we can choose. Note that $c_1(x)$ goes down for increasing $x$ and $c_2(x)$ goes down for decreasing $x$, which is exactly as we expect. The two boundary values are $c_1(0) = 10$ and $c_2(1) = 6$.

In Matlab, one models this as a boundary value problem with two boundaries and two components. By solving this problem for $K = 1$ we obtain a nice solution shown in the following figure.

[Figure: solution for $K = 1$ (nice example)]

However, by increasing $K$ to, say, $K = 10^5$, we run into numerical inaccuracy, see the following figure. I also receive a warning message from Matlab, see the bottom of this post.

[Figure: solution for $K = 10^5$ (problematic example)]

The concentration $c_2(x)$ drops below zero, although that should not be possible according to the second differential equation.

My intuition is that this happens because Matlab takes discrete steps $dx$, computes the (large) derivative at a point and updates the solution, which can push the concentration below zero. One could combat this phenomenon by increasing the maximum allowed number of mesh points, but that is not a solution I can incorporate in my actual model because of computation time and memory issues.
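
For completeness, that brute-force route would look roughly as follows, using the standard bvpset option NMax (the value 20000 is just an example); it is exactly this that becomes too expensive in my real model.

% Purely illustrative: allow the solver more mesh points and hence a finer mesh
options = bvpset('NMax', 20000);
sol = bvp5c(@(x,c)Concentrations(x,c,K), @(ca,cb)BoundaryCond(ca,cb), solinit, options);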

How can I make sure that both concentrations stay nonnegative?


Matlab code

The script that runs the bvp5c solver

x = linspace(0,1,100); % the mesh
yinit = [ 5 0 ]; % an initial guess for the solution for both liquids
K = 1; % the factor K in the second differential equation

solinit = bvpinit(x,yinit); % setting the mesh and the initial guess 

% Calling the bvp solver. The first argument is the ODE function, the second
% the boundary conditions at x = 0 (for liquid 1) and at x = 1 (for liquid 2),
% and the last argument specifies the mesh and the initial guess.
sol = bvp5c(@(x,c)Concentrations(x,c,K),@(ca,cb)BoundaryCond(ca,cb),solinit);

The function with the ODEs. The independent variable $x$ is required as an input by the solver, even though the right-hand sides do not depend on it explicitly.

function dcdx = Concentrations(x,c,K)

dcdx = [ -(c(1) + c(2));  % differential equation for liquid 1
          K*c(1)*c(2) ];  % differential equation for liquid 2

The function with the boundary conditions. For a two-point boundary value problem, Matlab refers to the two ends as $a$ and $b$, which here are $x = 0$ and $x = 1$, respectively.

function res = BoundaryCond(ca,cb)

res = [ ca(1)-10;  % the concentration of liquid 1 at boundary a is 10
        cb(2)-6 ]; % the concentration of liquid 2 at boundary b is 6

Warning message for $K = 10^5$

Warning: Unable to meet the tolerance without using more than 5000 mesh points.
 The last mesh of 892 points and the solution are available in the output argument.
 The maximum error is 536.678, while requested accuracy is 0.001. 
> In bvp5c at 339

Best Answer

The solver uses a collocation method: on each interval of the mesh it approximates the solution by a polynomial of a fixed degree and obtains the required number of equations, one per free parameter, by demanding that the ODE hold exactly at certain interior points of the interval. For the given polynomial ODE this results in algebraic equations for the coefficients of the polynomials. The solver then finds a numerical solution of the BVP by solving this large non-linear system. After that, the error between the mesh points is estimated and the mesh is refined towards a more homogeneous error estimate.
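
To make the mechanism concrete, here is a minimal sketch of collocation on a single mesh interval for the scalar model problem $c'(x) = -K\,c(x)$, approximating the solution by a quadratic and demanding the ODE at two interior points. The collocation points below are just an illustrative choice; bvp5c's own scheme differs in its details.

% Collocation on one interval [x0, x0+h] for c'(x) = -K*c(x), c(x0) = c0.
% Ansatz: c(x) ~ c0 + a1*(x-x0) + a2*(x-x0)^2, i.e. three parameters, one
% fixed by the value at the left end, two by the collocation conditions.
K  = 10;  h = 0.1;  c0 = 1;                  % try K = 1e5 to see the scheme degrade
t  = [0.5 - sqrt(3)/6, 0.5 + sqrt(3)/6]*h;   % two interior collocation offsets

% ODE at the collocation points:  a1 + 2*a2*t = -K*(c0 + a1*t + a2*t^2)
M = [1 + K*t(1), 2*t(1) + K*t(1)^2;
     1 + K*t(2), 2*t(2) + K*t(2)^2];
a = M \ [-K*c0; -K*c0];                      % the (here linear) algebraic system

cEnd   = c0 + a(1)*h + a(2)*h^2;             % collocation value at the right end
cExact = c0*exp(-K*h);                       % exact value at the right end
fprintf('collocation: %.6g   exact: %.6g\n', cEnd, cExact)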

The point now is that the discretized differential equation will have more than one solution. Typically, the extraneous solutions are not smooth and have large jumps from one mesh point to the next, so that they are easily detected by the solver. Or, more often, the initial approximation is much closer to the smooth solution than to any of the extraneous solutions.

However, for stiff systems that might no longer be the case. "Stiff" means that the system has fast and slow components. The fast components require a very small step size to be modeled correctly in the discretization, while the many steps this requires might lead to an undue accumulation of floating point errors in the other components. So it requires extra effort to find a workable balance.

In the collocation model, the large derivatives of the fast components lead to the situation where even the sampled exact solution is visually not smooth. That is, even some of the normal steps of the exact solution look like big jumps. So the true numerical solution is no longer easily distinguishable from the extraneous jumpy solutions of the algebraic system.


In the given system, the exact solution for large $K$ jumps near $x=1$ from essentially $c_2(x)\approx 0$ to $c_2(1)$, over a final part of the interval of length about $\Delta t=1/\sqrt{K}$. The choice of the square root is somewhat arbitrary (any width with $1\gg\Delta t\gg1/K$ would do), but it captures the essence of the situation.

Close to $x=1$ (there can be no boundary layers at other points of the interval) the slow component $c_1$ can be treated as a constant, so that $c_2$ behaves qualitatively like an exponential function $$ c_2(x)=c_2(1)\,e^{K\,c_1(1)\,(x-1)}=c_2(1)\,e^{-K\,c_1(1)\,(1-x)} $$ This situation is captured by $e^{-Kt}$ close to $t=0$, i.e., using the substitution $t=c_1(1)(1-x)$. For large $K$ the function $e^{-Kt}$ drops from the appreciable value $1$ at $t=0$ to the very small value $e^{-\sqrt{K}}$ only a very small distance away, at $t=1/\sqrt{K}$. Moreover, if one wants the jumps in the sampled values $e^{-K\,n\,dt}$ to be small, one needs $K\,dt$ to be small, $1\gg K\,dt\gg 1/K$, for instance by choosing $dt=K^{-3/2}$.
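
To put numbers to these estimates for $K=10^5$, a quick back-of-the-envelope check in Matlab (nothing here is specific to the solver):

K  = 1e5;
w  = 1/sqrt(K)          % layer width ~ 1/sqrt(K), about 0.0032
exp(-K*w)               % e^{-Kt} at t = w is e^{-sqrt(K)}, which underflows to 0
dt = K^(-3/2)           % step size with K*dt = 1/sqrt(K), i.e. small jumps per step
w/dt                    % number of such steps across the layer: K = 1e5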


The mesh refinement algorithm would have to detect this situation. However, without telling the algorithm about this situation (for instance by structuring the initial mesh this way), it would have to create a sizable number, up to $O(K)$, of mesh points with a spacing of $K^{-1-\epsilon}$ close to $x=1$ to zero in on the singularity. For $K=10^5$ this would require mesh points in the order of some thousands or ten thousands, which will likely exceed any moderate value of NMax.
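
One way of "structuring the initial mesh this way" is to hand bvp5c an initial mesh that already crowds points into the layer near $x=1$. A rough sketch, reusing the functions from the question (the grading and the number of points are just illustrative guesses):

K = 1e5;

% coarse mesh away from the layer, logarithmically graded mesh inside it
xCoarse = linspace(0, 1 - 1/sqrt(K), 50);
xLayer  = 1 - logspace(log10(1/sqrt(K)), -8, 200);  % points crowding towards x = 1
x = unique([xCoarse, xLayer, 1]);                   % combined, sorted initial mesh

solinit = bvpinit(x, [5 0]);                        % same initial guess as in the question
sol = bvp5c(@(x,c)Concentrations(x,c,K), @(ca,cb)BoundaryCond(ca,cb), solinit);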
