Calculus – Justification of Algebraic Manipulation of Infinitesimals

calculus, infinitesimals

As an engineering student, I regularly see people making arguments like this:

Consider a rectangle of dimensions $x\times 4x$. If we make $x$ bigger by a small quantity $dx$ then this will make $4x$ bigger by $4\cdot dx$ so the area of that $x \times 4x$ rectangle will change from $4x^2$ to $$(x+dx)(4x+4dx)=4(x^2+2x\cdot dx+(dx)^2)\approx4x^2+8x\cdot dx$$
with the final step justified because $dx$ is a 'small' quantity so $(dx)^2$ will be so small as to be ignorable in some mathematically rigorous way. Thus the change in area $dA$ would be $8x\cdot dx$.

Arguments like this are very common. Another random example would be in Wikipedia's proof of the brachistochrone problem which starts with the statement

$$ds^2=dx^2+dy^2$$

and proceeds to manipulate these infinitesimals as if they were ordinary constants or variables.

I'm wondering if there's a simple, analytically rigorous justification for all of this manipulation. While I feel perfectly comfortable with the idea of the derivative of a function (considered as a limit), I've never seen a similar, rigorous justification for the algebraic manipulation of infinitesimals and the cancellation of 'small' terms (like $(dx)^2$). Any thoughts or help would be appreciated.

Thank you.

Best Answer

I will just throw a few buzzwords at you :-)

The mathematically precise concept behind an "infinitesimal" such as $dx$ is called a "differential form". If we fix the Euclidean plane $\mathbb{R}^2$ and think of it as a "differentiable manifold", then every point in the plane has a tangent space that is isomorphic to (that is, another copy of) $\mathbb{R}^2$. If we further fix a Cartesian coordinate system with coordinates $x$ and $y$, then a differential form (more precisely, a 1-form) is a gadget that, at every point $p$ with coordinates $(x_p, y_p)$, assigns a real number to each tangent vector at that point, and does so linearly in the vector.

If you think of attaching a vector pointing upwards with length one, $(0, 1)$, at every point in the plane, then in our example $dx$ would spit out 0 at every point and $dy$ would spit out 1 at every point.
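Spelled out as a formula, here is a small worked example. The component names $v_1, v_2$ for a tangent vector $v$ at $p$ are introduced here only for illustration:

$$ dx_p(v) = v_1, \qquad dy_p(v) = v_2 \qquad \text{for } v = (v_1, v_2) $$

So for $v = (0, 1)$ we get $dx_p(v) = 0$ and $dy_p(v) = 1$, and by linearity a combination such as $3\,dx + 2\,dy$ would give $(3\,dx + 2\,dy)_p(0, 1) = 3 \cdot 0 + 2 \cdot 1 = 2$.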

This is the starting point for modern abstract "coordinate free" differential geometry.

From my experience, these concepts are usually not easy to understand for beginners, so don't worry if you don't understand everything on a first reading.

First note that you don't need the concept of a "differential form" to understand your first example:

Take a rectangle with side lengths $a$ and $b$; then we have a function that gives the area, $$ f: \mathbb{R}^2 \to \mathbb{R} $$ $$ f: (a, b) \mapsto ab $$ If you increase both of the coordinates by $h$, then this is just a directional derivative, and since $f$ is differentiable we know that $$ f(a + h, b + h) - f(a, b) = df_{(a,b)}(h, h) + o(h) $$ holds. Here "dx" is just shorthand for both "h" and "we know that $f$ is differentiable and therefore that the remainder on the right-hand side is $o(h)$, that is, it vanishes faster than $h$ as $h \to 0$, so the linear approximation gets better and better for smaller and smaller $h$". The linear approximation is by definition given by applying the differential $df$ of $f$ at the point $(a, b)$ to the vector $(h, h)$.
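For this particular $f$ everything can be computed by hand: $f(a+h, b+h) - f(a, b) = (a + b)h + h^2$, so the linear part is $df_{(a,b)}(h, h) = bh + ah$ and the remainder is exactly $h^2$. For the $x \times 4x$ rectangle from the question, the sides grow by $h$ and $4h$, the linear part is $4x \cdot h + x \cdot 4h = 8xh$, and the remainder is $4h^2$. The following is only a minimal numerical sketch of that statement in Python. The function names `area`, `differential` and `remainder_ratio` are made up for this illustration, and $x = 3$ is an arbitrary choice:

```python
def area(a, b):
    """Area of a rectangle with side lengths a and b."""
    return a * b


def differential(a, b, v):
    """Linear part df_(a,b)(v) = b*v1 + a*v2 of f(a, b) = a*b."""
    return b * v[0] + a * v[1]


def remainder_ratio(x, h):
    """(actual change - linear approximation) / h for the x-by-4x rectangle,
    where the sides grow by h and 4h respectively."""
    actual = area(x + h, 4 * x + 4 * h) - area(x, 4 * x)
    linear = differential(x, 4 * x, (h, 4 * h))  # equals 8*x*h
    return (actual - linear) / h


if __name__ == "__main__":
    # The ratio should shrink like 4*h, i.e. the remainder is o(h).
    for h in (1e-1, 1e-2, 1e-3):
        print(h, remainder_ratio(3.0, h))
```

The printed ratio shrinks in proportion to $h$, which is the precise sense in which the $(dx)^2$ term is "ignorable": it is not zero, but it is negligible compared with the linear term as $h \to 0$.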

The second example is a little bit more complicated; here we'll really need the concept of "differential forms". To be mathematically precise, we'd have to write $$ ds^2 := dx \otimes dx + dy \otimes dy $$ That is, the left-hand side is defined by the right-hand side, and at each fixed point $p$ the right-hand side is an element of the tensor product of the cotangent space $T^*_pM$ with itself.

This means this gadget eats two tangent vectors from the tangent space at a point and spits out a real number, and this operation is bilinear (linear in both input variables). So, if you fix a point $p$ on the plane, you get an element of the space $$ T^*_pM \otimes T^*_pM $$ which has an algebraic structure that can be used.
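As a concrete sketch, take two tangent vectors at $p$, called $v = (v_1, v_2)$ and $w = (w_1, w_2)$ here purely for illustration, and use the usual convention $(\alpha \otimes \beta)(v, w) = \alpha(v)\,\beta(w)$:

$$ ds^2_p(v, w) = dx_p(v)\,dx_p(w) + dy_p(v)\,dy_p(w) = v_1 w_1 + v_2 w_2 $$

In particular $ds^2_p(v, v) = v_1^2 + v_2^2$ is the squared Euclidean length of $v$, which is what the informal statement $ds^2 = dx^2 + dy^2$ encodes.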

If you would like to learn more about this, I'd recommend any textbook on differential geometry or differentiable manifolds.
