Is it possible to turn this 'proof' of the product rule into a rigorous argument?

calculus, derivatives, intuition, proof-writing, substitution

I have often found linear approximation to be useful in understanding the main theorems of calculus. I tried using it to 'prove' the product rule, as I find the typical proof for it to be unintuitive. However, I'm not sure that the substitution I made can be properly justified:
$$
(f \cdot g)'(a) = \lim_{h \to 0} \frac{f(a+h)g(a+h)-f(a)g(a)}{h}
$$

Here is where I use my questionable substitution: replace $f(a+h)$ with $f(a)+f'(a)h$, and make a similar substitution for $g(a+h)$. As $h$ approaches $0$, the linear approximation becomes better and better. Then $(f \cdot g)'(a)$ becomes
\begin{align}
\lim_{h \to 0} \frac{\bigl(f(a)+f'(a)h\bigr)\bigl(g(a)+g'(a)h\bigr)-f(a)g(a)}{h} &= \lim_{h \to 0} \frac{f(a)g'(a)h+g(a)f'(a)h+f'(a)g'(a)h^2}{h} \\
&= \lim_{h \to 0} \bigl(f(a)g'(a)+g(a)f'(a)+f'(a)g'(a)h\bigr) \\
&= f(a)g'(a)+g(a)f'(a)
\end{align}

There were many things about my 'proof' that pleased me. For instance, it lines up very closely with the common visual explanation of the product rule:
[Image: visualisation of the product rule]

(This image is taken from 3Blue1Brown's video on visualising the chain and product rule. Check it out.)

However, I'm still unsure about my substitution. I've heard people use similar arguments to this, e.g.
$$
\lim_{x \to 0}\frac{\sin x + \tan x}{\sin x}=\lim_{x \to 0}\frac{x+x}{x}=2
$$

because $\sin$ and $\tan$ are 'locally linear', but I have yet to see a formal justification for this kind of substitution.

Best Answer

Short answer: big-O and little-o notation.

Your strategy can work by stating that, for small nonzero $h$, $f(a+h)\in f(a)+hf^\prime(a)+o(h)$, and similarly for $g$. Here $o(h)$ denotes the set of functions $r$ with $r(h)/h\to 0$ as $h\to 0$. What's more, the $h$ coefficient is unique; this can be taken as a definition of the derivative, equivalent to the usual one. Since the product of two $O(h)$ terms is $O(h^2)$, it's $o(h)$. So
$$\begin{align}f(a+h)g(a+h)&\in(f(a)+hf^\prime(a)+o(h))(g(a)+hg^\prime(a)+o(h))\\&\subseteq f(a)g(a)+h[f(a)g^\prime(a)+f^\prime(a)g(a)]+o(h).\end{align}$$
Then we just read off the $h$ coefficient: $(f\cdot g)^\prime(a)=f(a)g^\prime(a)+f^\prime(a)g(a)$.
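
As a sketch of how the same bookkeeping covers the limit in the question, assume only that $\sin 0=\tan 0=0$ and that both functions have derivative $1$ at $0$. Then for small nonzero $x$ we have $\sin x\in x+o(x)$ and $\tan x\in x+o(x)$, so
$$
\frac{\sin x+\tan x}{\sin x}\in\frac{2x+o(x)}{x+o(x)}=\frac{2+o(1)}{1+o(1)}\to 2\quad\text{as }x\to 0.
$$
The substitution is safe here because the leading terms survive. It should not be trusted when they cancel: in $\lim_{x\to 0}\frac{\tan x-\sin x}{x^3}$, replacing both functions by $x$ suggests $0$, but the true limit is $\tfrac12$, because the discarded $o(x)$ terms are all that is left in the numerator.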
