One way to understand how the test works is by looking at the Taylor Series of the function $f(x)$ centered around the critical point, $x = c$:
$$
f(x) = f(c) + f'(c)(x-c) + \frac{f''(c)}{2}(x-c)^2 + \cdots
$$
Note: In your question you said that the $n$-th derivative is non-zero; here I'm assuming the $(n+1)$-st derivative is the first one that is non-zero at $x = c$. It makes no real difference; it's just the convention I learned.
If $f'(c) = \cdots = f^{(n)}(c) = 0$ and $f^{(n+1)}(c) \ne 0$, then the Taylor Series ends up looking like this:
$$
f(x) = f(c) + \frac{f^{(n+1)}(c)}{(n+1)!}(x-c)^{n+1} + \frac{f^{(n+2)}(c)}{(n+2)!}(x-c)^{n+2} + \cdots
$$
Consider what happens when you move $f(c)$ to the other side of the equation:
$$
f(x) - f(c) = \frac{f^{(n+1)}(c)}{(n+1)!}(x-c)^{n+1} + \frac{f^{(n+2)}(c)}{(n+2)!}(x-c)^{n+2} + \cdots
$$
What does $f(x) - f(c)$ mean?
- If $f(x) - f(c) = 0$, then $f(x)$ has the same value as it does at $x = c$.
- If $f(x) - f(c) < 0$, then $f(x)$ has a value less than it has at $x = c$.
- If $f(x) - f(c) > 0$, then $f(x)$ has a value greater than it has at $x = c$.
We expect $f(x) - f(c) = 0$ at $x = c$ (the equation reflects this), but we're more interested in what it does on either side of $x = c$. When $x$ is really close to $c$, i.e. $(x-c)$ is a really small number, we can say:
$$
f(x) - f(c) \approx \frac{f^{(n+1)}(c)}{(n+1)!}(x-c)^{n+1}
$$
because the higher powers of a small number "don't matter" as much.
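For instance (an illustrative example, not from the original question): take $f(x) = x^4 + x^5$ with $c = 0$. Then $f(0) = f'(0) = f''(0) = f'''(0) = 0$ and $f^{(4)}(0) = 24$, so $n = 3$, and the series gives exactly
$$
f(x) - f(0) = \frac{24}{4!}x^4 + \frac{120}{5!}x^5 = x^4(1 + x) \approx x^4 \quad \text{for } |x| \ll 1,
$$
so near $0$ the $x^4$ term dictates the sign.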
**Concerning local extrema**
If $n$ is odd, then our approximation of $f(x) - f(c)$ is an even power of $(x-c)$. That means $f(x)$ has the same behavior (it is either less than or greater than $f(c)$) on both sides of $x = c$, so $x = c$ is a local extremum. If $f^{(n+1)}(c) > 0$, then $f(x)$ is greater than $f(c)$ on both sides of $x = c$; if $f^{(n+1)}(c) < 0$, then $f(x)$ is less than $f(c)$ on both sides of $x = c$.
If, on the other hand, $n$ is even, then our approximation of $f(x) - f(c)$ is an odd power of $(x-c)$. Therefore $f(x)$ will be greater than $f(c)$ on one side of $x = c$ and less on the other, so $x = c$ isn't a local extremum.
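This sign behavior is easy to check numerically. The sketch below (the helper `sign_pattern` is hypothetical, not from the answer) samples $f(x) - f(c)$ just left and right of $c$ for $f(x) = x^4$ ($n = 3$ odd: a local minimum) and $f(x) = x^5$ ($n = 4$ even: no extremum):

```python
# Numeric sanity check of the sign argument above (a sketch, not a proof).
# f(x) = x^4 at c = 0: first non-zero derivative is the 4th (n = 3, odd),
# so f(x) - f(0) is positive on BOTH sides of 0 (local minimum).
# g(x) = x^5 at c = 0: first non-zero derivative is the 5th (n = 4, even),
# so g(x) - g(0) changes sign across 0 (no local extremum).

def sign_pattern(f, c, h=1e-3):
    """Return the signs of f(x) - f(c) just left and just right of c."""
    left = f(c - h) - f(c)
    right = f(c + h) - f(c)
    to_sign = lambda v: (v > 0) - (v < 0)
    return to_sign(left), to_sign(right)

print(sign_pattern(lambda x: x**4, 0.0))  # (1, 1): greater on both sides
print(sign_pattern(lambda x: x**5, 0.0))  # (-1, 1): less on one side, greater on the other
```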
**Concerning saddle points**
Note that if you differentiate both sides of our approximation twice, you get:
$$
f''(x) \approx \frac{f^{(n+1)}(c)}{(n-1)!}(x-c)^{n-1}
$$
If $n$ is even, this is another odd-power polynomial centered around $x = c$. It therefore has opposite behavior on each side of $x = c$, giving you a saddle point.
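As a concrete check (again an illustrative example): for $f(x) = x^3$ and $c = 0$ we have $n = 2$, since $f'(0) = f''(0) = 0$ and $f'''(0) = 6$, and the formula gives
$$
f''(x) \approx \frac{6}{1!}(x - 0)^1 = 6x,
$$
which is actually exact here. $f''$ changes sign at $x = 0$ (concave down on the left, concave up on the right), so $x = 0$ is a saddle point.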
Derivatives at a point are numbers (computed as limits of a certain quotient), and if to each point you assign the number which is the derivative at that point, then you of course get a function $\Bbb{R}\to \Bbb{R}$. Leibniz's notation is confusing because it doesn't tell you where the derivatives are being evaluated, and hence blurs the distinction between functions and function values. (This may not seem like a big deal when doing simple problems, but I guarantee it will quickly get very confusing in multivariable calculus if these basic concepts aren't kept straight.)
Writing the chain rule as $\dfrac{dy}{dx} = \dfrac{dy}{du} \dfrac{du}{dx}$ is inaccurate for several reasons:
1. It introduces completely irrelevant letters in the denominator (an unfixable flaw of Leibniz's notation).
2. It doesn't tell you where the derivatives (which, as I explained in the previous paragraph, are functions) are being evaluated (you can try to make this more precise, but then you lose the "simplicity" of Leibniz's notation).
3. The $y$ on the LHS has a completely different meaning from the $y$ on the RHS (this wouldn't be a huge deal if there were no chance of confusion... but unfortunately it causes a lot of confusion, especially in several variables; see the link below).
The third is, I think, the biggest problem, and I'll try to explain it now. In Lagrange's notation, the chain rule is expressed as $(y\circ u)'(x) = y'(u(x)) \cdot u'(x)$, or if you want to write a proper equality of functions, it is just $(y\circ u)' = (y'\circ u)\cdot u'$. So, there are actually three functions involved: there is $y$, there is $u$ and there is the composition $y\circ u$. The chain rule tells us how the derivatives of these three functions are related.
However, when you write $\dfrac{dy}{dx} = \dfrac{dy}{du}\cdot \dfrac{du}{dx}$, it gives the incorrect impression that there are only two functions, $y$ and $u$. Well, now you could argue that on the LHS we should "consider $y$ as a function of $x$" while on the RHS "$y$ is a function of $u$" so these are different things. This is of course right, the two things are very different, but this is all covered up in the notation. A perhaps slightly better way of writing it would be $\dfrac{d(y\circ u)}{dx} = \dfrac{dy}{du} \cdot \dfrac{du}{dx}$. But this is also not quite correct. Basically, any attempt to write the chain rule down formally is a huge nightmare. The best I can do is say that for every $x\in \text{domain}(u)$,
\begin{align}
\dfrac{d(y\circ u)}{dx}\bigg|_x &= \dfrac{dy}{du}\bigg|_{u(x)}\cdot \dfrac{du}{dx}\bigg|_x
\end{align}
This fixes issues $(2)$ and $(3)$ mentioned above to an extent, but $(1)$ still remains an issue.
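The evaluation points in that last formula can be illustrated numerically. In this sketch, $y(u) = \sin u$ and $u(x) = x^2$ are example functions chosen for the demonstration; they are not from the text above:

```python
import math

# Numeric illustration of the evaluation points in the chain rule:
# (y o u)'(x) = y'(u(x)) * u'(x).  Example functions for this sketch:
def y(u): return math.sin(u)
def y_prime(u): return math.cos(u)
def u(x): return x * x
def u_prime(x): return 2 * x

def composition_derivative(x, h=1e-6):
    """Central finite difference of the composite y(u(x))."""
    return (y(u(x + h)) - y(u(x - h))) / (2 * h)

x0 = 1.3
numeric = composition_derivative(x0)
# Note: y' is evaluated at u(x0), NOT at x0 -- that is the whole point.
chain_rule = y_prime(u(x0)) * u_prime(x0)
print(abs(numeric - chain_rule) < 1e-6)  # True
```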
You said in the comments that
> I don't see much of a problem with $y$ depending on both $u$ and $x$, given that $u$ and $x$ are also related.
Well, if originally $y$ "depends on $u$", how can it all of a sudden "depend on $x$"? Of course, I know what you mean, but the proper way to indicate this dependence is not to say that "$y$ depends on $x$", but rather that the composite function $y\circ u$ depends on $x$. Here, you might think that this is just me being pedantic with language; and you're right. However, the reason I'm pedantic is that poor language and notation lead to conceptual misconceptions; this matches both my own experience as a student and what I've observed in some questions on this site. For example, in this question, the OP finds that $\frac{\partial F}{\partial y} = 0$ and $\frac{\partial F}{\partial y} = -1$. The reason for this apparent contradiction is that the two $F$'s are actually completely different things (I also recall a question in the single-variable context, but I can't seem to find it).
**Regarding your other question**
> If I ask what is the derivative of $f(x)$ with respect to $\frac{x}{2}$, does this question make sense? Is it simply $f'(\frac{x}{2})$? Or do we have to express $x^2$ in terms of $\frac{x}{2}$? And how can we express this derivative using Lagrange's notation?
The answers in succession are "one could make sense of this question", "no", and "yes". Let me elaborate. So, here, we're assuming that $f:\Bbb{R}\to \Bbb{R}$ is given as $f(x) = x^2$. To make precise the notion of "differentiating with respect to $\frac{x}{2}$", one has to introduce a new function, $\phi:\Bbb{R}\to \Bbb{R}$, $\phi(t) = 2t$. Then, what you're really asking is what is the derivative of $f\circ \phi$? To see why this is the proper way of formalizing your question, note that
\begin{align}
f(x) &= x^2 = \left(2 \cdot \dfrac{x}{2}\right)^2 = 4 \left(\frac{x}{2}\right)^2
\end{align}
and that $(f\circ \phi)(t) = f(2t) = (2t)^2 = 4t^2$. So this is indeed what we want.
And in this case,
\begin{align}
(f\circ \phi)'(t) &= f'(\phi(t)) \cdot \phi'(t) \\
&= [2 \cdot \phi(t)] \cdot [2] \\
&= [2\cdot 2t] \cdot 2 \\
&= 8t
\end{align}
Notice how this is completely different from $f'\left(\frac{x}{2}\right) = 2 \cdot \frac{x}{2} = x$.
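The computation above can be verified numerically. This sketch applies a central finite difference to the composite $f \circ \phi$ and contrasts it with $f'\left(\frac{x}{2}\right)$:

```python
# f(x) = x^2, phi(t) = 2t, so (f o phi)(t) = 4t^2 and (f o phi)'(t) = 8t.
def f(x): return x * x
def f_prime(x): return 2 * x
def phi(t): return 2 * t

def composed_derivative(t, h=1e-6):
    """Central finite difference of (f o phi)(t) = 4 t^2."""
    return (f(phi(t + h)) - f(phi(t - h))) / (2 * h)

t0 = 3.0
print(abs(composed_derivative(t0) - 8 * t0) < 1e-5)  # True: the derivative is 8t
print(f_prime(t0 / 2))  # 3.0 -- "f'(x/2) = x", a different quantity entirely
```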
In general, when you have "___ as a function of $\ddot{\smile}$ " and you instead want to "think of ___ as a function of @", what is going on is that you have to insert an extra composition. That is, you have three sets $X,Y,Z$ and a given function $f:Y\to Z$ (i.e. we think of elements $z\in Z$ as "functions of" $y\in Y$). If you now want to think of "$z$ as a function of $x$", what it means is that you need a mapping $X\to Z$ which involves $f$; in other words, you need a certain mapping $\phi:X \to Y$, and then you consider the composition $f\circ \phi$ (see for example the remarks towards the end of this answer).
Things can be slightly confusing when all the sets are the same $X=Y=Z = \Bbb{R}$, but in this case you should think of the three $\Bbb{R}$'s as "different copies" of the real line, and that each function maps you from one copy of the real line to another copy of the real line.
Edit:
Here's a passage from Spivak's Calculus text (Chapter 10, Question 33), where I first learnt about the double usage of the same letter.
Short answer: Yes.
The name "fractional calculus" is an unfortunate one, because it suggests that the theory only handles rational orders. But, as Iblis mentions in the comments, the theory also deals with irrational and even complex orders.
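As a concrete illustration (a standard textbook example, not from the original answer): the Riemann–Liouville derivative of a power function $x^k$ (for $x > 0$, $k > -1$) is
$$
D^\alpha x^k = \frac{\Gamma(k+1)}{\Gamma(k+1-\alpha)}\, x^{k-\alpha},
$$
and nothing in this formula restricts $\alpha$ to be rational. For example, with $\alpha = \tfrac{1}{2}$ and $k = 1$,
$$
D^{1/2} x = \frac{\Gamma(2)}{\Gamma(3/2)}\sqrt{x} = \frac{2}{\sqrt{\pi}}\sqrt{x},
$$
and applying $D^{1/2}$ twice recovers the ordinary derivative: $D^{1/2}D^{1/2}x = 1$.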