Every now and then it's nice to nuke a mosquito.
Let's assume that the path connecting two points $(a,y(a))$ and $(b,y(b))$ can be expressed as a function, and the curve $C(x)$ is given by $C(x) = (x,y(x))$. Then we will proceed using the Calculus of Variations.
The derivative of $C$ with respect to $x$ is $(1, y')$, and the functional we want to minimize is the length of the curve $L = \int_a^b \|C'\|\,dx = \int_a^b\sqrt{1 + y'^2}\, dx$. If we take $f(x,y,y') = \sqrt{1 + y'^2}$, we get that $\frac{\partial f}{\partial y} = 0$ and $\frac{\partial f}{\partial y'} = \frac{y'}{\sqrt{1 + y'^2}}$. Then the Euler-Lagrange equation $\frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y'} = 0$, sometimes referred to as the fundamental equation of the Calculus of Variations, says exactly that $\dfrac{d}{dx} \left( \frac{y'}{\sqrt{1 + y'^2}}\right) = 0$. So $\frac{y'}{\sqrt{1 + y'^2}}$ is constant, and since $t \mapsto \frac{t}{\sqrt{1+t^2}}$ is strictly increasing, this holds exactly when $y'$ is a constant.
Thus, if the path connecting the two points is expressible as a function, then the shortest such path is given by a straight line.
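As a numerical sanity check of this conclusion, here is a minimal sketch that compares the straight line against a few perturbed curves with the same endpoints. The endpoints, the sine-shaped bump, and the helper names `curve_length` and `perturbed` are all illustrative assumptions, not part of the argument above.

```python
import math

def curve_length(y, a, b, n=10_000):
    """Approximate the length of the graph of y on [a, b] by a polygonal path."""
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    return sum(
        math.hypot(xs[i + 1] - xs[i], y(xs[i + 1]) - y(xs[i]))
        for i in range(n)
    )

# Hypothetical endpoints (a, ya) and (b, yb), chosen only for illustration.
a, b = 0.0, 1.0
ya, yb = 0.0, 2.0

def perturbed(eps):
    """Straight line through the endpoints plus a bump that vanishes at x = a and x = b."""
    def y(x):
        t = (x - a) / (b - a)
        return ya + (yb - ya) * t + eps * math.sin(math.pi * t)
    return y

straight = math.hypot(b - a, yb - ya)
lengths = {eps: curve_length(perturbed(eps), a, b) for eps in (-0.5, -0.1, 0.0, 0.1, 0.5)}

for eps, length in lengths.items():
    print(f"eps = {eps:+.1f}: length = {length:.6f} (straight line: {straight:.6f})")
```

Only `eps = 0.0` should reproduce the straight-line length; every other value of `eps` gives a strictly longer curve.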
EDIT: I was certain that someone was in the middle of writing an answer when I typed my tongue-in-cheek response (as so often happens), but as I now see that there is more to add, allow me to extend my answer.
The problem here is that we must first define "distance." In the standard Euclidean plane, the distance between two points is defined to be the length of the line segment between them. So we can drop the word 'shortest' and say that "The distance between any two distinct points is the length of the line segment joining them."
Presumably, you want to know that going along any other path will be at least as long. One way of 'seeing this' is that you can approximate any curve with a polygonal path, and by the triangle inequality, applied one vertex at a time, any polygonal path between two points is at least as long as the line segment joining them.
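The polygonal-path argument can be checked on a concrete example. The sketch below builds a random polygonal path and compares its length with the segment joining its endpoints. The random vertices, the seed, and the helper name `polyline_length` are assumptions made purely for illustration.

```python
import math
import random

def polyline_length(points):
    """Sum of the Euclidean lengths of consecutive segments."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

random.seed(0)  # fixed seed so the run is reproducible

# Hypothetical path: start at (0, 0), end at (1, 0), with random vertices in between.
start, end = (0.0, 0.0), (1.0, 0.0)
middle = [(random.uniform(0, 1), random.uniform(-1, 1)) for _ in range(8)]
path = [start, *middle, end]

segment = math.dist(start, end)
total = polyline_length(path)
print(f"segment length: {segment:.4f}, polygonal path length: {total:.4f}")
```

Removing an intermediate vertex replaces two sides of a triangle by the third, which never increases the total, so repeating this until only the endpoints remain gives the inequality.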
If you have a curve $y=f(x)$, the length of the curve between points $(x_1,f(x_1))$ and $(x_2,f(x_2))$ is given by
$$L(x_1,x_2) = \int_{x_1}^{x_2} \sqrt{(dx)^2 + (df(x))^2} = \int_{x_1}^{x_2}\sqrt{1+\left(\dfrac{df}{dx} \right)^2} dx$$
In your case, $f(x) = x^2$, so $\dfrac{df}{dx} = 2x$. Hence, the length of the curve between the points $(x_1,x_1^2)$ and $(x_2,x_2^2)$ is
$$L(x_1,x_2) = \int_{x_1}^{x_2} \sqrt{1+(2x)^2} dx = \int_{x_1}^{x_2} \sqrt{1+4x^2} dx$$
Can you evaluate this integral?
$$\int_{x_1}^{x_2} \sqrt{1+4x^2} dx = \left.\dfrac{2x \sqrt{4x^2+1} + \log(2x+\sqrt{1+4x^2})}4 \right \vert_{x_1}^{x_2}$$
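To double-check the closed form, here is a small sketch that compares it against a numerical integral and against the straight-line distance between the same two points. The choice $x_1 = 0$, $x_2 = 1$ and the Simpson's-rule helper `simpson` are assumptions for illustration only.

```python
import math

def antiderivative(x):
    """Closed-form antiderivative of sqrt(1 + 4x^2) from the formula above."""
    s = math.sqrt(1 + 4 * x * x)
    return (2 * x * s + math.log(2 * x + s)) / 4

def parabola_length(x1, x2):
    """Length of y = x^2 between x1 and x2 via the closed form."""
    return antiderivative(x2) - antiderivative(x1)

def simpson(g, lo, hi, n=1000):
    """Composite Simpson's rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return total * h / 3

x1, x2 = 0.0, 1.0  # illustrative endpoints
closed = parabola_length(x1, x2)
numeric = simpson(lambda x: math.sqrt(1 + 4 * x * x), x1, x2)
chord = math.hypot(x2 - x1, x2**2 - x1**2)  # straight-line distance

print(f"closed form: {closed:.6f}")
print(f"Simpson:     {numeric:.6f}")
print(f"chord:       {chord:.6f}")
```

For these endpoints the arc along the parabola comes out at roughly 1.479, while the chord has length $\sqrt{2} \approx 1.414$, so the curved path is indeed longer.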
EDIT
Added picture to highlight the different lengths.
Best Answer
If you are performing k-means clustering, I would suggest doing some kind of normalization. Consider a more extreme case, where $y$ ranges from $0$ to $1$ and $x$ ranges from $0$ to $1000000$. All that will end up mattering, if you use Euclidean distance, is the $x$-values. The chance of a $y$-value having a significant effect on one point's distance from another is minuscule. So, mathematically, your $y$-variable is effectively treated as irrelevant.
Do you want your $y$-variable to be irrelevant? Is your $x$-variable roughly 10 times as important as your $y$? If not, then you should normalize.
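To make the effect concrete, here is a minimal sketch using the extreme ranges from above. The four data points and the helper names `min_max_scale` and `nearest` are made up for illustration, and min-max scaling is just one of several reasonable normalizations (standardizing to zero mean and unit variance is another).

```python
import math

# Hypothetical data: x spans 0..1,000,000 while y spans 0..1.
points = [
    (0.0, 0.0),          # query point
    (1_000.0, 1.0),      # close in x, as far as possible in y
    (200_000.0, 0.0),    # far in x, identical in y
    (1_000_000.0, 0.5),  # fixes the upper end of the x range
]

def min_max_scale(pts):
    """Rescale each coordinate to [0, 1]; assumes every coordinate has a nonzero range."""
    cols = list(zip(*pts))
    lows = [min(c) for c in cols]
    spans = [max(c) - min(c) for c in cols]
    return [tuple((v - lo) / s for v, lo, s in zip(p, lows, spans)) for p in pts]

def nearest(pts, i):
    """Index of the point closest to pts[i] in Euclidean distance."""
    others = (j for j in range(len(pts)) if j != i)
    return min(others, key=lambda j: math.dist(pts[i], pts[j]))

scaled = min_max_scale(points)
nearest_raw = nearest(points, 0)
nearest_scaled = nearest(scaled, 0)

print("nearest to the query, raw data:   ", points[nearest_raw])
print("nearest to the query, scaled data:", points[nearest_scaled])
```

On the raw data the nearest neighbor of the query is decided by $x$ alone, even though that neighbor sits at the opposite end of the $y$ range. After scaling, the point that matches in $y$ wins instead. Which behavior is right depends on how much each variable should matter.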