Every now and then it's nice to nuke a mosquito.
Let's assume that the path connecting two points $(a,y(a))$ and $(b,y(b))$ can be expressed as a function, and the curve $C(x)$ is given by $C(x) = (x,y(x))$. Then we will proceed using the Calculus of Variations.
The derivative of $C$ with respect to $x$ is $(1, y')$, and the functional we want to minimize is the length of the curve $L = \int_a^b \|C'\|\,dx = \int_a^b\sqrt{1 + y'^2}\,dx$. If we take $f(x,y,y') = \sqrt{1 + y'^2}$, we get that $\frac{\partial f}{\partial y} = 0$ and $\frac{\partial f}{\partial y'} = \frac{y'}{\sqrt{1 + y'^2}}$. Then the Euler-Lagrange equation $\frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y'} = 0$, sometimes referred to as the fundamental equation of the Calculus of Variations, says exactly that $\dfrac{d}{dx} \left( \frac{y'}{\sqrt{1 + y'^2}}\right) = 0$, which forces $y'$ to be a constant.
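For anyone who wants the algebra behind that last step spelled out, here is one way to do it. The constants $c$, $m$ and $k$ are names introduced only for this sketch.

$$
\begin{aligned}
\frac{d}{dx}\left(\frac{y'}{\sqrt{1+y'^2}}\right) = 0
&\implies \frac{y'}{\sqrt{1+y'^2}} = c \quad \text{for some constant } c \text{ with } |c| < 1 \\
&\implies y'^2 = c^2\left(1 + y'^2\right) \\
&\implies y'^2 = \frac{c^2}{1-c^2} \\
&\implies y' = \frac{c}{\sqrt{1-c^2}} =: m \quad \text{(since } y' \text{ has the same sign as } c\text{)} \\
&\implies y(x) = mx + k .
\end{aligned}
$$

Matching the endpoints then gives $m = \dfrac{y(b)-y(a)}{b-a}$, the slope of the segment joining the two points.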
Thus, if the path connecting the two points is expressible as a function, then the shortest such path is given by a straight line.
EDIT: I was certain that someone was in the middle of writing an answer when I typed my tongue-in-cheek response (as so often happens), but as I now see that there is more to add, allow me to extend my answer.
The problem here is that we must first define "distance." In the standard Euclidean plane, the distance between two points is defined to be the length of the line segment between them. So we can drop the word 'shortest' and say that "The distance between any two distinct points is the length of the line segment joining them."
Presumably, you want to know that going along any other path will be at least as long. One way of 'seeing this' is that you can approximate any curve with a polygonal path, and by the triangle inequality every polygonal path is at least as long as the segment joining its endpoints.
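A quick numerical illustration of the polygonal-path idea, as a rough sketch rather than a proof. The parabola $y = x^2$ on $[0,1]$ and the helper names `polyline_length` and `sample_curve` are my own choices for the example.

```python
import math

def polyline_length(points):
    """Sum of the Euclidean lengths of the segments joining consecutive points."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

def sample_curve(y, a, b, n):
    """Return n + 1 evenly spaced points (x, y(x)) on the interval [a, b]."""
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    return [(x, y(x)) for x in xs]

def curve(x):
    # Hypothetical example curve: a parabolic arc from (0, 0) to (1, 1).
    return x * x

a, b = 0.0, 1.0
chord = math.dist((a, curve(a)), (b, curve(b)))  # length of the straight segment

# Doubling n refines the polygon; each refinement can only add length.
segment_counts = (1, 2, 4, 8, 16)
lengths = [polyline_length(sample_curve(curve, a, b, n)) for n in segment_counts]

for n, length in zip(segment_counts, lengths):
    print(f"{n:2d} segments: length {length:.6f} (chord {chord:.6f})")
```

With one segment the polygon is the chord itself, and every refinement is at least as long, which is the triangle inequality at work.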
Here is something which I've learned in the past few months that I think answers your question, despite being a little esoteric. I'll try to keep it brief.
Start with reasonable axioms defining a plane in geometry (say, Hilbert's axioms), and to make things nice, we'll additionally require the geometry to be ordered and that it satisfy Archimedes' axiom.
It turns out that with diligence, you can construct a field $\Bbb {F}$ such that every point in the geometry can be identified with an ordered pair in $\Bbb F\times \Bbb F$, and by virtue of the way $\Bbb F$ was constructed, it can be proven that each line we started with is exactly the set of points of $\Bbb F\times \Bbb F$ satisfying some equation $ax+by+c=0$ with $a,b,c\in \Bbb F$, where at least one of $a,b$ is nonzero. (The actual construction would be fairly space consuming, and there are at least two different constructions.)
In summary, from a plane with a satisfactory geometry, one can build a field "coordinatizing" it, and it's built in such a way that the lines look like $ax+by+c=0$ for some $a,b,c$.
For an Archimedean geometry, this field is necessarily a subfield of $\Bbb R$, and if you add yet another "completeness" type axiom to the geometry, then the field will be all of $\Bbb R$.
Understandably, we do not herd schoolchildren through all of this stuff, but we just begin with our famous field $\Bbb R$ and work the other direction :) We just say "lines look like that" and "Hey, look, this is a model of Euclidean geometry. Now go prove stuff."
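To make "working the other direction" concrete, here is a small sketch that starts from a field and defines lines by equations $ax+by+c=0$. It uses Python's `Fraction` to stand in for $\Bbb Q$, one particular subfield of $\Bbb R$; the function names `line_through` and `on_line` are made up for this illustration and are not from either book.

```python
from fractions import Fraction

def line_through(p, q):
    """Coefficients (a, b, c) of the line a*x + b*y + c = 0 through distinct points p and q."""
    if p == q:
        raise ValueError("need two distinct points")
    (x1, y1), (x2, y2) = p, q
    a = y1 - y2
    b = x2 - x1
    c = x1 * y2 - x2 * y1
    return a, b, c

def on_line(coeffs, point):
    """True when the point satisfies a*x + b*y + c = 0 exactly."""
    a, b, c = coeffs
    x, y = point
    return a * x + b * y + c == 0

# Two points with rational coordinates (arbitrary example values).
P = (Fraction(1, 2), Fraction(1, 3))
Q = (Fraction(2), Fraction(-1))

coeffs = line_through(P, Q)
midpoint = ((P[0] + Q[0]) / 2, (P[1] + Q[1]) / 2)

print(coeffs)                       # coefficients stay inside the field
print(on_line(coeffs, midpoint))    # the midpoint lies on the same line
```

Because only field operations are used, the same code would work over any commutative field with exact arithmetic, not just the rationals.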
If you are really interested in the details of this, you can find complete proofs in Artin's Geometric Algebra and in Hilbert's Foundations of Geometry. I bet it's in more modern texts too, but these are the ones I happen to know. Hilbert proves lines have the form $ax+by+c=0$ in theorem 34.
Please keep in mind, though, that the general construction deals with more than just subfields of $\Bbb R$. Some of the geometries produce finite fields, and some of the geometries produce noncommutative fields. And above all, some planar geometries are not suitable at all even to produce a division ring. (Geometries which fail Desargues's theorem or Pascal's theorem are examples of geometries which fail the construction.)
Best Answer
You asked for something that wasn't a proof or formal argument, so I hope this helps.
In any geometry, including non-Euclidean geometry (e.g. hyperbolic or spherical geometry), "straight lines" are really called geodesics, which are, roughly speaking, defined to be the shortest paths between two points. This means you stand somewhere holding one end of some rope, your friend stands somewhere else holding the other end, and together you pull the rope taut and this gives you your shortest path.
For example, say you're standing on the surface of a ball (i.e. a 2-sphere, such as the surface of the Earth), and your friend is some way away also on the surface, both of you holding the rope tight. This is spherical geometry. The taut rope or our "straight line" or geodesic is really the shortest path between us that lies on the surface, i.e. where the rope goes. This geodesic will look curved to someone in Euclidean space because there, the geodesic/"straight line" would pass through the ball.
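If you'd like numbers for the rope-on-a-ball picture, here is a small sketch comparing the rope along the surface with the straight Euclidean line through the ball. It assumes a perfect sphere; the radius value and the function name `arc_and_chord` are just choices for this example.

```python
import math

def arc_and_chord(radius, central_angle):
    """Return (surface distance, straight-through distance) for two points on a
    sphere whose directions from the centre differ by central_angle radians."""
    arc = radius * central_angle                      # taut rope along the surface
    chord = 2 * radius * math.sin(central_angle / 2)  # Euclidean segment through the ball
    return arc, chord

R = 6371.0  # assumed radius in km, roughly the Earth's mean radius

results = {deg: arc_and_chord(R, math.radians(deg)) for deg in (1, 45, 90, 180)}

for deg, (arc, chord) in results.items():
    print(f"{deg:3d} degrees apart: surface {arc:9.1f} km, through the ball {chord:9.1f} km")
```

The surface path is never shorter than the chord, and the gap grows as the two of you move further apart.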
Therefore it turns out that our definition of "straight" depends on the geometry we're using and how we pull the rope taut (the metric we use). It just so happens that in Euclidean geometry, this gives us lines that we call straight.
Interesting note: in other spaces, there are some super cool and peculiar metrics that make the taut rope (shortest path) go into weird shapes in Euclidean geometry! One example: https://en.wikipedia.org/wiki/Taxicab_geometry
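Here is a small sketch of why the taxicab metric behaves so strangely. The points and paths are made-up examples, and I measure a path by adding up the distances between its consecutive corners.

```python
import math

def euclidean(p, q):
    """Ordinary straight-line distance in the plane."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def taxicab(p, q):
    """Taxicab (Manhattan) distance: horizontal plus vertical displacement."""
    return abs(q[0] - p[0]) + abs(q[1] - p[1])

def path_length(points, metric):
    """Length of a polygonal path, measured corner to corner with the given metric."""
    return sum(metric(a, b) for a, b in zip(points, points[1:]))

start, end = (0, 0), (3, 4)
paths = {
    "diagonal": [start, end],
    "L-shaped": [start, (3, 0), end],
    "staircase": [start, (1, 0), (1, 2), (3, 2), end],
}

for name, points in paths.items():
    print(name, path_length(points, euclidean), path_length(points, taxicab))
```

In the Euclidean metric only the diagonal achieves the minimum, while in the taxicab metric all three paths have the same length, so the "taut rope" is not even unique there.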