Could someone please convince me that there is something natural about
the choice of the Lagrangian formulation...
If I ask a high school physics student, "I am swinging a ball on a string around my head in a circle. The string is cut. Which way does the ball go?", they will probably tell me that the ball goes straight out - along the direction the string was pointing when it was cut. This is not right; the ball actually goes along a tangent to the circle, not a radius. But the beginning student will probably think this is not natural. How do they lose this instinct? Probably not by one super-awesome explanation. Instead, it's by analyzing more problems, seeing the principles applied in new situations, learning to apply those principles themselves, and gradually, over the course of months or years, building what an undergraduate student considers to be common intuition.
So my guess is no, no one can convince you that the Lagrangian formulation is natural. You will be convinced of that as you continue to study more physics, and if you expect to be convinced of it all at once, you are going to be disappointed. It is enough for now that you understand what you've been taught, and it's good that you're thinking about it. But I doubt anyone can quickly change your mind. You'll have to change it for yourself over time.
That being said, I think the most intuitive way to approach action principles is through the principle of least (i.e. stationary) time in optics. Try Feynman's QED, which gives a good reason to believe that the principle of stationary time is quite natural.
You can go further mathematically by learning the path integral formulation of nonrelativistic quantum mechanics and seeing how it leads to high probability for paths of stationary action.
More importantly, just use Lagrangian mechanics as much as possible, and not just finding equations of motion for twenty different systems. Use it to do interesting things. Learn how to see the relationship between symmetries and conservation laws in the Lagrangian approach. Learn about relativity. Learn how to derive electromagnetism from an action principle - first by studying the Lagrangian for a particle in an electromagnetic field, then by studying the electromagnetic field itself as described by a Lagrange density. Try to explain it to someone - their questions will sharpen your understanding. Check out Leonard Susskind's lectures on YouTube (series 1 and 3 especially). They are the most intuitive source I know for this material.
Read some of the many questions here in the Lagrangian or Noether tags. See if you can figure out their answers, then read the answers people have provided to compare.
If you thought that the Lagrangian approach was wrong, then you might want someone to convince you otherwise. But if you just don't feel comfortable with it yet, you'd be robbing yourself of a great pleasure by not taking the time to learn its intricacies.
Finally, your question is very similar to this one, so check out the answers there as well.
Perhaps a simple example is in order. Consider a simple harmonic oscillator (SHO)
$$ S~=~\int_{t_i}^{t_f} \! dt~L, \qquad
L~=~\frac{m}{2}\dot{x}^2 - \frac{k}{2}x^2, \tag{1}$$
with characteristic frequency
$$ \frac{2\pi}{T}~=~\omega~=~\sqrt{\frac{k}{m}}, \tag{2}$$
and Dirichlet boundary conditions
$$ x(t_i)~=~x_i \quad\text{and}\quad x(t_f)~=~x_f. \tag{3}$$
It can be shown that the classical path is only a minimum for the action (1) if the time period
$$ \Delta t~:=~t_f-t_i~\leq~ \frac{T}{2}\tag{4}$$
is smaller than a characteristic time scale $\frac{T}{2}$ of the problem. (If $\Delta t=\frac{T}{2}$ there is a zeromode.) For $\Delta t>\frac{T}{2}$ the classical path is no longer a minimum for the action (1), but only a saddle point. If we consider bigger and bigger $\Delta t$, a new negative mode/direction develops/appears each time $\Delta t$ crosses a multiple of $\frac{T}{2}$.
It is such examples that Ref. 1. has in mind when saying that the principle of least action is actually a principle of stationary action. The above phenomenon is quite general, and related to conjugated points/turning points and Morse theory. In semiclassical expansion of quantum mechanics, this behaviour affects the metaplectic correction/Maslov index. See e.g. Ref. 2 for further details.
A similar phenomenon takes place in geometrical optics, where it is straightforward to construct examples of light paths that do not minimize the time, cf. Fermat's principle of least time.
References:
Landau and Lifshitz, Vol.2, The Classical Theory of Fields, p. 24.
W. Dittrich and M. Reuter, Classical and Quantum Dynamics, 1992, Chapter 3.
Best Answer
If we solve equations of motion for a particle with mass $m=1$ in some potential, e.g. $U=x^4-4x^3+4.5x^2$, fixing two points like $x(0)=0$ and $x(1)=2.651$, we'll get infinite number of solutions, here're some of them:
They differ by initial velocity. Now each of them satisfies equations of motion, but only one makes the action minimal, here's how $S$ depends on $v_0$ for such paths in this example:
As all they satisfy equations of motion, they make the action stationary, and it'd be wrong to just throw them away because they don't minimize the action.