Classical Mechanics – Axiomatising to Derive the Principle of Stationary Action

classical-mechanics, definition, lagrangian-formalism, momentum, variational-principle

$\newcommand{\d}{\mathrm{d}}\newcommand{\l}{\mathcal{L}}$Throughout all my study of physics, it has never been clear what is a definition, what is an axiom, what is a law and what is a proof in physics. There are many major results, like Newton's laws, Maxwell's equations, the focus of this question – stationary action, conservation of energy, conservation of momentum, potential energy, etc. and all my research on these concepts has lead me slightly in circles. The existence of relativity theory doesn't help, either: when Einstein wrote $E^2=m^2c^4+p^2c^2$, at the end of a presumably lengthy derivation, what definition of $p$ did he use? A photon has momentum but not mass; the top answer on this site that I saw referenced the above equation and derived that the momentum of a photon is "simply $p=E/c$", but to students without the right background, such as myself, this is very circular! Clearly the definition $p=mv$ does not suffice, but what else did Einstein take as his fundamental definition?

Onto the main question:

I will make what I consider to be the most fundamental definitions, not because I've been taught it this way but because this is what I've pieced together, and I'll try to proceed from there. My question to the users on this site is to correct me on my "fundamental" definitions, such as which definition of $p$ did Einstein use, as a formal ground-up construction of physics is not something I have ever seen, which is a shame to me as a mostly mathematical student – in my eyes, without given axioms and definitions, nothing can be shown. I have also not studied relativity and will be working in classical mechanics here, but I would like to have definitions consistent with relativity and quantum theory too. The goal is to arrive at the principle of stationary action, and yes, this has been discussed on this site before but all the answers invoked definitions I am not comfortable with, hence this question. I am confident that my outline of definitions below will be somehow in the wrong order, or wrong in some way, since the definitions are my own in the sense that they are the result of my attempt to make everything non-circular.

Axioms based on observation: every body of a physical system has some mass $m$, a resistance to motion, a position in $3D$ space, $s$, and events occur with respect to some order given by time $t$. Bodies in a system also have an energy $E$, a quantity representing their capacity to act on other bodies in the system, and bodies oppose the transfer of energy. All such properties are observed as, and are assumed to always be, mathematically continuous, differentiable and integrable quantities with respect to time. More definitions:

  1. The velocity $v$ of a body is the time derivative of its position $s$, and the acceleration $a$ is the time derivative of $v$.
  2. The momentum of the body, $p$, is the partial derivative of the body's energy $E$ with respect to the body's velocity.
  3. The net force acting on a body, $F(t)$, at some given time $t$ is the partial time derivative of $p$: $$F=\frac{\partial p}{\partial t}=\frac{\partial^2 E}{\partial t\,\partial v}$$
  4. The kinetic energy, $T$, transferred over some time interval $[t_0,t_1]$ to a body is the integral of external forces with respect to the position measure: $$T=\int_{t_0}^{t_1}F(t)\cdot\d s(t)=\int_{t_0}^{t_1}F(t)\cdot v(t)\,\d t$$
  5. When a body is moved along a path $L$ by an external force $F(s)$ acting at every point $s$ of the path, a potential energy $U$ is also transferred to it: $$U=-\int_LF(s)\cdot\d s$$ where the negative sign represents the body's opposition to the change in position, which exists by axiomatic assumption.
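
As a worked check of definition 4, here is a short sketch. It assumes constant mass and the Newtonian form $F=m\,\mathrm{d}v/\mathrm{d}t$, neither of which follows from the definitions alone. Under those assumptions the integral reduces to the familiar change in $\frac{1}{2}mv^2$:

$$T=\int_{t_0}^{t_1}F\cdot v\,\mathrm{d}t=\int_{t_0}^{t_1}m\frac{\mathrm{d}v}{\mathrm{d}t}\cdot v\,\mathrm{d}t=\int_{t_0}^{t_1}\frac{\mathrm{d}}{\mathrm{d}t}\left(\tfrac{1}{2}m\,v\cdot v\right)\mathrm{d}t=\tfrac{1}{2}m|v(t_1)|^2-\tfrac{1}{2}m|v(t_0)|^2$$

So, at least in this special case, $T$ as defined is the kinetic energy gained over $[t_0,t_1]$.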

I believe these definitions make sense since if one naively takes $E=\frac{1}{2}mv^2$, then we get the familiar $p=\partial_vE=mv$, and if $m$ is constant then $F=\partial_tp=ma$ is the familiar definition of force. I am unsure if they are consistent with more general theories like relativity. It is important for me to have fundamental definitions of these quantities that are consistent with all of physics, so that I don't get confused when I study those topics later on. I have seen the energetic definitions floating around, and I think what I've written there is non-circular and correct. However, the clause "by axiomatic assumption" at the end there is dubious! Moreover, I have never seen $F=\partial_{vt}E$ written anywhere, so there is probably something awry with that definition. Another problem is that it isn't clear that the definitions of $T$ and $U$ are consistent with each other, in that it isn't clear that they both represent the energy $E$ (which I defined very vaguely…)
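
The naive case can also be checked numerically. The following is only a minimal sketch: the mass, acceleration, time and step sizes are illustrative assumptions, and the derivatives in definitions 2 and 3 are estimated by central differences rather than computed symbolically.

```python
# Sanity check of definitions 2 and 3 for the naive choice E = (1/2) m v^2.
# All numerical values below are illustrative assumptions.

def energy(m, v):
    """Naive energy E = (1/2) m v^2."""
    return 0.5 * m * v * v

def momentum(m, v, h=1e-4):
    """Definition 2: p = dE/dv, estimated by a central difference."""
    return (energy(m, v + h) - energy(m, v - h)) / (2 * h)

def velocity(t, a):
    """Velocity of a body starting from rest under constant acceleration a."""
    return a * t

def force(m, t, a, h=1e-3):
    """Definition 3: F = dp/dt along the trajectory, by a central difference."""
    p_plus = momentum(m, velocity(t + h, a))
    p_minus = momentum(m, velocity(t - h, a))
    return (p_plus - p_minus) / (2 * h)

m, a, t = 2.0, 3.0, 1.5
v = velocity(t, a)
p_numeric = momentum(m, v)
f_numeric = force(m, t, a)
print(p_numeric, m * v)   # the two values should agree closely (p = m v)
print(f_numeric, m * a)   # the two values should agree closely (F = m a)
```

This only confirms the definitions for this one choice of $E$; it says nothing about whether they survive in other theories.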

Note that $3$ is just Newton’s law, but I’ve always felt it is not really a law (I.e. a logical consequence of something else) but more just a definition (otherwise what is the meaning of force?) If I am mistaken in this belief please correct it!

Onto the Lagrangian mechanics:

Define $\l(s(t),s'(t))=T(s'(t))-U(s(t))$ to be the Lagrangian of a body. The body will take a (possibly zero) path $\gamma$ in the system as time goes on, from positions $s_0$ to $s_1$. Define $A(\gamma)=\int_\gamma\l(t)\,\d t$ to be the action of the body. It will be shown that the path taken by the body, $\gamma$, is that for which $A(\gamma)$ has a stationary point with respect to the space of smooth paths in the system.
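
To make the claim concrete, here is a minimal numerical sketch. The setup is an assumption chosen for illustration (a body in uniform gravity, thrown up and returning to its starting height), and the action integral is replaced by a simple sum over short segments. It compares the action on the Newtonian path with the action on nearby paths that share the same endpoints.

```python
import math

# Illustrative assumptions (not from the text above): a body of mass m in
# uniform gravity g, moving vertically from s=0 at t=0 back to s=0 at t=T.
m, g, T, N = 1.0, 9.81, 1.0, 1000
dt = T / N
times = [i * dt for i in range(N + 1)]

def action(path):
    """Discretised A = integral of (T - U) dt with T = m v^2 / 2, U = m g s."""
    total = 0.0
    for i in range(N):
        v = (path[i + 1] - path[i]) / dt          # velocity on the segment
        s_mid = 0.5 * (path[i] + path[i + 1])     # position at segment midpoint
        total += (0.5 * m * v * v - m * g * s_mid) * dt
    return total

# Path obeying Newton's law s'' = -g with the chosen endpoints.
newton = [0.5 * g * t * (T - t) for t in times]

def perturbed(eps):
    """Newtonian path plus a bump that vanishes at both endpoints."""
    return [s + eps * math.sin(math.pi * t / T) for s, t in zip(newton, times)]

eps = 1e-2
a0 = action(newton)
a_plus = action(perturbed(+eps))
a_minus = action(perturbed(-eps))
slope = (a_plus - a_minus) / (2 * eps)   # first-order change of A with eps
print(a0, a_plus - a0, a_minus - a0, slope)
```

In this example the first-order change `slope` is zero up to rounding, while both perturbed paths have a slightly larger action, so the Newtonian path is a stationary point (here a minimum) within this one-parameter family. A single family of perturbations is of course not a proof.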

The usual variational calculus derivation occurs, and one arrives at:

$$\frac{\partial\l}{\partial s}=\frac{\d}{\d t}\frac{\partial\l}{\partial s'}$$

as a necessary condition on $s$ along the path $\gamma$. From definitions $2$ and $3$, which follow Newton, we have that:

$$\frac{\partial\l}{\partial s}=-\frac{\partial}{\partial s}U(s)=F(t),\quad\frac{\d}{\d t}\frac{\partial\l}{\partial s'}=\frac{\d}{\d t}\frac{\partial}{\partial s'}T(s'(t))=\frac{\d}{\d t}p(t)=F(t)$$

which is consistent with definition $5$ by the fundamental theorem of calculus. "Therefore" the principle of stationary action is correct and Lagrangian mechanics is consistent with Newtonian mechanics.
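
For reference, the variational step summarised above can be sketched as follows. It assumes a variation $s(t)\to s(t)+\epsilon\,\eta(t)$ with $\eta(t_0)=\eta(t_1)=0$ and enough smoothness to integrate by parts; the symbols $\epsilon$ and $\eta$ are introduced here for illustration only. To first order in $\epsilon$:

$$\delta A=\epsilon\int_{t_0}^{t_1}\left(\frac{\partial\mathcal{L}}{\partial s}\eta+\frac{\partial\mathcal{L}}{\partial s'}\eta'\right)\mathrm{d}t=\epsilon\left[\frac{\partial\mathcal{L}}{\partial s'}\eta\right]_{t_0}^{t_1}+\epsilon\int_{t_0}^{t_1}\left(\frac{\partial\mathcal{L}}{\partial s}-\frac{\mathrm{d}}{\mathrm{d}t}\frac{\partial\mathcal{L}}{\partial s'}\right)\eta\,\mathrm{d}t$$

The boundary term vanishes because $\eta$ is zero at both endpoints, so $\delta A=0$ for every admissible $\eta$ exactly when the bracket in the last integral vanishes along the path.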

Is this correct? I assume my definitions are wobbly… I'd greatly appreciate help on the fundamental construction of mechanics theory. The “therefore” feels quite weak; is it an actual conclusion?

Best Answer

Others have gotten into the weeds, so I'll step back a bit. Sorry to break out a mix of maths, philosophy & physics here, but:

it has never been clear what is a definition, what is an axiom, what is a law and what is a proof

Even in pure mathematics, attempting an axiom/definition distinction overlooks the role of axioms as implicit definitions. The axioms of Euclidean geometry, PA and ZFC respectively discuss "points", "natural numbers" and "sets". They don't explicitly define these, but they characterize them by the axioms they satisfy, to the point their names are only of historical relevance, in that these axioms attempt to capture older intuitive notions tied to natural languages' words. Hilbert made a famous comment about this.

A law is presumably a theorem, or stylized variant thereof, satisfying additional criteria I won't try to summarize. It's certainly not a matter of being fundamental, at least not long after a law is named. As for proofs, the real challenge is in choosing what to assume, not in identifying what was assumed or how it has specific consequences. In the empirical sciences, we have an "it works" criterion; the closest parallel in mathematics is consistency, which is much less selective. That's not to say other criteria aren't used for further selection, though.

all my research on these concepts has lead [sic] me slightly in circles.

There are many equivalent formulations of (for example) mechanics that have different domains of easiest usage, which is why they're all worth learning. If you ask "which is fundamental?", none are. One might be the oldest, but "they're all right, because they're all one theory, which is right" is all the justification physics needs. Why are they right? Mathematics can't prove they are, but evidence kinda sorta can (see also Sec. 7.1 here).

I will make what I consider to be the most fundamental definitions... correct me on my "fundamental" definitions... a formal ground-up construction of physics is not something I have ever seen

For the above reasons, this may be the wrong aim.

without given axioms and definitions, nothing can be shown

Not a priori, no. But that's not how science works. Philosophical subtleties aside, it at least occasionally glances at the world to discover its contingent truths.

The goal is to arrive at the principle of stationary action

I realize you have specific goals I've so far overlooked. If you want something deeper than that principle, this will interest you. The basic idea is quantum amplitudes interfere constructively and destructively, and the mean effect is... basically what the classical version of the aforementioned principle says.
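
A toy illustration of that interference, under assumptions of my own choosing: a free particle whose classical path is $x(t)=0$, a one-parameter family of trial paths $x(t)=a\sin(\pi t/T)$, and a made-up small value of $\hbar$. It adds up phases $e^{iS/\hbar}$ over two equally wide windows of $a$, one around the stationary path and one far from it.

```python
import cmath
import math

# Toy model, all values assumed for illustration: a free particle of mass m
# goes from x=0 to x=0 in time T, so the classical path is x(t)=0. Trial paths
# x(t) = a*sin(pi*t/T) have action S(a) = m*pi^2*a^2 / (4*T), stationary at a=0.
m, T, hbar = 1.0, 1.0, 0.01

def action(a):
    """Action of the trial path with amplitude a (kinetic term only)."""
    return m * math.pi ** 2 * a ** 2 / (4 * T)

def summed_amplitude(a_lo, a_hi, n=4000):
    """Midpoint-rule sum of exp(i*S/hbar) over paths with a in [a_lo, a_hi]."""
    da = (a_hi - a_lo) / n
    total = 0j
    for k in range(n):
        a = a_lo + (k + 0.5) * da
        total += cmath.exp(1j * action(a) / hbar) * da
    return total

near = abs(summed_amplitude(-0.2, 0.2))   # window around the stationary path
far = abs(summed_amplitude(0.8, 1.2))     # equally wide window far from it
print(near, far)
```

The window around the stationary path contributes far more than the distant one, where neighbouring phases largely cancel. This is only a cartoon of the real path integral, which sums over all paths rather than a single family.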

they are the result of my attempt to make everything non-circular

One delicious consequence of empirical knowledge is you don't need to worry about whether you're circular. A "concise" characterization of your theory that doesn't repeat itself the way a circular theory might doesn't have any predictive advantages over one that's open to such an accusation. We know we're (probably approximately) correct because the world tells us so.

Axioms based on observation: every body of a physical system has some mass $m$, a resistance to motion, a position in $3D$ space, $s$, and events occur with respect to some order given by time $t$.

It's one thing to say observation warrants such claims; it's another to take them as axioms, or as unique axioms. Ultimately the entire theory is equally corroborated by the evidence, as a whole; data doesn't say which parts to treat as axioms. The role of proofs (putting aside for the moment how deep we have to go to hit "axioms", whose choice may not be unique) is to help us organize the explanation of many observations as the consequences of a few ideas, so that (i) if we discover we're wrong (as sometimes happens!) we have a shortlist of what might "have to give", and (ii) we're motivated to pursue unification, so we're not just stamp-collecting.

It is important for me to have fundamental definitions of these quantities that are consistent with all of physics, so that I don't get confused when I study those topics later on.

Sadly, we sometimes revise attempts at such axioms when "all" of physics expands. If it works, it works. My main advice for you, however, is to focus on Lagrangian formulations, if only because they've typically had the least trouble adapting in this manner. For example, Lagrangian mechanics accommodates relativity by becoming a field theory, which accommodates quantum effects with operators. Perhaps the biggest upset this will cause your apple cart is the need to focus on canonical, not kinetic, momenta.
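
As a sketch of how momentum defined from the Lagrangian adapts, take the standard free relativistic particle, $\mathcal{L}=-mc^2\sqrt{1-v^2/c^2}$ (an assumption brought in here for illustration, not something derived above). With $\gamma=1/\sqrt{1-v^2/c^2}$:

$$p=\frac{\partial\mathcal{L}}{\partial v}=\frac{mv}{\sqrt{1-v^2/c^2}}=\gamma mv,\qquad E=pv-\mathcal{L}=\gamma mc^2,\qquad E^2-p^2c^2=m^2c^4$$

By contrast, differentiating the energy with respect to velocity gives $\partial E/\partial v=\gamma^3mv$, which agrees with $p$ only when $v\ll c$. So a definition of the form $p=\partial E/\partial v$ would need revising here, while $p=\partial\mathcal{L}/\partial v$ carries over unchanged.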

the clause "by axiomatic assumption" at the end there is dubious

Why? An axiom can assume whatever it wants. As long as the final theory is neither inconsistent nor at odds with observation, you're fine.

it isn't clear that the definitions of $T$ and $U$ are consistent with each other, in that it isn't clear that they both represent the energy $E$ (which I defined very vaguely...)

If by vaguely you mean implicitly, sure, but that's fine. And if two quantities aren't "clearly" equal, an axiom can say they are, and hopefully that's never false. As I said, I'm leaving other answers to comment on the general validity of your axioms.

it is not really a law (I.e. [sic] a logical consequence of something else) but more just a definition (otherwise what is the meaning of force?)

Oh, you're definitely wrong about that.