Misunderstanding properties of principal axes for moment of inertia

moment of inertia, reference frames, rigid-body-dynamics, rotational-dynamics

My lecturer has stated that the principal axes of the moment of inertia (hereafter MOI) are a set of axes such that the off-diagonal deviation terms of the MOI tensor disappear. He then said that in these axes, the mass distribution on either side of each coordinate plane will be the same.

I'm misunderstanding under what circumstances the latter condition holds. As the MOI tensor is always real symmetric, I feel that I should ALWAYS be able to find these axes irrespective of whether the origin of the coordinate system is taken to be at the COM or at some other point in the mass distribution. This is because Hermitian matrices are always unitarily diagonalizable. However, I can think of a number of examples in which I can choose the origin to be very close to the edge of an object where it is impossible to construct an orthonormal coordinate system in which the mass distribution is the same on either side, the most trivial case being a uniform sphere. I conclude that I can always choose my axes such that the off-diagonal terms disappear for any origin, but this does not necessarily make the mass distribution symmetric across the coordinate planes.

This has led me to believe that the latter condition only holds at the centre of mass (hereafter COM). However, this also has its problems as by definition of the COM, there should not exist any asymmetry of the mass distribution over any plane going through it, as this would mean that the point in question is not the COM. It follows from this that EVERY orthonormal set centred at the COM should satisfy the definition of the principal axes.

I've obviously made a mistake in my reasoning because this means that the notion of 'principal axes' is either impossible in the first case, or entirely redundant in the second. Any clarification would be greatly appreciated.

Best Answer

Your instincts are good, but in the end, the argument does not hold.

Indeed, the mass moment of inertia (MMOI) tensor is always diagonalizable when evaluated about the COM, but it is also diagonalizable about any other point.

The purpose of the MMOI tensor is to generate rotational momentum from rotational velocity. I am going to show this process for a rigid body.

I prefer the monikers translational/rotational to linear/angular and so I am going to refer to translational and rotational momentum from now on instead of linear momentum and angular momentum. This will be consistent between motion and momentum.

  1. Problem Description

    Consider the body below, whose motion is described by the rotational velocity vector $\vec{\omega}$ and the translational velocity vector $\vec{v}_A$ at some arbitrary point A. This is the point of summation, about which quantities like the MMOI will be evaluated. We place a coordinate system at this point to designate the basis vectors (directions) for the vector components.

    (Figure 1)

  2. Kinematics tells us that the translational velocity of the center of mass point C is

    $$ \vec{v}_C = \vec{v}_A + \vec{\omega} \times \vec{c} \tag{1}$$

    where $\vec{c}$ is the location of the center of mass (COM) measured from the coordinate system at A.

  3. Translational momentum of the body is evaluated by summing the individual momenta of the particles in the body. The result depends only on the motion of the center of mass.

    $$ \vec{p} = m \, \vec{v}_C = m\vec{v}_{A}-m\vec{c}\times\vec{\omega} \tag{2}$$

    which as you can see depends on both the translational velocity of point A and the rotational velocity of the body $\vec{\omega}$.

  4. Rotational momentum about A is evaluated by summing up the moment of momentum for each particle on the body. The result introduces us to the concept of the mass moment of inertia tensor ${\rm I}_C$ as evaluated about the center of mass

    $$ \vec{L}_A = {\rm I}_C \vec{\omega} + \vec{c} \times \vec{p} \tag{3} $$

    The 3×3 MMOI tensor ${\rm I}_C$ can be diagonalized with the correct choice of basis vectors (directions) used to describe our vectors in the first step.

    Now since translational momentum $\vec{p}$ depends on both translational and rotational motion, we can group the rotational parts together to define a new MMOI tensor, ${\rm I}_A$, about point A

    $$\vec{L}_{A}=\underbrace{{\rm I}_{C}\vec{\omega}-m\vec{c}\times\left(\vec{c}\times\vec{\omega}\right)}_{{\rm I}_{A}\vec{\omega}}+\vec{c}\times m\vec{v}_{A} \tag{4}$$

  5. The definition of ${\rm I}_A$ depends on the vector form of the parallel axis theorem and it takes one of the two following forms

    $${\rm I}_{A}={\rm I}_{C}-m[\vec{c}\times][\vec{c}\times]={\rm I}_{C}+m\left(\left(\vec{c}\cdot\vec{c}\right){\rm 1}-\vec{c}\odot\vec{c}\right) \tag{5}$$

    where $[\vec{c}\times]$ is the 3×3 skew-symmetric matrix representing the cross product operation, $\cdot$ is the dot product, $\rm 1$ is the 3×3 identity matrix and $\odot$ is the vector outer product.

    The additional term is always symmetric. Working out the math for $\vec{c}=\pmatrix{x & y & z}$ gives

    $$ -[\vec{c}\times][\vec{c}\times] = \left(\vec{c}\cdot\vec{c}\right){\rm 1}-\vec{c}\odot\vec{c} = \begin{bmatrix} y^2+z^2 & -x y & -x z \\ -x y & x^2+z^2 & -y z \\ -x z & -y z & x^2+y^2 \end{bmatrix} $$
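
    As a quick check of the second equality in (5), the vector triple product identity can be applied to an arbitrary $\vec{\omega}$, using the fact that $(\vec{c}\odot\vec{c})\,\vec{\omega} = \vec{c}\,(\vec{c}\cdot\vec{\omega})$:

    $$ -\vec{c}\times\left(\vec{c}\times\vec{\omega}\right) = \left(\vec{c}\cdot\vec{c}\right)\vec{\omega}-\left(\vec{c}\cdot\vec{\omega}\right)\vec{c} = \left(\left(\vec{c}\cdot\vec{c}\right){\rm 1}-\vec{c}\odot\vec{c}\right)\vec{\omega} $$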

Since we are adding the symmetric matrix ${\rm I}_C$ to the symmetric parallel-axis contribution, the result is also a real symmetric matrix and hence diagonalizable by an orthogonal change of basis.
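
To make this concrete, here is a minimal numerical sketch of eq. (5) using NumPy. The helper names (`skew`, `inertia_about_point`) and the numbers are my own assumptions for illustration: a uniform sphere with $m=2$, $r=1$, and a point A deliberately placed away from the COM, similar to the example in the question.

```python
import numpy as np

def skew(c):
    """Return the 3x3 matrix [c x], so that skew(c) @ w equals np.cross(c, w)."""
    x, y, z = c
    return np.array([[0.0, -z,   y],
                     [z,   0.0, -x],
                     [-y,  x,   0.0]])

def inertia_about_point(I_C, m, c):
    """Parallel axis theorem, eq. (5): I_A = I_C - m [c x][c x]."""
    S = skew(c)
    return I_C - m * (S @ S)

# Assumed example values: uniform sphere, I_C = (2/5) m r^2 * identity
m, r = 2.0, 1.0
I_C = 0.4 * m * r**2 * np.eye(3)
c = np.array([0.3, -0.5, 0.8])   # COM position measured from A (not at the COM)

I_A = inertia_about_point(I_C, m, c)

# Second form of eq. (5), written with the dot and outer products
I_A_outer = I_C + m * (np.dot(c, c) * np.eye(3) - np.outer(c, c))

# I_A is real symmetric, so eigh returns real principal moments and an
# orthonormal set of principal axes (the columns of principal_axes)
principal_moments, principal_axes = np.linalg.eigh(I_A)
I_A_diag = principal_axes.T @ I_A @ principal_axes

print(I_A)                # has non-zero off-diagonal terms in the original basis
print(principal_moments)  # moments about the principal axes at A
```

For these numbers the principal moments come out as roughly 0.8, 2.76 and 2.76: the axis through A and the COM keeps the sphere's own moment, and the two perpendicular axes pick up the extra $m\,\vec{c}\cdot\vec{c}$. So principal axes exist at A even though A is nowhere near the centre of the sphere.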

  6. But you can go a step further and look at the combined translational/rotational components in a 6×6 matrix. Stack together (2) and (4) to get

    $$\begin{Bmatrix}\vec{p}\\ \vec{L}_{A} \end{Bmatrix}=\begin{bmatrix}m{\rm 1} & -m[\vec{c}\times]\\ m[\vec{c}\times] & {\rm I}_{C}-m[\vec{c}\times][\vec{c}\times] \end{bmatrix}\begin{Bmatrix}\vec{v}_{A}\\ \vec{\omega} \end{Bmatrix} \tag{6} $$

    To diagonalize the above 6×6 spatial inertia matrix, you need the off-diagonal blocks to be zero, which happens only when A is at the COM ($\vec{c}=0$), and you need to use the basis vectors that diagonalize ${\rm I}_C$.

So the significance of the COM in dynamics is that it diagonalizes the 6×6 spatial inertia matrix which means the translational and rotational components are decoupled from each other.
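
The decoupling can also be checked numerically. The sketch below builds the 6×6 matrix of eq. (6) with NumPy and compares it against eqs. (1), (2) and (3) evaluated directly. The function names and all numerical values are assumptions chosen only for illustration.

```python
import numpy as np

def skew(c):
    """Return the 3x3 matrix [c x], so that skew(c) @ w equals np.cross(c, w)."""
    x, y, z = c
    return np.array([[0.0, -z,   y],
                     [z,   0.0, -x],
                     [-y,  x,   0.0]])

def spatial_inertia(I_C, m, c):
    """6x6 matrix of eq. (6), mapping (v_A, omega) to (p, L_A)."""
    S = skew(c)
    return np.block([[m * np.eye(3), -m * S],
                     [m * S,         I_C - m * (S @ S)]])

# Assumed example body, with I_C already expressed in its principal axes
m = 2.0
I_C = np.diag([1.0, 2.0, 3.0])
c = np.array([0.3, -0.5, 0.8])        # COM position measured from A

M_A = spatial_inertia(I_C, m, c)            # summation point away from the COM
M_C = spatial_inertia(I_C, m, np.zeros(3))  # summation point at the COM

# Assumed motion of the body
v_A = np.array([1.0, 0.0, -2.0])
omega = np.array([0.5, 1.5, -1.0])

momenta = M_A @ np.concatenate([v_A, omega])
p, L_A = momenta[:3], momenta[3:]

# Cross-check against eqs. (1), (2) and (3)
v_C = v_A + np.cross(omega, c)
p_direct = m * v_C
L_direct = I_C @ omega + np.cross(c, p_direct)

print(np.allclose(p, p_direct), np.allclose(L_A, L_direct))
print(np.allclose(M_C, np.diag(np.diag(M_C))))  # fully diagonal only at the COM
```

With the summation point at A the off-diagonal blocks $\pm m[\vec{c}\times]$ are non-zero, so a pure rotation contributes to $\vec{p}$ and a pure translation contributes to $\vec{L}_A$. With $\vec{c}=0$ those blocks vanish and the matrix reduces to ${\rm diag}(m, m, m, I_1, I_2, I_3)$.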

So choosing not only specific basis vectors but also a specific point of summation leads to simplified representations of the governing quantities.

And Newton's second law connects the change in momentum vectors to external forces and torques and thus the above diagonalization has implications beyond just the formulation of momenta.

PS 1. The combined translational/rotational equations of motion are called the Newton-Euler equations of motion. The simplest (diagonalized) form is $$\begin{Bmatrix}\vec{F}=m\,\vec{a}_{C}\\ \vec{\tau}_{C}={\rm I}_{C}\vec{\alpha}+\vec{\omega}\times{\rm I}_{C}\vec{\omega} \end{Bmatrix}$$
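
As a small illustration of the rotational equation, the sketch below solves it for $\vec{\alpha}$ with NumPy. The function name `angular_acceleration` and the inertia values are assumptions for the example, not part of any library.

```python
import numpy as np

def angular_acceleration(I_C, omega, tau_C):
    """Solve tau_C = I_C alpha + omega x (I_C omega) for alpha."""
    return np.linalg.solve(I_C, tau_C - np.cross(omega, I_C @ omega))

# Assumed body, expressed in its principal axes at the COM
I_C = np.diag([1.0, 2.0, 3.0])
no_torque = np.zeros(3)

# Spin about a principal axis: I_C omega is parallel to omega, so alpha = 0
alpha_principal = angular_acceleration(I_C, np.array([0.0, 0.0, 4.0]), no_torque)

# Spin about a non-principal direction: the omega x (I_C omega) term is non-zero
alpha_general = angular_acceleration(I_C, np.array([1.0, 1.0, 1.0]), no_torque)

print(alpha_principal)
print(alpha_general)
```

With no applied torque, spin about a principal axis gives zero rotational acceleration, while spin about the direction (1, 1, 1) gives a non-zero $\vec{\alpha}$ because ${\rm I}_C\vec{\omega}$ is no longer parallel to $\vec{\omega}$.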

PS 2. The treatment of mechanics using combined translational/rotational vectors is called screw theory and it has strong connections to the geometry of skew lines in space.