Component-wise averaging of similar quaternions while handling quaternion “double cover issue”

linear algebraquaternionsrotations

To average together quaternions in a well-defined way, the eigendecomposition method of Markley et al. may be used, from Averaging Quaternions, Journal of Guidance, Control, and Dynamics, 30(4):1193-1196, June 2007, Eqs. (12) and (13).

However, if a set of all quaternions are close to each other (meaning that they represent very similar rotations), then element-wise averaging of the quaternions followed by normalization may produce a sufficiently "central" quaternion. (Elementwise averaging is much faster than the eigendecomposition, which is important for some applications.)

However, the quaternions $\bf{q}$ and $\bf-{q}$ represent the same rotation (sometimes called the "double cover issue" of quaternions), so element-wise averaging cannot be applied without first somehow making sure that any quaternions that are to be averaged lie within the same "half" of the rotation group SO(3).

There are several possible methods for "standardizing" each quaternion in a set of quaternions so that the double-cover issue is not a problem, and I wrote about these in this answer, but I am not sure which of these methods is correct (or optimal, and under what assumptions). Some possible methods for standardizing all quaternions ${\bf q}_i \in Q$ (while ensuring that each quaternion still represents the same rotation) include the following:

  1. If the $w$ component is negative, negate the quaternion (i.e. replace ${\bf q}_i$ with $-{\bf q}_i$), so that the $w$ component is positive for all quaternions in the set $Q$.
  2. Take the dot product of ${\bf q}_1$ with all subsequent quaternions ${\bf q}_i$, for $2 \le i \le N$, and negate any of the subsequent quaternions whose dot product with ${\bf q}_i$ is negative.
  3. For each quaternion, measure the angle of rotation about the rotation axis of the quaternion, and normalize it so it always rotates the "short way around", i.e. such that $-\pi \le \theta \le \pi$. If it rotates the "long way around", i.e. $\theta \lt -\pi$ or $\theta \gt \pi$, then negate the quaternion.

These sometimes produce the same result, but they all produce different results in some cases (i.e. they all can negate different quaternions in a set of quaternions) — therefore they are not equivalent.

What is the best way to deal with quaternions in a standardized way in order to overcome the double cover issue in situations like this?

Note that it is not just element-wise averaging of quaternions that can cause the double cover issue to affect the results. Another example is the swing-twist decomposition: in a naive implementation, the recovered rotation component around a given axis can represent either a rotation "the short way around" or a rotation "the long way around", which can lead to some unexpected or unstable results if you care only about the rotation about the axis, not the full quaternion.

Best Answer

As in this answer, let's define $d(\mathbf p, \mathbf q) \triangleq 1 - (\mathbf p \cdot \mathbf q)^2$ to represent the dissimilarity (or "distance") between two quaternions, where $\mathbf p \cdot \mathbf q$ is the usual componentwise inner product of the quaternions treated as four-dimensional vectors.

On the assumption that we are only going to average together quaternions that represent similar orientations, let's suppose that we have a set $Q$ containing some finite positive number of unit quaternions and that there exists some unit quaternion $\mathbf q_0$ (not necessarily a member of $Q$) such that for every $\mathbf q \in Q,$

$$ d(\mathbf q_0, \mathbf q) < \frac12. \tag1 $$

For component-wise averaging to be a good method, I think we would actually want the dissimilarity to be much smaller than this bound. I chose $\frac12$ merely because it is small enough to establish a property I want. If a set $Q$ admits a tighter bound, that's fine; what follows will be just as true, but the final result may be even better.

In particular, $d(\mathbf q_0, \mathbf q) < \frac12$ implies that $\lvert \mathbf q_0 \cdot \mathbf q\rvert > \frac{\sqrt2}2$, which implies that either $\mathbf q_0 \cdot \mathbf q > \frac{\sqrt2}2$ and the angle between $\mathbf q_0$ and $\mathbf q$ is less than $\frac\pi4$, or $-\mathbf q_0 \cdot \mathbf q > \frac{\sqrt2}2$ and the angle between $-\mathbf q_0$ and $\mathbf q$ is less than $\frac\pi4$.

This also implies for any two quaternions $\mathbf p,\mathbf q \in Q,$ that $\mathbf q_0 \cdot \mathbf p$ and $\mathbf q_0 \cdot \mathbf q$ both have signs (positive or negative), that if these signs are the same then the angle between $\mathbf p$ and $\mathbf q$ is less than than $\frac\pi2$ and therefore $\mathbf p \cdot \mathbf q > 0,$ and that if the signs are opposite then the angle between $\mathbf p$ and $\mathbf q$ is greater than than $\frac\pi2$ and therefore $\mathbf p \cdot \mathbf q < 0.$

So we can partition $Q$ into two subsets: the subset $Q_+ = \{\mathbf q\in Q \mid \mathbf q_0 \cdot \mathbf q > 0\}$ and $Q_- = \{\mathbf q\in Q \mid \mathbf q_0 \cdot \mathbf q < 0\}$. Any two quaternions from one subset will have a positive dot product, whereas any two quaternions from different subsets will have a negative dot product.

Now consider method 2. If the quaternion $\mathbf q_1$ is in $Q_+$, then after replacing $\mathbf q_i$ with $-\mathbf q_i$ whenever $\mathbf q_1\cdot\mathbf q_i<0,$ all the quaternions will be in $Q_+$ and the final result of averaging these quaternions and normalizing the result will be some quaternion $\bar{\mathbf q}.$ On the other hand, $\mathbf q_1$ is in $Q_-$, then after replacing $\mathbf q_i$ with $-\mathbf q_i$ whenever $\mathbf q_1\cdot\mathbf q_i<0,$ all the quaternions will be in $Q_-$ and the final result will be $-\bar{\mathbf q},$ that is, the exact opposite of the quaternion we would have gotten if $\mathbf q_1$ were in $Q_+$, representing the exact same rotation.

Hence, given a finite set of orientations that are sufficiently similar, the final result is completely independent of which of the two possible quaternions is selected to represent each orientation. Moreover, the quaternions that are figured into the final average are all relatively close together on the $3$-sphere; whereas if you take any method that is not equivalent to this one, the difference between the method must manifest in the fact that the alternative method averages one or more quaternions from $Q_+$ with one or more quaternions from $Q_-$, which will certainly introduce worse undesired cancellation effects than using quaternions from only one subset.

I would therefore choose method 2.