I have a time series of camera locations $L_i$ = $\langle Q_i, P_i \rangle$, with each camera location having an initially unknown global orientation quaternion $Q_i$ and global position vector $P_i$. Between each pair of adjacent camera locations $L_i$ and $L_{i+1}$, there is a known mapping $\langle R_i, T_i \rangle$ consisting of a local rotation quaternion $R_i$ and a local translation vector $T_i$ that rotate and then translate the camera from $L_{i+1}$ back to the local coordinate system of $L_i$ (i.e. $L_i$ is fixed, and the local transformations map or register $L_{i+1}$ to $L_i$). Given a point $X_{i+1}$ representing the coordinates of a specific point $X$ in the local coordinate system of $L_{i+1}$, the coordinates of $X$ may be found for the local coordinate system of $L_i$ using:
$X_i = R_i X_{i+1} R_i^{-1} + T_i$
and therefore,
$X_{i-1} = R_{i-1} (R_i X_{i+1} R_i^{-1} + T_i) R_{i-1}^{-1} + T_{i-1}$.
This rule may be applied iteratively to eventually find $X_0$, which is the coordinates of $X_{i+1}$ in the global coordinate system.
However, applying these transformations backwards, from the last camera location to the first, for a series of 3D point clouds $X_i^j$ captured from each camera location $L_i$, is an $O(N^2)$ operation in the number of camera locations $N$, since all points in each point cloud for a camera location $L_i$ have to be transformed by all $\langle R_j, T_j \rangle$ for $0 \le j \lt i$. Therefore, I would like to invert these transformations so that the "inverse" of each local transform can simply be concatenated to produce a new set of transformations $\langle Q_i, P_i \rangle$ that directly transform a point $X_i$ to $X_0$ (i.e. that directly transform from the coordinate system of $L_i$ into the coordinate system of $L_0$) via the single transform:
$X_0 = Q_i X_i Q_i^{-1} + P_i$
The core question: How can $Q_{i}$ and $P_{i}$ be computed from $\langle R_j, T_j \rangle$, for $0 \le j \lt i\,$? (Note that it is fine to assume that $L_0$ is already in the global coordinate system, rather than $L_{N-1}$, when working from the beginning of the list of transformations, i.e. $Q_0$ is the identity and $P_0$ is zero.)
The problem is that the transformations are defined as a chain back from the last camera position to the first, but I need the cumulative concatenation of transforms from the first camera position to the last. I think this requires inverting the relative order of rotation and translation: given $A_i X_i A_i^{-1} + B_i$, how do you find the equivalent transformation expressed as $C_i (X_i + D_i) C_i^{-1}$?
Best Answer
Third take. Stop changing the definitions! :)
We have a sequence of cumulative rotation unit quaternions $R_i$ and cumulative translations $T_i$ in each local coordinate system, with $Q_i$ describing the rotation from the global coordinate system to local coordinate system at step $i$, and $P_i$ being the sum total translations up to and including step $i$ in the global coordinate system:
$$Q_i = Q_{i-1} R_{i-1} \tag{1a}\label{None1a}$$ or equivalently $$Q_i = Q_{i+1} R_i^{-1} \tag{1b}\label{None1b}$$ and $$P_i = P_{i-1} + Q_{i-1} T_{i-1} Q_{i-1}^{-1} \tag{2a}\label{None2a}$$ or equivalently $$P_i = P_{i+1} - Q_i T_i Q_i^{-1} \tag{2b}\label{None2b}$$
We can use $\eqref{None1a}$ or $\eqref{None1b}$ to construct $Q_0^\prime, Q_1^\prime, \dots, Q_{N-1}^\prime, Q_{N}^\prime$. With $\eqref{None1a}$, set $Q_N^\prime = 1$ and iterate $i = N-1, N-2, \dots, 2, 1, 0$. With $\eqref{None1b}$, set $Q_0\prime = 0$ and iterate $i = 1, 2, \dots, N-1, N$.
Then, fix $Q_k = 1$ (where $0 \le k \le N$), by iterating $$Q_i = Q_k^{\prime -1} Q_i^\prime, \quad \forall i \tag{3}\label{None3}$$
After you have the orientations, you can calculate the total translations $P_i$ using $$P_0^\prime = \vec{0}, \quad P_i^\prime = P_{i-1}^\prime + Q_{i-1} T_{i-1} Q_{i-1}^{-1} \tag{4a}\label{None4a}$$ or $$P_N^\prime = \vec{0}, \quad P_i^\prime = P_{i+1}^\prime - Q_i T_i Q_i^{-1} \tag{4b}\label{None4b}$$ which will yield the same translations except for a constant difference. Again, to fix $P_k = \vec{0}$ for some $0 \le k \le N$, iterate $$P_i = P_i^\prime - P_k^\prime, \quad \forall i \tag{5}\label{None5}$$
As you can clearly see, this has $O(N)$ time complexity.
If $k = N$, use the $(a)$ methods; and if $k = 0$, the $(b)$ methods. This way the fixup is an identity operation, and can be skipped.
Note that with the fixup pass, both $(a)$ and $(b)$ work for all $k$, including $k = 0$ and $k = N$. Choosing one over the other is just an optimization.
Here is an example Python3 program, that implements two helper classes,
Vector
andVersor
, and implements the above logic. (If you save this asexample.py
, you can usepydoc3 example
to see the API it provides. It is in Public Domain, and you can use the classes in your own scripts if you put it in the same directory, and addfrom example import Vector, Versor
.)When run, the program generates ten rotations and translations, and uses the two iteration directions to solve $Q_i$ and $P_i$ with a random $k$.
The output contains first $Q_a^\prime$ and $Q_b^\prime$ before the fixup pass, then $Q_a$, $Q_b$, $P_a$, and $P_b$. If this approach works, then $Q_a = Q_b$ and $P_a = P_b$, with a selected row ($k = 0$ being the first) having $Q_a = Q_b = 1$ and $P_a = P_b = \vec{0}$. The last line reports the maximum component-wise differences between each pair of $Q_a$, $Q_b$ and $P_a$, $P_b$.
Here is the output from an example run:
Hopefully this suffices as a numerical proof that this approach works.