I don' find the sentence you quoted from wikipedia so much illuminating... It's saying that Euler angles want to decompose a rotation as a product of three rotations $RPY$. It is obvious that every rotation applies to the frame of the previous one. This is exactly the good thing about Euler angles... the bad thing is that if you have a rotation which varies with continuity, the three rotations $R,P,Y$ are not guaranteed to vary continuously.
My suggestion is to think about the real gimbal lock, which might happen to a compass in a plane. Notice that the compass in a plane gives you exactly the Yaw angle. Suppose that the plane is heading north and makes a loop. When the plane passes the vertical the yaw angle passes abruptly from north to south and might cause the compass to lock in an incorrect position.
After a lot of searching, I have finally found a reference that has what I needed. Although it doesn't appear in a 'Quaternion to Euler' or 'Quaternion to Tait-Bryan' google search, when I gave up and started looking for 'Quaternion to Axis-Angle' with the intention of going through that representation as an intermediate step, I came across the following wonderful document:
Technical Concepts
Orientation, Rotation, Velocity and Acceleration,
and the SRM
Version 2.0, 20 June 2008
Author: Paul Berner, PhD
Contributors: Ralph Toms, PhD, Kevin Trott, Farid Mamaghani, David Shen,
Craig Rollins, Edward Powell, PhD
http://www.sedris.org/wg8home/Documents/WG80485.pdf
It covers a lot of the formalisms, but most importantly, shows derivations and solutions for 3-1-3 and 3-2-1 Euler angle representation. It also seems to cover inter-conversion between pretty much every other rotation representation I'm aware of, and so I would also recommend it as a good general reference.
Oh, and the actual solution for a 3-2-1 ($z-y-x$) Tait-Bryan rotation convention from that reference:
$$
\phi = \operatorname{arctan2}\left(q_2 q_3 + q_0 q_1,\frac{1}{2}-(q_1^2 + q_2^2)\right) \\
\theta = \arcsin(-2(q_1 q_3 - q_0 q_2)) \\
\psi = \operatorname{arctan2}\left(q_1 q_2 + q_0 q_3,\frac{1}{2}-(q_2^2 + q_3^2)\right)
$$
Note that the gimbal-lock situation occurs when $2(q_1 q_3 + q_0 q_2) = \pm1$ (which gives a $\theta$ of $\pm \frac{\pi}{2} $), so it can be clearly identified before you attempt to evaluate $\phi$ and $\psi$.
(Convention for arctan2 is $\operatorname{arctan2}(y, x)$, as hinted on page 3.)
Best Answer
This is not an answer, only some advice on how to utilize versors; unit quaternions that represent rotations in three dimensions.
In computer graphics, both 3×3 and 4×4 matrices are used to represent 3D transforms.
Point $\vec{p} = \left [ \matrix { x \\ y \\ z } \right ]$ is transformed by matrix $\mathbf{M} = \left [ \matrix { x_x & y_x & z_x \\ x_y & y_y & z_y \\ x_z & y_z & z_z } \right ] $ via multiplication:
$$\vec{p}' = \mathbf{M} \vec{p}$$
which means that
$$\begin{cases} x' = x \; x_x + y \; y_x + z \; z_x \\ y' = x \; x_y + y \; y_y + z \; z_y \\ z' = x \; x_z + y \; y_z + z \; z_z \end{cases}$$
Note that the three vectors, $\hat{e}_x = \left [ \matrix { x_x \\ x_y \\ x_z } \right ]$, $\hat{e}_y = \left [ \matrix { y_x \\ y_y \\ y_z } \right ]$, and $\hat{e}_z = \left [ \matrix { z_x \\ z_y \\ z_z } \right ]$, are also the unit vectors in the rotated coordinate system.
Because pure rotation matrixes are orthogonal, the inverse of $\mathbf{M}$ is its transpose, i.e. $\mathbf{M}^{-1} = \mathbf{M}^{T} = \left [ \matrix { x_x & x_y & x_z \\ y_x & y_y & y_z \\ z_x & z_y & z_x } \right ]$.
If 4×4 matrices are used, then points are implicitly $\vec{p} = \left [ \matrix { x \\ y \\ z \\ 1 } \right ]$, and the transformation matrix is $\mathbf{M} = \left [ \matrix { x_x & y_x & z_x & t_x \\ x_y & y_y & z_y & t_y \\ x_z & y_z & z_z & t_z \\ 0 & 0 & 1} \right ] $, where the vector $\vec{T} = \left [ \matrix { t_x \\ t_y \\ t_z } \right ]$ is translation after rotation.
I've shown here how to use the translation part $\vec{T}$ to define the matrix as a rotation around some other point than origin, and how to combine translation before and after the rotation into $\vec{T}$.
In the 4×4 matrix case, very little changes programming-wise:
$$\begin{cases} x' = x \; x_x + y \; y_x + z \; z_x + t_x \\ y' = x \; x_y + y \; y_y + z \; z_y + t_y \\ z' = x \; x_z + y \; y_z + z \; z_z + t_z \end{cases}$$
Note that the order in which the matrix elements are stored, varies. Multidimensional arrays in Fortran use column-major order, whereas in C they use row-major order. If you use a 3D toolkit, check its documentation to verify.
Quaternion $q = (w, x, y, z)$ is an unit quaternion or a versor, if $w^2 + x^2 + y^2 + z^2 = 1$.
The versor $q$ corresponds to a rotation matrix $\mathbf{M}$, $$\mathbf{M} = \left[\begin{matrix} 1 - 2 (y^2 + z^2) & 2 (x y - z w) & 2 (x z + y w) \\ 2 (x y + z w) & 1 - 2 (x^2 + z^2) & 2 (y z - x w) \\ 2 (x z - y w) & 2 (y z + x w) & 1 - 2 (x^2 + y^2) \end{matrix}\right]$$
Versor $q$ also corresponds to a rotation around an axis.
Let $\hat{a} = (a_x, a_y, a_z)$ be the axis, and its norm 1, i.e. $a_x^2 + a_y^2 + a_z^2 = 1$. Let $\theta$ be the rotation around that axis. Then,
$$\begin{cases} w = \cos\frac{\theta}{2} \\ x = a_x \sin\frac{\theta}{2} \\ y = a_y \sin\frac{\theta}{2} \\ z = a_z \sin\frac{\theta}{2} \end{cases}$$
To avoid numerical errors, you can always normalize the quaternion.
This is a safe operation (does not introduce any bias in any direction), and is very useful when multiplying quaternions many times using e.g. floating-point representation.
First, calculate $n = \sqrt{w^2 + x^2 + y^2 + z^2}$. Then, $$\begin{cases} w' = \frac{w}{n} \\ x' = \frac{x}{n} \\ y' = \frac{y}{n} \\ z' = \frac{z}{n} \end{cases}$$ where $q' = (w', x', y', z')$ is the normalized versor.
Basically, if you write a program that uses versors, you do this every set of operations, to avoid numerical errors from creeping in.
You can interpolate between two versors, by scaling and adding them together, and normalizing the result (as in step 4. above).
For example, if you have $0 \le p \le 1$ describing "phase", or state between rotations $q_0$ and $q_1$, then
$$\begin{cases} w' = (1-p) w_0 + p w_1 \\ x' = (1-p) x_0 + p x_1 \\ y' = (1-p) y_0 + p y_1 \\ z' = (1-p) z_0 + p z_1\end{cases}$$
and with $n' = \sqrt{w'^2 + x'^2 + y'^2 + z'^2}$, you can compute the desired rotation with $w = w'/n'$, $x = x'/n'$, $y = y'/n'$, and $z = z'/n'$.
Rudders, ailerons, and attitude thrusters, can be represented by the axis and angle they try to rotate the vessel around.
The axis does not change (unless the attitude control element suffers damage), but the angle is dependent on both the force and the duration.
For example, consider a fixed wing aircraft with elevators (controlling pitch), ailerons (controlling roll) and a rudder (controlling yaw).
To simulate their effect on the current orientation of the plane, a programmer would have one quaternion ($q$) representing the orientation of the plane compared to some fixed reference (world coordinates), as well as three variables describing the state of the three control surfaces ($p$, $r$, $a$, say all within -1 and +1).
To update the orientation, we need to know how long the state has been applied; i.e. the length of the time step, $\Delta t$. We also need to know how large a rotation (in radians), in one time unit, each control surface can affect (say, $C_p$, $C_r$, $C_a$). Then, we can calculate $\theta_p = p \; C_p \; \Delta t$, $\theta_r = r \; C_r \; \Delta t$, and $\theta_a = a \; C_a \; \Delta t$.
With these, the programmer can use point 3. above to calculate the versor representing each of the three effects (the axis being the rotation axis the control surface rotates the aircraft around).
The new orientation is obtained by multiplying the current orientation versor by the three control surface versors, and normalizing the result (step 4. above).
Note that here, we successively apply the rotation caused by each control surface. If the time steps (and thus changes in orientation) are small -- and you do need them to be small, in any kind of simulation! --, this works just fine. Mathematically speaking, the order in which we multiply the quaternions matters: the first one is applied to the old orientation, but the second one is applied to the resulting orientation -- as if the first control surface affected the second (and both affect the third). This has nothing to do with quaternions per se, but only on how we model the effect of the control surfaces to the current orientation.