Looking at a general basis will not help much, I guess, but what you have done can be completed: Note that the subspace $U = \def\<#1>{\left<#1\right>}\<A_1, A_2>$ of $\def\sl{\mathfrak{sl}(2, \mathbf R)}\sl$ generated by $A_1$ and $A_2$ is a subalgebra, as $[A_1, A_2] = A_2 \in U$. So $\sl$ has a two-dimensional subalgebra. We will prove that $\def\su{\mathfrak{su}(2,\mathbf C)}\su$ does not have any. Define $\phi \colon \su \to \mathbf R^3$ by $\phi(\sigma_i) = e_i$, where $e_i$ denotes the $i$-th standard unit vector. Then we have
$$ \phi([\sigma_i, \sigma_j]) = \phi(\epsilon_{ijk}\sigma_k)=\epsilon_{ijk}e_k = e_i \times e_j = \phi(\sigma_i) \times \phi(\sigma_j) $$
where $\epsilon_{ijk}$ is the Levi-Civita symbol and $\times$ denotes the usual cross product. Hence
$$ \phi([A, B]) = \phi(A) \times \phi(B),\qquad A,B \in \su $$
by bilinearity, which implies that $\phi$ is an isomorphism of algebras from $\su$ onto $(\mathbf R^3, \times)$. Now let $A, B \in \su$ be linearly independent. Then $\phi(A)$ and $\phi(B)$ span a two-dimensional linear subspace $V$ of $\mathbf R^3$, hence $\phi(A) \times \phi(B)$ is a non-zero vector orthogonal to $V$; that is, $\phi([A,B]) \not\in V$, giving, as $\phi$ is invertible, $[A,B]\not\in \phi^{-1}[V] = \<A,B>$. So $A, B$ do not generate a two-dimensional subalgebra. As $A$ and $B$ were arbitrary, there isn't any.
Therefore $\su$ and $\sl$ aren't isomorphic.
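The argument can be spot-checked numerically. The following is only a sketch under stated assumptions: it takes $\sigma_k = -\tfrac{i}{2}\,(\text{$k$-th Pauli matrix})$ as the basis of $\su$ (one normalisation for which $[\sigma_i,\sigma_j]=\epsilon_{ijk}\sigma_k$ holds), and it uses a hypothetical choice of $A_1$, $A_2$ in $\sl$ satisfying $[A_1,A_2]=A_2$, since the matrices from the question are not reproduced here. The helper names (`phi`, `from_vector`, ...) are mine.

```python
import random

def matmul(X, Y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def bracket(X, Y):
    """Commutator [X, Y] = XY - YX."""
    XY, YX = matmul(X, Y), matmul(Y, X)
    return [[XY[i][j] - YX[i][j] for j in range(2)] for i in range(2)]

# Assumed basis of su(2): sigma_k = -(i/2) * (k-th Pauli matrix).
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
SIGMA = [[[-0.5j * entry for entry in row] for row in P] for P in PAULI]

def from_vector(a):
    """Inverse of phi: the matrix a[0]*sigma_1 + a[1]*sigma_2 + a[2]*sigma_3."""
    return [[sum(a[k] * SIGMA[k][i][j] for k in range(3)) for j in range(2)]
            for i in range(2)]

def phi(A):
    """Coordinates of A in the basis SIGMA, so that phi(sigma_k) = e_k."""
    a1 = 1j * (A[0][1] + A[1][0])
    a2 = A[1][0] - A[0][1]
    a3 = 2j * A[0][0]
    return [a1.real, a2.real, a3.real]

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

random.seed(0)
a = [random.uniform(-1, 1) for _ in range(3)]
b = [random.uniform(-1, 1) for _ in range(3)]

# phi([A, B]) should agree with phi(A) x phi(B) up to rounding.
lhs = phi(bracket(from_vector(a), from_vector(b)))
rhs = cross(a, b)
print("max deviation:", max(abs(x - y) for x, y in zip(lhs, rhs)))

# The cross product is orthogonal to both factors, so [A, B] leaves span{A, B}.
print("dot products:", dot(rhs, a), dot(rhs, b))

# Hypothetical A_1, A_2 in sl(2, R) with [A_1, A_2] = A_2.
A1 = [[0.5, 0], [0, -0.5]]
A2 = [[0, 1], [0, 0]]
print("[A1, A2] == A2:", bracket(A1, A2) == A2)
```

The first two printed values should be at rounding level, which is the numerical counterpart of $\phi([A,B]) \perp V$.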
Consider the operators $$K_{+}=\frac{e^{i\theta}}{2}{a^{\dagger}} ^2~,~ K_{-}=\frac{e^{-i\theta}}{2}a^2~,~ K_0=a^{\dagger}a+\frac{1}{2}$$
These obey the exact same commutation relations quoted above. The algebra of these operators is linear and closes amongst the operators (hence is a Lie algebra). This is a really important fact for the considerations that follow below.
Consider now a one-dimensional flow along an arbitrary curve in the Lie group space that these generators span:
$$f(\lambda)=\exp(\alpha(\lambda)K_+)\exp(\beta(\lambda)K_0)\exp(\gamma(\lambda)K_-)$$
with $$\alpha(0)=\beta(0)=\gamma(0)=0$$
Denote the finite set of generators of the group as $\bar{G}=\{G_i, i=1,..., \dim(G)\}$.
Since the algebra closes and is linear, we can find functions $\Delta_{ijk}(s)$ such that
$$e^{sG_i}G_je^{-sG_i}=\sum_{k=1}^{\dim(G)}\Delta_{ijk}(s)G_k$$
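As a concrete instance of such functions $\Delta_{ijk}(s)$: assuming the commutation relations are the ones that follow from $[a,a^\dagger]=1$ for the operators above, namely $[K_0,K_\pm]=\pm 2K_\pm$ and $[K_+,K_-]=-K_0$, the series $e^{sX}Ye^{-sX}=Y+s[X,Y]+\tfrac{s^2}{2}[X,[X,Y]]+\dots$ terminates for $X=K_+$ and gives $e^{sK_+}K_0e^{-sK_+}=K_0-2sK_+$ and $e^{sK_+}K_-e^{-sK_+}=K_--sK_0+s^2K_+$. The sketch below checks this in a hypothetical $2\times 2$ matrix representation with the same commutators (my choice, not the Fock-space operators themselves); `expm` is a plain truncated Taylor series, not a library routine.

```python
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(X, Y):
    return [[X[i][j] + Y[i][j] for j in range(2)] for i in range(2)]

def scale(c, X):
    return [[c * X[i][j] for j in range(2)] for i in range(2)]

def commutator(X, Y):
    return add(matmul(X, Y), scale(-1.0, matmul(Y, X)))

def expm(X, terms=40):
    """Matrix exponential via a truncated Taylor series (fine for small 2x2 inputs)."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        term = scale(1.0 / n, matmul(term, X))  # term is now X^n / n!
        result = add(result, term)
    return result

def close(X, Y, tol=1e-12):
    return all(abs(X[i][j] - Y[i][j]) < tol for i in range(2) for j in range(2))

# Assumed 2x2 representation obeying [K0, K+-] = +-2 K+-, [K+, K-] = -K0.
K_PLUS = [[0.0, 1.0], [0.0, 0.0]]
K_MINUS = [[0.0, 0.0], [-1.0, 0.0]]
K_ZERO = [[1.0, 0.0], [0.0, -1.0]]

def conjugate_by(s, X, Y):
    """Return exp(sX) Y exp(-sX)."""
    return matmul(matmul(expm(scale(s, X)), Y), expm(scale(-s, X)))

s = 0.7
expected_K0 = add(K_ZERO, scale(-2.0 * s, K_PLUS))
expected_Km = add(add(K_MINUS, scale(-s, K_ZERO)), scale(s * s, K_PLUS))

print(close(conjugate_by(s, K_PLUS, K_ZERO), expected_K0))
print(close(conjugate_by(s, K_PLUS, K_MINUS), expected_Km))
```

Here the coefficients $1$, $-2s$, $-s$ and $s^2$ play the role of the $\Delta_{ijk}(s)$ for $G_i=K_+$.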
We specialize to the case considered here, even though the theorem holds under the very general assumptions outlined above. Take the derivative of $f(\lambda)$ and use the conjugation property of the exponential stated above to deduce that:
$$\frac{df}{d\lambda}=\Big[\alpha'(\lambda)K_++\beta'(\lambda)(e^{\alpha K_+}K_0e^{-\alpha K_+})+\gamma'(\lambda)(e^{\alpha K_+}e^{\beta K_0}K_- e^{-\beta K_0}e^{-\alpha K_+})\Big]f(\lambda)$$
It is sufficient to note that the bracketed operator, call it $\rho(\lambda)$, is linear in the generators: $\rho(\lambda)=a(\lambda)K_++b(\lambda)K_0+c(\lambda)K_-$. Then the general solution to this equation is given by the "time"-ordered exponential:
$$f(\lambda)=T_{\lambda}\exp\Big(\int_{0}^{\lambda}\rho(t)dt\Big)$$
I believe it can be shown, for finite-dimensional groups and again under general assumptions on convergence, that there must exist functions $A$, $B$, $C$ such that
$$f(\lambda)=\exp(A(\lambda)K_++B(\lambda)K_0+C(\lambda)K_-)$$
This can perhaps be justified by viewing the time-ordered exponential as a product of infinitesimally small group flows, using the BCH formula to combine all the exponentials, and noting that every commutator in the BCH formula closes within the algebra and is therefore linear in the generators. Of course, there is no guarantee that the BCH series converges, so the above statement should be taken with a grain of salt.
The purpose of the above analysis, however, is not to prove but rather to motivate the fact that the disentangled form of the squeezing operator, should it exist, must be INDEPENDENT of the representation and depend only on the commutation relations of the algebra.
Fortunately, to justify the squeezing operator decomposition, a general answer to the questions posed above is not necessary. Define the flow
$$g(\lambda)=\exp(-\mu(\lambda)K_+)\exp(\lambda(K_+-K_-))\exp(\mu(\lambda)K_-)$$
Its derivative can be evaluated directly, along the lines of the sketch above, as
$$\frac{dg}{d\lambda}=\Bigg[\Big(1-\mu^2-\frac{d\mu}{d\lambda}+\frac{d\mu}{d\lambda}(\mu\cosh\lambda-\sinh\lambda)^2\Big)K_++\Big(\cosh^2\lambda\,\frac{d\mu}{d\lambda}-1\Big)K_-+\Big(\cosh\lambda\,\frac{d\mu}{d\lambda}(\mu\cosh\lambda-\sinh\lambda)-\mu\Big)K_0\Bigg]g(\lambda)$$
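and, miraculously, if we set the coefficient of $K_-$ to zero, which happens for $\mu(\lambda)=\tanh(\lambda)$ (so that $\frac{d\mu}{d\lambda}=1/\cosh^2\lambda$ and $\mu\cosh\lambda-\sinh\lambda=0$), the coefficient of $K_+$ vanishes as well and we very simply get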
$$\frac{dg}{d\lambda}=-\mu(\lambda)K_0g(\lambda)$$
which, since $g(0)=1$, we can trivially integrate to obtain the quoted result, evidently using only commutators to reach it:
$$g(\lambda)=\exp\Big(-K_0\int_{0}^{\lambda}dt~\mu(t)\Big)=\exp\Big(-K_0\ln\cosh\lambda\Big)$$
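Because only commutators enter, the identity can be spot-checked in any representation with the same brackets. The sketch below assumes the relations $[K_0,K_\pm]=\pm 2K_\pm$, $[K_+,K_-]=-K_0$ (those that follow from $[a,a^\dagger]=1$ for the operators defined earlier) and uses a hypothetical, non-unitary $2\times 2$ representation of my own choosing; `expm` is a plain truncated Taylor series. It compares both sides of $g(\lambda)=\exp(-K_0\ln\cosh\lambda)$ with $\mu=\tanh\lambda$.

```python
import math

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(X, Y):
    return [[X[i][j] + Y[i][j] for j in range(2)] for i in range(2)]

def scale(c, X):
    return [[c * X[i][j] for j in range(2)] for i in range(2)]

def expm(X, terms=40):
    """Matrix exponential via a truncated Taylor series (fine for small 2x2 inputs)."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        term = scale(1.0 / n, matmul(term, X))  # term is now X^n / n!
        result = add(result, term)
    return result

# Assumed 2x2 representation obeying [K0, K+-] = +-2 K+-, [K+, K-] = -K0.
K_PLUS = [[0.0, 1.0], [0.0, 0.0]]
K_MINUS = [[0.0, 0.0], [-1.0, 0.0]]
K_ZERO = [[1.0, 0.0], [0.0, -1.0]]

def g(lam):
    """exp(-mu K+) exp(lam (K+ - K-)) exp(mu K-) with mu = tanh(lam)."""
    mu = math.tanh(lam)
    left = expm(scale(-mu, K_PLUS))
    middle = expm(scale(lam, add(K_PLUS, scale(-1.0, K_MINUS))))
    right = expm(scale(mu, K_MINUS))
    return matmul(matmul(left, middle), right)

def g_closed(lam):
    """exp(-K0 ln cosh(lam)), the integrated form."""
    return expm(scale(-math.log(math.cosh(lam)), K_ZERO))

def max_abs_diff(X, Y):
    return max(abs(X[i][j] - Y[i][j]) for i in range(2) for j in range(2))

for lam in (0.3, 1.0, 2.0):
    print(lam, max_abs_diff(g(lam), g_closed(lam)))
```

The printed differences should be at rounding level. In this representation both sides reduce to the diagonal matrix with entries $1/\cosh\lambda$ and $\cosh\lambda$.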
This has been a long answer, but the tl;dr is:
1) Most statements of the form found in the OP can be generally computed by using commutators alone,
2) The above nice disentanglement formula corresponds to finding an integrable flow within the group.
I hope this is helpful, but still I would love to see an answer that puts all the above into perspective.
We define an inner product over $\Bbb C^{n \times n}$ by $\langle A,B \rangle = \frac 1n \operatorname{tr}(A^\dagger B)$; this is (a normalized version of what is) known as the "Frobenius" or "Hilbert-Schmidt" inner product.
Note that for any unitary $U$, the matrices $U\sigma_j U^\dagger$ form an orthonormal basis for the space of trace-free $2 \times 2$ matrices (if you like, the orthogonal complement of the span of $I$). That is, we have $$ \langle U\sigma_jU^\dagger,U\sigma_kU^\dagger \rangle = \delta_{jk} $$ where $\delta_{jk}$ is a Kronecker delta, and every trace-zero matrix can be written as a linear combination of these matrices.
The matrix $B_{ij}$ that you describe is the change-of-basis matrix that takes us from a coordinate vector relative to the basis $\{U\sigma_jU^\dagger: j =1,2,3\}$ to a coordinate vector relative to the basis $\{\sigma_j: j = 1,2,3\}$. Because we are changing between two orthonormal bases, the resulting change-of-basis matrix is unitary.
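A quick numerical check of this, as a sketch only: I am assuming $B_{ij}=\langle \sigma_i, U\sigma_jU^\dagger\rangle=\tfrac12\operatorname{tr}(\sigma_i U\sigma_jU^\dagger)$, which is my reading of the description above rather than a formula quoted from the question, and the particular $U$ below is an arbitrary sample element of $SU(2)$.

```python
import cmath
import math

def matmul(X, Y):
    """Product of two square matrices of equal size, stored as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(X):
    """Conjugate transpose."""
    n = len(X)
    return [[complex(X[j][i]).conjugate() for j in range(n)] for i in range(n)]

def inner(A, B):
    """Normalized Hilbert-Schmidt inner product (1/n) tr(A^dagger B)."""
    n = len(A)
    product = matmul(dagger(A), B)
    return sum(product[i][i] for i in range(n)) / n

PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

# Sample unitary (an arbitrary choice): U = [[a, -conj(b)], [b, conj(a)]]
# with |a|^2 + |b|^2 = 1.
a = math.cos(0.4) * cmath.exp(0.3j)
b = math.sin(0.4) * cmath.exp(-1.1j)
U = [[a, -b.conjugate()], [b, a.conjugate()]]

rotated = [matmul(matmul(U, s), dagger(U)) for s in PAULI]

# Gram matrix of the rotated basis: should be the 3x3 identity.
gram = [[inner(rotated[j], rotated[k]) for k in range(3)] for j in range(3)]

# Assumed definition: column j of B holds the coordinates of
# U sigma_j U^dagger relative to the basis {sigma_i}.
B = [[inner(PAULI[i], rotated[j]) for j in range(3)] for i in range(3)]
BBdag = matmul(B, dagger(B))

def deviation_from_identity(M):
    return max(abs(M[i][j] - (1 if i == j else 0))
               for i in range(3) for j in range(3))

print("Gram deviation:", deviation_from_identity(gram))
print("B B^dagger deviation:", deviation_from_identity(BBdag))
```

Both deviations should be at rounding level, matching the orthonormality relation and the unitarity of $B$.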