Let $\mathfrak{g}$ be a Lie algebra, and let $G$ be the (unique up to isomorphism) simply-connected group with Lie algebra $\mathfrak{g}$. Here, $\mathfrak{g}$ and $G$ should be thought of as "abstract", with the exponential map $\exp : \mathfrak{g} \rightarrow G$ defined in terms of the 1-parameter subgroups of $G$. The BCH formula holds for this "abstract" exponential map, although the completely general proof is a bit lengthy: it involves deriving a formula for the derivative of the exponential map, then proving $\exp \big( [X,\cdot] \big) Y = \exp (X)\, Y \exp (-X)$, and finally deducing the BCH formula. A concise presentation can be found e.g. in chapter 3 of Brian Hall's "Lie Groups, Lie Algebras, and Representations". This reference aims at being fairly accessible, at the price of restricting itself to those Lie groups $G$ which can be constructed as matrix groups: this covers most groups used in practice (in particular, the Heisenberg group generated by the position/momentum operators of quantum mechanics, together with the identity operator, can be constructed as a group of $3 \times 3$ real matrices).
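For the Heisenberg case this is easy to see concretely: the Lie algebra consists of strictly upper-triangular $3 \times 3$ matrices, so all matrices are nilpotent, $[X,[X,Y]] = 0$, and the BCH series truncates after the first commutator. A minimal numerical sketch (helper names `heis` and `expm_nilpotent` are mine, not from any reference):

```python
import numpy as np

def heis(a, b, c):
    """An element of the Heisenberg Lie algebra, realized as a
    strictly upper-triangular 3x3 real matrix."""
    return np.array([[0., a, c],
                     [0., 0., b],
                     [0., 0., 0.]])

def expm_nilpotent(X):
    """Exponential of a strictly upper-triangular 3x3 matrix:
    X @ X @ X = 0, so the exponential series terminates."""
    return np.eye(3) + X + X @ X / 2.0

A, B = heis(1.0, 2.0, 3.0), heis(-0.5, 4.0, 1.0)
comm = A @ B - B @ A          # [A, B], which is central here

# Since [X, [X, Y]] = 0 in this algebra, BCH truncates exactly:
lhs = expm_nilpotent(A) @ expm_nilpotent(B)
rhs = expm_nilpotent(A + B + comm / 2.0)
assert np.allclose(lhs, rhs)
```

The final assertion is exactly the truncated BCH identity $e^A e^B = e^{A + B + \frac{1}{2}[A,B]}$, which holds without any remainder because all higher brackets vanish.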
Next, let $X \mapsto i\hat{X}$ be a representation of this Lie algebra by unbounded symmetric operators on a complex Hilbert space $\mathcal{H}$ (note that we do not even need to assume that the $\hat{X}$ are essentially self-adjoint, just symmetric).
Since we are dealing with unbounded operators, we have to be a bit careful about what we mean by a representation, because a priori $\hat{X}\hat{Y}$, $\hat{Y}\hat{X}$, and therefore $[\hat{X}, \hat{Y}]$, may fail to be well-defined. Specifically, we demand that there exist a common dense invariant domain $\mathcal{D} \subseteq \mathcal{H}$ such that, for all $X \in \mathfrak{g}$, $\hat{X}$ is defined on $\mathcal{D}$ and stabilizes $\mathcal{D}$ (i.e. $\hat{X} (\mathcal{D}) \subseteq \mathcal{D}$). Then we can define arbitrary products of the $\hat{X}$'s on $\mathcal{D}$, and in particular commutators.
The question is then whether this representation can be exponentiated, i.e.:
for all $X \in \mathfrak{g}$, $\hat{X}$ is actually essentially self-adjoint;
and there exists a unitary representation $U \mapsto \hat{U}$ of the group $G$ on $\mathcal{H}$ such that, for any $X \in \mathfrak{g}$, $\widehat{\exp X} = \widehat{\exp} (i\hat{X})$. Here, I denote by $\widehat{\exp}$ the "operator" exponential map, which is defined by spectral resolution of the essentially self-adjoint operator $\hat{X}$: as you observed, in the case of unbounded operators, the exponential cannot be defined in terms of the exponential series (note that the notation $\widehat{\exp}$ is non-standard, I use it here only to prevent confusion between the various notions of exponential maps).
If this holds, the BCH formula satisfied by the "abstract" exponential map $\exp$ will be inherited by the "operator" exponential map $\widehat{\exp}$.
So, with all these preliminaries in place: when can a Lie algebra representation by symmetric unbounded operators be exponentiated? A sufficient condition (useful in practice, albeit not a necessary condition) is the Nelson criterion (lemma 9.1 of Edward Nelson, "Analytic Vectors", Annals of Mathematics, Second Series, Vol. 70, No. 3 (Nov., 1959), pp. 572-615): it requires that there exist a basis $X_1,\dots,X_n$ of $\mathfrak{g}$, a dense subdomain $\mathcal{D}_o \subseteq \mathcal{D}$, and a real $s > 0$, such that:
$$
\forall \psi \in \mathcal{D}_o,\; \sum_{m=0}^{\infty} \frac{s^m}{m!} \sum_{k_1,\dots,k_m} \left\| \hat{X}_{k_1} \cdots \hat{X}_{k_m} \psi \right\| < \infty.
$$
This is a fairly technical result, but the intuition behind it is that this condition is precisely what you need to define $\widehat{\exp} (i\hat{X})$ on $\mathcal{D}_o$ via the exponential series; from there, one deduces that $\hat{X}$ is indeed essentially self-adjoint (using Stone's theorem, discussed below), that this definition of $\widehat{\exp} (i\hat{X})$ coincides on $\mathcal{D}_o$ with the spectral one, and finally that this indeed yields a unitary representation of $G$ (matching the "abstract" BCH formula with the one that can be proven directly using the exponential series).
In the case of the position/momentum operators of quantum mechanics, one can for example take $X_1 = \text{id}$, $X_2 = q$, $X_3 = p$, and take $\mathcal{D} = \mathcal{D}_o$ to consist of the finite linear combinations of the harmonic oscillator energy eigenstates. Then the condition can be proven using the expression of the position/momentum operators in terms of ladder operators.
However, coming back to your specific motivation, I honestly do not think that you need all this machinery. Instead, you can use the explicit expression of the Weyl operators to prove that $t \mapsto W(tz)$ is a strongly continuous one-parameter unitary group: i.e. $W(sz) W(tz) = W\big((s+t)z\big)$ and, for any $\psi \in \mathcal{H}$, $t \mapsto W(tz) \psi$ is continuous (with respect to the norm of $\mathcal{H}$, in this case the $L_2$-norm). Then Stone's theorem (theorem VIII.8 in Reed and Simon, "Methods of Modern Mathematical Physics", volume 1) tells you that $W(tz) = \widehat{\exp} (it\hat{X})$, with $\hat{X}$ the self-adjoint operator defined by:
$$i \hat{X} \psi = \left. \frac{d}{dt} W(tz) \psi \right|_{t=0}$$
for any $\psi$ in the dense domain $\mathcal{D}$ on which this derivative exists. Using again the explicit expression of $W(tz)$, you can then check that $\hat{X} = \sqrt{2} (y\hat{q} - x\hat{p})$.
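The mechanism of Stone's theorem can be illustrated in a finite-dimensional toy sketch, where everything is bounded and the spectral exponential is computed by eigendecomposition (the $2 \times 2$ generator `H` is a made-up example; the genuinely unbounded case is exactly what the theorem handles beyond this picture):

```python
import numpy as np

# A toy self-adjoint generator (assumption: 2x2 example, everything bounded here)
H = np.array([[0.0, 1.0],
              [1.0, 0.0]])
w, V = np.linalg.eigh(H)                   # spectral resolution H = V diag(w) V*

def U(t):
    """The 'spectral' exponential exp-hat(i t H) = V diag(exp(i t w)) V*."""
    return (V * np.exp(1j * t * w)) @ V.conj().T

# strongly continuous one-parameter unitary group:
assert np.allclose(U(0.3) @ U(0.4), U(0.7))              # group law
assert np.allclose(U(0.5).conj().T @ U(0.5), np.eye(2))  # unitarity

# Stone's theorem direction: differentiating at t = 0 recovers i H
h = 1e-6
assert np.allclose((U(h) - U(-h)) / (2 * h), 1j * H, atol=1e-5)
```

The last assertion is the finite-dimensional shadow of $i \hat{X} \psi = \frac{d}{dt} W(tz)\psi \big|_{t=0}$: the derivative of the unitary group at $t = 0$ recovers $i$ times the self-adjoint generator.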
Best Answer
This answer only tries to give a pictorial view of the situation, in the hope that the picture already makes things clear. The reference cited below, Free Lie Algebras, gives the better answer, because it is a structural answer, and the structure is beautiful: feel free to switch straight to the reference and enjoy!
First, in my pictorial view, a Lie polynomial in the alphabet with letters $A,B,C,D,\dots$ is a homogeneous polynomial in the non-commutative algebra (over $\Bbb Q$) generated by the monoid on these letters, one that can be obtained in the following way.
First fix some letters from the alphabet (with repetitions allowed) and some order, and put them in a row. For instance:
Now decide to build a "special" tree with these nodes as leaves, going "down": decide which two neighboring letters should be Lie-condensed first, then use the result as a "new letter", and go on recursively. One picture may be:
Each $*$ means: take the two joined nodes and apply $[\ ,\ ]$ to them. I hope the picture is clear.

Now observe that $$ \begin{aligned} Z &= F(A,B) \\ &=\log(e^Ae^B) \\ &=\log\left(\ \left(1+\frac 1{1!}A+\frac 1{2!}A^2+\dots\right) \left(1+\frac 1{1!}B+\frac 1{2!}B^2+\dots\right) \ \right) \\ &=\log\left(\ 1+\sum_{(j,k)\ne (0,0)} \frac 1{j!k!}A^jB^k \ \right) \\ &= 0+\underbrace{(A+B)}_{F_1(A,B)}+\dots \end{aligned} $$ has its $F_1$-part equal to $A+B$, a Lie polynomial, and the further homogeneous pieces are under attack.
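The next homogeneous pieces can be extracted from this series with a small symbolic computation; a sketch using noncommutative symbols, where the commuting variable $t$ is just a bookkeeping device for the grading:

```python
import sympy as sp

t = sp.symbols('t')                          # commuting grading variable
A, B = sp.symbols('A B', commutative=False)

N = 3                                        # keep everything up to degree 3
texp = lambda X: sum(t**k * X**k / sp.factorial(k) for k in range(N + 1))

# P = e^A e^B - 1, then Z = log(1 + P), both truncated at degree N
P = sp.expand(texp(A) * texp(B)) - 1
Z = sp.expand(sum((-1)**(k + 1) * P**k / k for k in range(1, N + 1)))

br = lambda X, Y: X * Y - Y * X              # the Lie bracket [ , ]

F1, F2, F3 = Z.coeff(t, 1), Z.coeff(t, 2), Z.coeff(t, 3)
assert sp.expand(F1 - (A + B)) == 0
assert sp.expand(F2 - br(A, B) / 2) == 0
assert sp.expand(F3 - (br(A, br(A, B)) + br(B, br(B, A))) / 12) == 0
```

So $F_2 = \frac{1}{2}[A,B]$ and $F_3 = \frac{1}{12}\big([A,[A,B]] + [B,[B,A]]\big)$ indeed come out as Lie polynomials, which is what the tree picture is meant to make plausible in general.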
Back to the question. Why is $F_i\left(A,\sum_j F_j(B,C)\right)$ inductively a Lie polynomial (for $i,j>1$)? Use new letters $D_j$ instead of $F_j(B,C)$ if this makes things simpler, and let us draw the picture of $F_i(A, \sum_j D_j)$. It is a linear combination of many terms, each built by the tree-collapsing rules above. Now push each sum $\sum_j D_j$ down along its branch of the tree for $F_i$, until it hits a $*$, i.e. until it is involved in building a Lie bracket. The bracket is linear in each argument, so we split the sum $\sum_j D_j$ into its pieces and work with an individual $D_j$. If this $D_j$ is itself (inductively) given by such Lie-bracket tree-collapsing rules, then we are fine: formally, we "move the rule to the top".
We only have problems with $F_1$, which is not really in the range of the Lie bracket tree collapsing rules. I cannot say more.
(I could not figure out where exactly the problem with the "exceptional polynomials" lies, since, working only with the homogeneous part of degree $n+1$, for instance for $i=1$, $j=n$ and conversely, the left-hand sides are $$ \begin{aligned} F_1(F_n(A,B),C) &=F_n(A,B)+C\ , \\ F_n(F_1(A,B),C) &=F_n(A+B,C)\ , \end{aligned} $$ and of course, the proof still has to be carried out from here.)
Let me now say a few words about the hidden structure; it is a wonderful structure, enjoy it!
In the book version of Free Lie Algebras by Christophe Reutenauer (referenced in full on the Free Lie Algebras wiki page), the author quickly introduces a Hopf-algebra structure on $\Bbb Q\langle\langle A, B,\dots\rangle\rangle$, the free algebra on the monoid generated by the letters $A,B,\dots$. One multiplication is the usual (concatenation) product; another is the shuffle product, so for instance $A \sqcup\!\!\!\sqcup B = AB + BA$, and by Ree's criterion the Lie polynomials are exactly the polynomials (without constant term) orthogonal to all proper shuffles. There are two corresponding comultiplications, and using these constructions one can state structural properties. In the list of them, those relevant for the present question:
Theorem 1.4 in the book (not in the linked pdf) characterizes Lie polynomials through a list of equivalent conditions; among them, $P$ is a Lie polynomial if and only if $P$ is primitive for the coproduct, $\delta(P) = P \otimes 1 + 1 \otimes P$ (Friedrichs' criterion).
Theorem 3.1 is a version of the above for Lie series.
Lemma 1.7 in the book: let $\alpha$ be the antipode, mapping a word $w$ to $\pm$ the reversed word, the sign being given by the parity of the length of $w$. Then for a Lie polynomial $P$ we have $\alpha(P)=-P$.
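This antipode is easy to implement and check on small Lie polynomials (again, the function names and the dict representation are mine):

```python
def antipode(poly):
    """Antipode on the word algebra: w -> (-1)^len(w) * reversed(w),
    extended linearly to {word: coefficient} dictionaries."""
    out = {}
    for w, c in poly.items():
        rw = w[::-1]
        out[rw] = out.get(rw, 0) + (-1) ** len(w) * c
    return out

def negate(poly):
    return {w: -c for w, c in poly.items()}

# [A,B] = AB - BA is a Lie polynomial: alpha(P) = -P
P = {"AB": 1, "BA": -1}
assert antipode(P) == negate(P)

# [A,[A,B]] = AAB - 2*ABA + BAA, again alpha(Q) = -Q
Q = {"AAB": 1, "ABA": -2, "BAA": 1}
assert antipode(Q) == negate(Q)
```

Note that $\alpha(P) = -P$ singles out Lie polynomials only among homogeneous polynomials; e.g. the non-Lie word $AB$ satisfies $\alpha(AB) = BA \neq -AB$.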
Theorem 3.2 in the book: let $S=1+\dots$ be a series (higher terms omitted); then the following are equivalent: $\log S$ is a Lie series; $S$ is group-like, i.e. $\delta(S) = S \otimes S$.
Corollary 3.3: the series $S=1+\dots$ such that $\log S$ is a Lie series form a group under multiplication; this is because of the group-like property.
Corollary 3.4: $\log(e^Ae^B)$ is a Lie series. This follows from the stability under multiplication above, noting that $\log e^A=A$ and $\log e^B=B$ are Lie series.