An axiomatic definition of normal order can be found in the book "Solitons: Differential Equations, Symmetries and Infinite Dimensional Algebra" by T. Miwa, M. Jimbo and E. Date at page 44-46. This is the best definition I have found so far and goes as follow for products of bosonic operators. Call $\mathcal{A}$ the set of linear combinations of formal finite products of bosonic operators $b_i,\,b_i^\dagger$. The normal order ${:} a {:}$ of $a \in\mathcal{A}$ is a notation defined inductively by the properties
- Linearity, $${:} z_1a_1+z_2a_2 {:}= z_1{:} a_1 {:} + z_2 {:} a_2 {:}\quad \text{for} \quad z_1,\,z_2\in \mathbb{C}\quad \text{and}\quad a_1,\,a_2 \in \mathcal{A}$$
- ${:} 1 {:} = 1$, with $1$ the identity operator in $\mathcal{A}$
- within the dots all the operators $b_i,\,b_i^\dagger$ commute among themselves
- the annihilation operators can be taken out of the columns on the right $${:}a b_i{:} = {:} a{:}\,b_i$$
- the creation operators can be taken out of the columns on the left $${:}b_i^\dagger a{:} = b_i^\dagger\,{:} a{:}$$
The definition for fermionic operators is the same with the only difference that the fermionic operators anticommute among themselves (Property 3). It is important to stress that usually annihilation operators are those operators that annihilate a specified vacuum state, therefore the normal ordering depends on the choice of the vacuum.
The normal order is a notation and not a function that acts on operators (i.e. a superoperator). This means that, whereas $b_ib_i^\dagger$ and $b_i^\dagger b_i +1$ are the same operators according to the canonical commutations relations, they are represented as different elements of $\mathcal{A}$ which is the set of linear combinations of strings of symbols generated by $b_i,b_i^\dagger$. In mathematical terms the normal order is a function defined on the elements of of the free algebra generated by $b_i,b_i^\dagger$, but is not a well defined function on the CCR-algebra (Canonical Commutation Relations algebra). The wrong step that leads to the paradoxical result
$$b_ib_i^\dagger = b_i^\dagger b_i +1 \Rightarrow {:}b_ib_i^\dagger {:} = {:}b_i^\dagger b_i +1{:} = {:}b_i^\dagger b_i{:} + 1 = {:}b_ib_i^\dagger{:} + 1 \Rightarrow 0 = 1$$
is actually the first equality since $b_ib_i^\dagger \neq b_i^\dagger b_i +1$ in $\mathcal{A}$. In another post it was suggested that the normal product is undefined when acting on linear combinations. However this would seriously limit the usefulness of the normal order. For example it is common to take the normal order of infinite series such as exponentials. The definition as a function acting on the free algebra $\mathcal{A}$ is in fact the one employed in practice.
This is a good example of how being mathematically rigor should not be regarded as a nuisance in the physics community, but rather as an important tool to avoid misunderstanding and confusion.
The following is summarized from two notes (both can be found here) by Prof. Michael Stone (@mikestone). Confusing typos in these notes are corrected.
From a general perspective, the two set of operators $\{a\}, \{b\}$ are two realizations of the CCR/CAR. By the Stone-von Neumann theorem, there is a unitary operator $\mathcal{U}$ that acts on the many-body Hilbert space such that
$$
b_i = \mathcal{U} a_i \mathcal{U}^\dagger
\ \Rightarrow \
b^\dagger_i = \mathcal{U} a^\dagger_i \mathcal{U}^\dagger
$$
Then we find $|0_b\rangle \propto \mathcal{U}|0_a\rangle$, since
$$
a_i |0_a\rangle
= \mathcal{U}^\dagger b_j \mathcal{U} |0_a\rangle = 0
$$
However, as noted by Prof. Stone, the general form of $\mathcal{U}$ is very complicated. Thus we take another approach.
Assume that $U$ is invertible (this is not an "innocent assumption"). Then $b_i |0_b \rangle = 0$ is equivalent to
$$
0 = U^{-1}_{ij} b_j |0_b\rangle
= (a_i - S_{ij} a^\dagger_j) |0_b \rangle,
\quad
S \equiv - U^{-1} V
$$
Using the constraint $U V^\mathsf{T} = \eta V U^\mathsf{T}$, one can show that
$$
S^\mathsf{T} = \eta S
$$
i.e. the matrix $S$ is symmetric for bosons, and anti-symmetric for fermions. Then we construct the bilinear
$$
Q \equiv \frac{1}{2} \sum_{i,j}
S_{ij} a^\dagger_i a^\dagger_j
$$
and use the BCH formula to calculate
$$
\begin{equation*}
e^Q a_i e^{-Q}
= a_i + [Q,a_i]
+ \frac{1}{2}[Q,[Q,a_i]]
+ \cdots
\end{equation*}
$$
Fortunately, the commutator
$$
[Q, a_i] = - S_{ij} a^\dagger_j
$$
commutes with $Q$ (for both bosons and fermions). Therefore we simply get
$$
\begin{equation*}
e^Q a_i e^{-Q}
= a_i + [Q,a_i]
= a_i - S_{ij} a^\dagger_j
\end{equation*}
$$
and $b_i |0_b \rangle = 0$ is equivalent to
$$
e^Q a_i e^{-Q} |0_b\rangle = 0
$$
This can be satisfied by all $a_i$ if and only if
$$
e^{-Q} |0_b \rangle \propto |0_a \rangle
$$
The "if" is obvious. To show "only if", suppose that $|\psi\rangle \equiv e^{-Q} |0_b \rangle \ne |0_a \rangle$; then $e^Q$, which consists of only $\{a^\dagger_i\}$, cannot annihilate $|\psi\rangle$. Finally, up to a normalization constant $\mathcal{N}$, the vacuum $|0_b\rangle$ is
$$
|0_b\rangle = \mathcal{N} e^Q |0_a \rangle
$$
(When I have time I will update on finding $\mathcal{N}$.)
Best Answer
There is a general argument for the existence of the unitary operator. From the Stone-von Neumann theorem, if you know where $a$ is mapped, then there is only one unitary operator that can represent the transformation (up to a global phase).
You can construct this operator more explicitly. One way is to realize that your transformations form a path connected Lie group. Formally, you can view it as the subgroup of matrices of size $2N\times2N$ of the form: $$ \begin{pmatrix} A & B\\ B^* & A^* \end{pmatrix} $$ with your additional constraints: $$ A A^\dagger -BB^\dagger =1\\ AB^T-BA^T =0 $$ You can consider a path from the identity to your transformation, namely $A_t,B_t$ with $t\in[0,1]$ and: $$ \begin{align} A_0&=1 & B_0&=0 \\ A_1&=A& B_1&=B \end{align} $$ This gives a varying $a_t$ satisfying the ODE: $$ \dot a_t= \alpha_ta_t+ \beta_ta_t^\dagger $$ with the matrices $\alpha_t,\beta_t$ defined by: $$ \frac{d}{dt} \begin{pmatrix} A_t & B _t\\ B _t ^* & A _t ^* \end{pmatrix} = \begin{pmatrix} \alpha_t& \beta_t\\ \beta_t^* & \alpha_t^* \end{pmatrix} \begin{pmatrix} A_t & B _t\\ B _t ^* & A _t ^* \end{pmatrix} $$ You want to recognize the ODE as a Heisenberg equation from a suitably chosen quadratic Hamiltonian: $$ \dot a_t =-i[a,H] $$ Using the CCR’s you can check that a suitable candidate (unique up to additional multiple of identity) is: $$ H_t= i(\alpha_t)_{ij}a_i^\dagger a_j+\frac{i}{2}(\beta_t)_{ij}a_i^\dagger a_j^\dagger-\frac{i}{2}(\beta_t^*)_{ij}a_i a_j $$ $H_t$ is hermitian and gives the correct equations of motion using the relations (taking the derivatives of the constraints): $$ \alpha_t+\alpha_t^\dagger =0 \\ \beta_t-\beta_t^T=0 $$ Your unitary operator is given by the time ordered exponential: $$ U=\mathcal T \exp\left(\int_0^1dt H_t\right) $$ Actually, it turns out that the exponential is surjective on the group, so you can choose Choose a path with constant $\alpha_t,\beta_t$. This gives a constant $H_t$ and the time ordered exponential is a simple exponential.
Physically, you are constructing the unitary operator from harmonic oscillators and squeeze operators. The process is explicit in the case $N=1$.
Hope this helps.