There is a good reason to use $z$ instead of $e^{sT}$. Before starting the analysis, recall that in signals and systems we are interested in the frequency spectrum of a signal, i.e. the Laplace transform evaluated on the imaginary axis $s = jw$. Since your signal $x[n]$ is discrete, its frequency spectrum is periodic, so it is more general to define $s=jwT$.
Now, let $X(z) = \mathcal{Z}\{x[n]\}$ be the Z-transform of a causal or non-causal discrete signal $x[n]$, i.e.
$$ X(z) = \sum_{n=-\infty}^{\infty} x[n] z^{-n}. $$
Since $z\in\mathbb{C}$, we have $z = |z| e^{j\arg z}$. Without loss of generality we write $|z| = r$ and $\arg z = wT$, i.e. $z=r e^{jwT}$ (note that $r$ is not necessarily $1$). Then
$$ \begin{aligned}
X(z) &= \sum_{n=-\infty}^{\infty} x[n] z^{-n}\\
&= \sum_{n=-\infty}^{\infty} x[n] (r e^{jwT})^{-n}\\
% &= \sum_{n=-\infty}^{\infty} (x[n] r^{-n}) e^{-njwT}\\
&= \sum_{n=-\infty}^{\infty} (x[n] r^{-n}) (e^{jwT})^{-n},
\end{aligned} $$
which implies $X(z) = \left. \mathcal{L}\{x[n]r^{-n}\} \right\rvert_{s=jwT} = \mathcal{F}\{x[n]r^{-n}\}$. As a consequence, $X(z)$ is a more general Fourier transform than the Fourier transform $X(e^{jwT}) = \mathcal{F}\{x[n]\}$ of our signal of interest, which is recovered when $r = 1$.
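The identity above can be checked numerically. The following is a minimal sketch, assuming a short finite two-sided signal and an arbitrary point $z = re^{jwT}$; the helper names `z_transform` and `dtft`, the sample values, and the chosen $r$ and angle are all illustrative assumptions, not part of the original argument.

```python
import cmath

def z_transform(x, z, n0=0):
    """Finite-sum Z-transform of samples x[n0], x[n0+1], ... evaluated at z."""
    return sum(xn * z ** (-(n0 + k)) for k, xn in enumerate(x))

def dtft(x, theta, n0=0):
    """Finite-sum Fourier transform evaluated at the angle theta = wT."""
    return sum(xn * cmath.exp(-1j * theta * (n0 + k)) for k, xn in enumerate(x))

# Hypothetical two-sided signal: samples x[-2], ..., x[3]
x = [1.0, -2.0, 0.5, 3.0, -1.0, 0.25]
n0 = -2
r, theta = 1.5, 0.7  # arbitrary radius and angle, so z = r e^{j theta}

z = r * cmath.exp(1j * theta)
lhs = z_transform(x, z, n0)

# Weighted signal x[n] r^{-n}, then its Fourier transform at the same angle
weighted = [xn * r ** (-(n0 + k)) for k, xn in enumerate(x)]
rhs = dtft(weighted, theta, n0)

print(abs(lhs - rhs))  # difference is at the level of floating-point rounding
```

Changing `r` to `1.0` makes `weighted` equal to `x`, which is the special case $X(e^{jwT}) = \mathcal{F}\{x[n]\}$.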
So, if the region of convergence of $X(z)$ does not contain the unit circle $|z| = 1$ (for instance, when the radius of convergence of the series in $z^{-1}$ is less than unity), then $X(e^{jwT})$, i.e. the Fourier transform of $x[n]$, does not exist. This is a problem because many signals have this convergence issue, e.g. non-causal signals such as a digital image filter. Therefore, it is convenient (and even necessary for non-causal signals) to use the Z-transform.
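To make the convergence issue concrete, here is a small sketch using the assumed example signal $x[n] = 2^n u[n]$ (my choice of example, not taken from the question). Its partial sums blow up on the unit circle, so the Fourier transform does not exist, while at $z = 3$ they settle to the closed form $z/(z-2)$. The function name `partial_z_sum` is hypothetical.

```python
def partial_z_sum(a, z, N):
    """Partial sum of the Z-transform of x[n] = a^n u[n]: sum_{n=0}^{N-1} (a/z)^n."""
    ratio = a / z
    total = 0.0
    term = 1.0
    for _ in range(N):
        total += term
        term *= ratio
    return total

a = 2.0

# On the unit circle (z = 1): the partial sums keep growing, no Fourier transform
on_circle = [abs(partial_z_sum(a, 1.0, N)) for N in (10, 20, 40)]

# Outside the circle |z| = a (here z = 3): the partial sums converge
outside = [partial_z_sum(a, 3.0, N) for N in (10, 20, 40, 200)]
closed_form = 3.0 / (3.0 - a)  # z / (z - a) evaluated at z = 3

print(on_circle)
print(outside, closed_form)
```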
Or informally, use $z$ instead of $\left. e^{sT} \right\rvert_{s=jw} = e^{jwT}$ whenever you can.
We also recommend seeing this link about the radius of convergence.
At this point, it is clear that the Z-transform has the same objective as the Laplace transform: to ensure the convergence of the transform in some region of $\mathbb{C}$. The Z-transform does this for discrete signals and the Laplace transform for continuous signals.
Best Answer
The reason behind this is not mathematical; rather, it is an attempt to apply the Laplace transform to the analysis of physical systems. This is because any real physical system must necessarily be a causal system.
Straight to the point, these are the answers to your questions:
The unilateral or bilateral transform should be chosen with extreme care, depending on the type of causality of the system being analyzed:
The choice of using the Fourier transform instead of the Laplace transform is fully valid. But remember three key things:
If you do not know or fully understand what I'm talking about, let me explain...
The Black Box Model
Suppose you have a black box into which a signal $x(t)$ is fed. The black box processes this input signal and produces an output signal $y(t)$, both in the time domain. Note that the black box model can be used to model any type of system; there are no limitations on it. For a linear time-invariant system, the black box can be modeled mathematically with a function $h(t)$ (also in the time domain), so that the output signal $y(t)$ is obtained by convolving the system function $h(t)$ with the input signal $x(t)$. Algebraically: $$ y(t) = h(t) \ast x(t) $$
How is the Black Box Model related to the Laplace and Fourier transforms?
The relationship between the Laplace/Fourier transform and convolution is the following property: $$ y(t) = h(t) \ast x(t) \quad\xrightarrow{\mathcal{F}}\quad Y(\omega) = H(\omega) X(\omega) $$ That is, the transform of the convolution of $h(t)$ and $x(t)$ is the product of $H(\omega)$ and $X(\omega)$. So, convolution in the time domain becomes multiplication in the frequency domain. This is very useful, since computing a multiplication is much simpler than computing a convolution.
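The convolution property can be illustrated numerically. The sketch below is a discrete-time analogue (an assumption on my part, since the text above deals with continuous signals): it convolves two short sequences directly, then repeats the computation by multiplying their discrete Fourier transforms after zero-padding. The helpers `dft`, `idft`, and `convolve`, and the sample values of `h` and `x`, are hypothetical.

```python
import cmath

def dft(x):
    """Direct O(N^2) discrete Fourier transform."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft(X):
    """Direct O(N^2) inverse discrete Fourier transform."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]

def convolve(h, x):
    """Linear convolution y[n] = sum_k h[k] x[n - k]."""
    y = [0.0] * (len(h) + len(x) - 1)
    for i, hi in enumerate(h):
        for j, xj in enumerate(x):
            y[i + j] += hi * xj
    return y

h = [0.5, 0.3, 0.2]        # hypothetical system function samples
x = [1.0, 2.0, -1.0, 0.5]  # hypothetical input samples

# Time domain: direct convolution
y_time = convolve(h, x)

# Frequency domain: zero-pad to the output length, multiply, transform back
N = len(y_time)
H = dft(h + [0.0] * (N - len(h)))
X = dft(x + [0.0] * (N - len(x)))
y_freq = [v.real for v in idft([Hk * Xk for Hk, Xk in zip(H, X)])]

max_err = max(abs(a - b) for a, b in zip(y_time, y_freq))
print(max_err)  # the two results agree up to rounding error
```

The zero-padding matters: without it, the product of the transforms would correspond to a circular convolution rather than the linear one.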
Note that there is no restriction on using the Fourier transform or the Laplace transform to do this. However, the definition of the Laplace transform is more general than that of the Fourier transform, since $s=\sigma + j\omega$. $$ \begin{align} \mathcal{L}\left\{ f(t) \right\} &\triangleq \int_{-\infty}^{+\infty} e^{-st} f(t)dt = F(s) &&& \mathcal{F}\left\{ f(t) \right\} &\triangleq \int_{-\infty}^{+\infty} e^{-j\omega t} f(t)dt = F(\omega) \end{align} $$ $$ \Longrightarrow\quad \begin{matrix} f(t) & \quad\xrightarrow{\mathcal{L}}\quad & F(s) \\ f(t) & \quad\xrightarrow{\mathcal{F}}\quad & F(\omega) \end{matrix} \quad\therefore\quad \left. F(s) \right|_{\sigma = 0} = F(\omega) $$ For this reason, the Laplace transform is used. In addition, the variable $s$ provides additional information because:
$$ y(t) = h(t) \ast x(t) \quad\xrightarrow{\mathcal{L}}\quad Y(s) = H(s) X(s) $$ Therefore, the Laplace transform is a powerful tool for analyzing a general system, and it even allows one to obtain the function $h(t)$ and/or $y(t)$ using the inverse transform with the following procedure:
How is all this related to the unilateral or bilateral definition of the Laplace transform?
There is a property of physical systems called causality, where the output of the black box depends on past and current inputs but not on future inputs. This idea is intuitive. A clear example is the following: the weather tomorrow will depend on present and past conditions, but can never depend on the weather the day after tomorrow. This idea is explained better in the Wikipedia link.
And why does it matter whether a system is causal or not? It affects the definition of the system function $h(t)$: $$ h(t) \mbox{ is a causal system} \quad\Leftrightarrow\quad \forall t\in\mathbb{R},\, t < 0:\quad h(t) = 0 $$
Therefore, when computing the Laplace transform of a causal system, you get the unilateral definition of the Laplace transform: $$ \mathcal{L}\left\{ h(t) \right\} \triangleq \underbrace{\int_{-\infty}^{+\infty} e^{-st} h(t)dt}_{\mbox{Bilateral Def.}} = \underbrace{\int_{0}^{+\infty}e^{-st} h(t)dt }_{\mbox{Unilateral Def.}} $$ Note that it is extremely important to restrict the analysis of a physical system to a causal system, especially since the system function $h(t)$ can be mathematically defined for $t < 0$. For this reason, the unilateral Laplace transform is applied directly to real physical systems (or, more generally, to any causal system).
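As a rough numerical check of this equality, the sketch below takes the assumed causal example $h(t) = e^{-t}u(t)$ (my choice, not from the question) and approximates both integrals with the trapezoidal rule at a real value of $s$. The function `laplace_numeric`, the truncation limits, and the step count are illustrative assumptions; the exact value for this example is $1/(s+1)$.

```python
import math

def h(t):
    """Assumed causal system function: h(t) = e^{-t} for t >= 0, and 0 for t < 0."""
    return math.exp(-t) if t >= 0 else 0.0

def laplace_numeric(f, s, t_min, t_max, steps=100_000):
    """Trapezoidal approximation of the integral of f(t) e^{-st} over [t_min, t_max], s real."""
    dt = (t_max - t_min) / steps
    total = 0.5 * (f(t_min) * math.exp(-s * t_min) + f(t_max) * math.exp(-s * t_max))
    for k in range(1, steps):
        t = t_min + k * dt
        total += f(t) * math.exp(-s * t)
    return total * dt

s = 2.0
bilateral = laplace_numeric(h, s, -10.0, 10.0)   # truncated version of (-inf, +inf)
unilateral = laplace_numeric(h, s, 0.0, 10.0)    # truncated version of [0, +inf)
exact = 1.0 / (s + 1.0)

print(bilateral, unilateral, exact)
```

The bilateral estimate is slightly less accurate only because the grid straddles the jump of $h(t)$ at $t = 0$; the part of the integral over $t < 0$ contributes nothing.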
There are also non-causal systems (such as software that digitally processes an image), where the system state may depend on future states of the system ("future" pixels). In these cases, you must necessarily apply the bilateral transform.
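A tiny sketch of what "future" pixels means, assuming a hypothetical 3-tap averaging filter applied to one image row: the centered version reads the sample at `n + 1`, so its output responds before the input impulse arrives, whereas the trailing (causal) version never does. The function names and the sample row are illustrative only.

```python
def centered_average(x, n):
    """Non-causal 3-tap filter: y[n] = (x[n-1] + x[n] + x[n+1]) / 3, zeros outside the row."""
    get = lambda k: x[k] if 0 <= k < len(x) else 0.0
    return (get(n - 1) + get(n) + get(n + 1)) / 3.0

def trailing_average(x, n):
    """Causal 3-tap filter: y[n] = (x[n-2] + x[n-1] + x[n]) / 3, zeros outside the row."""
    get = lambda k: x[k] if 0 <= k < len(x) else 0.0
    return (get(n - 2) + get(n - 1) + get(n)) / 3.0

row = [0.0, 0.0, 3.0, 0.0, 0.0]  # hypothetical image row with an impulse at index 2

centered = [centered_average(row, n) for n in range(len(row))]
trailing = [trailing_average(row, n) for n in range(len(row))]

print(centered)  # [0.0, 1.0, 1.0, 1.0, 0.0]: nonzero at index 1, before the impulse
print(trailing)  # [0.0, 0.0, 1.0, 1.0, 1.0]: zero until the impulse arrives
```

The impulse response of the centered filter is nonzero at $n = -1$, which is exactly why its transform needs the two-sided sum.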