There's no escaping Lie theory if you want to understand what is going on mathematically. I'll try to provide some intuitive pictures for what is going on in the footnotes, though I'm not sure if it will be what you are looking for.
On any (finite-dimensional, for simplicity) complex inner product space of dimension $N$, the group of unitary operators is the Lie group $\mathrm{U}(N)$, which is connected. Lie groups are manifolds, i.e. things that locally look like $\mathbb{R}^n$ for some $n$, and as such possess tangent spaces at every point, spanned by the derivatives of their coordinates or, equivalently, by all possible directions of paths at that point. These directions form, at $g \in \mathrm{U}(N)$, the vector space $T_g \mathrm{U}(N)$, whose real dimension is $N^2$.1
Canonically, we take the tangent space at the identity $\mathbf{1} \in \mathrm{U}(N)$ and call it the Lie algebra $\mathfrak{u}(N) \cong T_\mathbf{1}\mathrm{U}(N)$. Now, from the tangent space, there is something called the exponential map to the manifold itself. It is a fact that, for compact groups such as the unitary group, said map is surjective onto the part containing the identity.2 It is a further fact that the unitary group is connected, meaning that it has no parts not connected to the identity, so the exponential map $\mathfrak{u}(N) \to \mathrm{U}(N)$ is surjective, and hence every unitary operator is the exponential of some Lie algebra element.3 (The exponential map always covers some neighbourhood of the identity, so for other Lie groups we can in principle still find exponential forms for operators close enough to the identity.)
So, the above (and the notes) answers your first three questions: We can always represent a unitary operator like that since $\mathrm{U}(N)$ is compact and connected, the exponential of an operator means "walking in the direction specified by that operator", and while $\mathcal{U}$ lies in the Lie group, $\mathcal{T}$ lies, as its generator, in the Lie algebra. One also says that $\mathcal{T}$ is the infinitesimal generator of $\mathcal{U}$, since, in $\mathrm{e}^{\alpha \mathcal{T}}$, we can see it as giving only the direction of the operation, while $\alpha$ tells us how far from the identity the generated exponential will lie.
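To make the surjectivity claim concrete, here is a minimal numerical sketch, assuming NumPy and a small matrix size. It draws a unitary matrix, reads a Hermitian $H$ off its eigenphases, and checks that $\mathrm{e}^{\mathrm{i}H}$ reproduces the matrix, so that $T = \mathrm{i}H$ plays the role of the Lie algebra element. The helper names `random_unitary`, `hermitian_generator` and `exp_i` are made up for this illustration, and the construction assumes the eigenvalues are distinct, which holds for a generic random unitary.

```python
import numpy as np

def random_unitary(n, seed=0):
    """Unitary matrix from the QR decomposition of a complex Gaussian matrix."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, _ = np.linalg.qr(z)
    return q

def hermitian_generator(u):
    """Hermitian H with exp(iH) = u, read off from the eigenphases of u."""
    eigvals, vecs = np.linalg.eig(u)         # eigenvalues have the form e^{i theta}
    theta = np.angle(eigvals)                # theta in (-pi, pi]
    h = (vecs * theta) @ vecs.conj().T       # assumes orthonormal eigenvectors
    return (h + h.conj().T) / 2              # remove rounding asymmetry

def exp_i(h):
    """exp(iH) for Hermitian H via its spectral decomposition."""
    lam, w = np.linalg.eigh(h)
    return (w * np.exp(1j * lam)) @ w.conj().T

U = random_unitary(4)
H = hermitian_generator(U)
T = 1j * H                                   # anti-Hermitian Lie algebra element
reconstruction_error = np.linalg.norm(exp_i(H) - U)
print(reconstruction_error)                  # expected to be at rounding-error level
```

The choice of phases in $(-\pi, \pi]$ is only one of many: adding $2\pi$ to any eigenphase gives a different $H$ with the same exponential, which is the non-injectivity of the exponential map seen in coordinates.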
The physical meaning is difficult to state in general. Often, $\mathcal{T}$ is the generator of a symmetry and the unitary operator $\mathcal{U}$ is the finite version of that symmetry: for example, the Hamiltonian $H$ generates the time translation $U$, the angular momenta $L_i$ generate the rotations $\mathrm{SO}(3)$, and so on. The generator is always the infinitesimal version of the exponentiated operator in the sense that
$$ \mathrm{e}^{\epsilon T} = 1 + \epsilon T + \mathcal{O}(\epsilon^2)$$
so the generated operator will, for small $\epsilon$, be displaced from the identity by almost exactly $\epsilon T$.
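The first-order statement can be checked numerically. The sketch below, assuming NumPy, uses the generator of rotations in the plane as an example $T$ (my choice, not from the text above) and a hand-rolled power series `expm_series` for the exponential. If the remainder is really $\mathcal{O}(\epsilon^2)$, then dividing it by $\epsilon^2$ should give a roughly constant number, here close to $\lVert T^2 \rVert / 2 = 1/2$.

```python
import numpy as np

def expm_series(a, terms=30):
    """Matrix exponential via a truncated power series (fine for small norms)."""
    result = np.eye(a.shape[0], dtype=complex)
    term = np.eye(a.shape[0], dtype=complex)
    for k in range(1, terms):
        term = term @ a / k                  # term now holds a^k / k!
        result = result + term
    return result

# Example generator: real antisymmetric, so exp(eps * T) is a rotation by eps.
T = np.array([[0.0, -1.0], [1.0, 0.0]])

def remainder_ratio(eps):
    """Spectral norm of exp(eps T) - (1 + eps T), divided by eps^2."""
    remainder = expm_series(eps * T) - np.eye(2) - eps * T
    return np.linalg.norm(remainder, 2) / eps**2

ratios = [remainder_ratio(eps) for eps in (1e-1, 1e-2, 1e-3)]
print(ratios)                                # each entry should be close to 0.5
```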
1 Think of the circle (which is $\mathrm{U}(1)$): At every point on the circle, you can draw the tangent to it, which is $\mathbb{R}$, a 1D vector space. The length of the tangent vector specifies "how fast" the path in that direction will be traversed.
2 Think of the two-dimensional sphere (which is, sadly, not a Lie group, but illustrative for the exponential map). Take the tangent space at one point and imagine you are actually holding a sheet of paper next to a sphere. Now "crumple" the paper around the sphere. You will end up covering the whole sphere, and if the paper is large enough (it would have to be infinite to represent the tangent space), you can even wind it around the sphere multiple times, thus showing that the exponential map cannot be injective, but is easily seen to be surjective. A more precise notion of this crumpling would be to fix some measure of length on the sphere and map every vector in the tangent space to a point on the sphere by walking in the direction indicated by the vector exactly as far as its length tells you.
3 This is quite easy to understand: if there were some part of the group wholly disconnected from the part containing the identity, or if our group had infinite volume (if it was non-compact), we could not hope to cover it wholly with only one sheet of paper, no matter how large.
No, we cannot, since the only requirement $$\mathscr{P}^{-1}\hat{\textbf{x}}\mathscr{P}=-\hat{\textbf{x}}$$ does not fix the parity operator uniquely even in the simplest case. Further information in the form of added requirements is necessary to fix the parity operator.
The definition of the parity operator actually depends on the system you are considering. Let us consider the simplest case, a spin-zero particle in QM.
Its Hilbert space is (isomorphic to) $L^2(\mathbb R^3)$.
Parity is supposed to be a symmetry, so in view of Wigner's theorem, it is an operator $U: L^2(\mathbb R^3) \to L^2(\mathbb R^3)$ which may be either unitary or antiunitary.
Here the parity operator is fixed by a pair of natural requirements. The former is just the one in the initial question; the latter, added requirement concerns the momentum operators.
$$UX_kU^{-1} =-X_k\quad, k=1,2,3 \tag{1}$$
and
$$UP_kU^{-1} =-P_k\quad, k=1,2,3 \tag{2}$$
Notice that (2) is independent of (1): we could define operators satisfying (1) but not (2).
First of all, these requirements decide the unitary/antiunitary character. Indeed, from the canonical commutation relations (CCR),
$$[X_k,P_h] = i\delta_{hk}I\tag{3}$$
we have
$$U[X_k,P_h] U^{-1} = \delta_{kh} Ui IU^{-1} = \pm i \delta_{kh}I$$
that is
$$[UX_kU^{-1}, UP_hU^{-1}]=\pm i \delta_{kh}I\:,$$
so that, from (1) and (2),
$$[X_k,P_h] = \pm i \delta_{kh}I$$
Comparing with (3), this identity rules out the minus sign corresponding to an antiunitary operator. $U$ must be unitary.
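The origin of the $\pm$ sign is worth spelling out as a short worked step (a standard fact, stated here in my own words). A unitary operator is linear, whereas an antiunitary operator is antilinear, so it complex-conjugates scalars when they are moved past it:

$$U\,(iI)\,U^{-1} = +\,iI \quad \text{($U$ unitary)}\:, \qquad U\,(iI)\,U^{-1} = -\,iI \quad \text{($U$ antiunitary)}\:.$$

With (1) and (2) the left-hand side of the commutator is unchanged, since $[-X_k,-P_h] = [X_k,P_h]$, so only the plus sign is compatible with (3).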
Let us prove that (1)-(2) fix $U$ up to a phase. It is not possible to define $U$ more precisely because this arbitrary phase is just the degree of freedom permitted by Wigner's theorem in defining symmetries in terms of unitary or antiunitary operators.
Suppose that, for another unitary operator $V$, we also have
$$VX_kV^{-1} =-X_k\quad, k=1,2,3 \tag{1'}$$
and
$$VP_kV^{-1} =-P_k\quad, k=1,2,3 \tag{2'}\:.$$
Combining (1') and (2') with (1) and (2), we get
$$U^{-1}VX_kV^{-1}U =X_k\quad, k=1,2,3 $$
and
$$U^{-1}VP_kV^{-1}U = P_k\quad, k=1,2,3 \:.$$
In other words, $L:= U^{-1}V$ satisfies
$$L X_k = X_kL\:, \quad L P_k = P_kL\quad, k=1,2,3 \:.$$
Since the system of operators $X_k$ and $P_k$ is irreducible in $L^2(\mathbb R^3)$, Schur's lemma implies that
$$L= e^{i\gamma}I$$
for some fixed real $\gamma$. Namely,
$$V = e^{i\gamma}U\:.$$
To conclude, it is enough to find a unitary operator satisfying both (1) and (2). By direct inspection one sees that
$$(U\psi)(x) = \psi(-x) \tag{4}$$
does the job.
All remaining possibilities are included in the arbitrary phase $e^{i\gamma}$.
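Requirements (1) and (2) can be checked for the reflection operator on a discretised line. The following is a rough one-dimensional sketch, assuming NumPy, a grid symmetric about $x=0$, units with $\hbar = 1$, and a central-difference approximation of $P = -i\,\mathrm{d}/\mathrm{d}x$ with zero boundary values. These discretisation choices are mine, and the finite matrices only approximate the true operators (in particular they cannot satisfy the CCR exactly), but the sign flips under conjugation hold exactly on such a grid.

```python
import numpy as np

n = 201                                   # number of grid points
x = np.linspace(-5.0, 5.0, n)             # grid symmetric about x = 0
h = x[1] - x[0]

# Position operator: multiplication by x.
X = np.diag(x).astype(complex)

# Momentum operator: -i d/dx via central differences (zero boundary values).
D = (np.diag(np.ones(n - 1), 1) - np.diag(np.ones(n - 1), -1)) / (2 * h)
P = -1j * D

# Parity: (U psi)(x) = psi(-x) reverses the order of the samples.
U = np.eye(n)[::-1].astype(complex)
U_inv = U.conj().T                        # U is unitary, so U^{-1} = U^dagger

flips_position = np.allclose(U @ X @ U_inv, -X)    # requirement (1)
flips_momentum = np.allclose(U @ P @ U_inv, -P)    # requirement (2)

# Action on a test function: a Gaussian centred at x = +1 moves to x = -1.
psi = np.exp(-(x - 1.0) ** 2)
reflected = U @ psi
print(flips_position, flips_momentum)
```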
Choice (4) has a nice further property, shared with only one other possibility,
$$(U\psi)(x) = -\psi(-x) \tag{4'}$$
In both situations (and only for these choices of the phase),
$$UU=I\:.$$
Since we already know that $U^{-1}=U^\dagger$, we conclude that
$$U= U^{-1}= U^\dagger\:.$$
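To see why only these two phases work, here is the one-line computation, writing $U_0$ (a symbol introduced just for this remark) for the operator defined in (4). Since $(U_0U_0\psi)(x) = \psi(x)$, a general solution $e^{i\gamma}U_0$ satisfies

$$\left(e^{i\gamma}U_0\right)^2 = e^{2i\gamma}\,U_0U_0 = e^{2i\gamma} I\:,$$

which equals $I$ exactly when $e^{i\gamma} = \pm 1$, that is, for (4) and (4').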
In other words, the said choices of the phase make $U$ self-adjoint and hence an observable, which plays an important role in particle physics (it can be conserved or not depending on the Hamiltonian).
Extending the notion of particle by including the spin, the Hilbert space enlarges to $L^2(\mathbb R^3)\otimes {\mathbb C}^{2s+1}$, where the second factor carries an irreducible representation of $SU(2)$ generated by the three spin operators $S_k$, $k=1,2,3$. An analysis similar to the previous one can be developed by adding to (1) and (2) the further requirement
$$U S_kU^{-1} = S_k\:, \quad k =1,2,3,$$
together with spin commutation relations
$$[S_k,S_h]= i \sum_{p=1}^3\epsilon_{khp} S_p\:,$$
but I stop here.
The precise statement of "self-adjoint operators generate continuous unitary symmetries" is Stone's theorem. It guarantees that there is a bijection between self-adjoint operators $O$ on a Hilbert space and unitary strongly continuous one-parameter groups $U(t)$ that is given by $O\mapsto \mathrm{e}^{\mathrm{i}tO}$.
The definition of the exponential for an unbounded self-adjoint operator requires theorems from Borel functional calculus that say that for every measurable function $f$ on the reals the expression $f(O)$ for $O$ a self-adjoint operator defines a unique operator with the property that $f(O)v_\lambda = f(\lambda) v_\lambda$ for every eigenstate $v_\lambda$ with eigenvalue $\lambda$. Naively, you might even take this as the definition of $f(O)$.
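In finite dimensions none of the subtleties of unbounded operators arise, and the "naive" definition can be carried out literally. The sketch below, assuming NumPy and an arbitrary $2\times 2$ Hermitian matrix of my choosing, builds $f(O)$ from the eigendecomposition, forms $U(t) = \mathrm{e}^{\mathrm{i}tO}$, and checks the group law together with the recovery of $O$ as the derivative at $t=0$. The helper name `apply_function` is made up for this illustration.

```python
import numpy as np

def apply_function(f, o):
    """f(O) for Hermitian O: apply f to the eigenvalues, keep the eigenvectors."""
    lam, v = np.linalg.eigh(o)
    return (v * f(lam)) @ v.conj().T

# Example Hermitian matrix (an arbitrary choice for the demonstration).
O = np.array([[1.0, 2.0 - 1.0j],
              [2.0 + 1.0j, -0.5]])

def U(t):
    """One-parameter unitary group generated by O."""
    return apply_function(lambda lam: np.exp(1j * t * lam), O)

# Group law U(t) U(s) = U(t + s) and unitarity.
group_law_holds = np.allclose(U(0.3) @ U(0.5), U(0.8))
is_unitary = np.allclose(U(0.3) @ U(0.3).conj().T, np.eye(2))

# The generator is recovered as the derivative at t = 0, divided by i.
eps = 1e-6
generator = (U(eps) - np.eye(2)) / (1j * eps)
generator_error = np.linalg.norm(generator - O)
print(group_law_holds, is_unitary, generator_error)
```

For genuinely unbounded operators the eigenvector picture has to be replaced by the spectral measure, which is what the theorems cited here provide.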
You can find the proofs of these assertions for example in chapter VIII of "Methods of Modern Mathematical Physics" by Reed and Simon.