Consider $\Gamma = SL_2(\mathbb{Z})$. Remember that $\gamma = \begin{pmatrix} a & b \cr c& d \end{pmatrix} \in \Gamma$ operates on the upper half plane by $T_\gamma : z \mapsto \dfrac{az+b}{cz+d}$. Write $\pi: H \to H/\Gamma$ for the quotient map.
What is a (meromorphic) differential form $\omega$ on $H/\Gamma$? It should be nothing but a (meromorphic) differential form $\tilde{\omega} := \pi^*\omega$ on $H$ which is invariant under $\Gamma$. Writing $\tilde{\omega} = f(z)dz$ this reads
$$
T_\gamma^*\tilde{\omega}
= T_\gamma^*(f(z)dz)
= f(T_\gamma(z))dT_\gamma(z)
= f\left(\dfrac{az+b}{cz+d} \right) d\left(\dfrac{az+b}{cz+d} \right)
= f\left(\dfrac{az+b}{cz+d} \right) \dfrac{ad-bc}{(cz+d)^2} dz
$$
So the invariance property $T_\gamma^*\tilde{\omega} = \tilde{\omega}$ translates to $f\left(\dfrac{az+b}{cz+d} \right) = (cz+d)^2f(z)$, which should be familiar.
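For completeness, the last equality in the computation of $T_\gamma^*\tilde{\omega}$ is just the quotient rule. Spelled out as a worked step in the same notation:
$$
\frac{d}{dz}\left(\frac{az+b}{cz+d}\right)
= \frac{a(cz+d) - c(az+b)}{(cz+d)^2}
= \frac{ad-bc}{(cz+d)^2}
= \frac{1}{(cz+d)^2},
$$
where the final step uses $ad-bc=1$ for $\gamma\in SL_2(\mathbb{Z})$.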
So basically modular forms of weight 2 correspond to differential forms on the space $H/\Gamma$ parametrizing complex elliptic curves.
Obviously I omitted some important details, like the behaviour at infinity (the so-called cusps) or how to interpret higher-weight modular forms (those of weight $2n$ correspond to sections $f(z)(dz)^n \in H^0(H/\Gamma,(\Omega^1_{H/\Gamma})^{\otimes n})$ of the tensor product sheaf). But I hope the principle is clear.
The definition of a modular form seems extremely unmotivated, and as @AndreaMori has pointed out, whilst the complex analytic approach gives us the quickest route to a definition, it also clouds some of what is really going on.
A good place to start is with the theory of elliptic curves, which have long been objects of geometric and arithmetic interest. One definition of an elliptic curve (over $\mathbb C$) is a quotient of $\mathbb C$ by a lattice $\Lambda = \mathbb Z\tau_1\oplus\mathbb Z\tau_2$, where $\tau_1,\tau_2\in\mathbb C$ are linearly independent over $\mathbb R$ ($\mathbb C$ and $\Lambda$ are viewed as additive groups): i.e.
$$E\cong \mathbb C/\Lambda.$$
In this viewpoint, one can study elliptic curves by studying lattices $\Lambda\subset\mathbb C$. Modular forms will correspond to certain functions of lattices, and by extension, to certain functions of elliptic curves.
Why the upper half plane?
For simplicity, since $\mathbb Z\tau_1 = \mathbb Z(-\tau_1)$, there's no harm in assuming that $\frac{\tau_1}{\tau_2}\in \mathbb H$.
What about $\mathrm{SL}_2(\mathbb Z)$?
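When do $(\tau_1,\tau_2)$ and $(\tau_1',\tau_2')$, both normalised so that the ratio lies in $\mathbb H$, define the same lattice? Exactly when
$$(\tau_1',\tau_2')=(a\tau_1+b\tau_2,c\tau_1+d\tau_2)$$ where $\begin{pmatrix}a&b\\c&d\end{pmatrix}\in\mathrm{SL}_2(\mathbb Z)$. (Without the normalisation one would get $\mathrm{GL}_2(\mathbb Z)$; determinant $-1$ flips the ratio into the lower half plane.) Hence, if we want to consider functions on lattices, they had better be invariant under $\mathrm{SL}_2(\mathbb Z)$.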
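A quick numerical sanity check of this change of basis may help. The sketch below is only an illustration: the helper `change_basis`, the sample basis, and the matrix are my own choices, not part of the argument. It applies a matrix of determinant 1, recovers the old basis with the (integer) inverse matrix, and confirms that the new ratio still lies in $\mathbb H$.

```python
def change_basis(t1, t2, a, b, c, d):
    """Return the basis (a*t1 + b*t2, c*t1 + d*t2); the name is hypothetical."""
    return a * t1 + b * t2, c * t1 + d * t2

# Sample data (assumed for illustration): a basis with t1/t2 in the upper half plane
t1, t2 = 0.3 + 1.1j, 1 + 0j
a, b, c, d = 2, 1, 1, 1
assert a * d - b * c == 1  # the matrix lies in SL_2(Z)

# New basis vectors are integer combinations of the old ones
s1, s2 = change_basis(t1, t2, a, b, c, d)

# The inverse matrix (d, -b; -c, a) is again integral, so the old basis
# is an integer combination of the new one: both bases span the same lattice
r1, r2 = change_basis(s1, s2, d, -b, -c, a)

print(abs(r1 - t1), abs(r2 - t2))  # both should be at rounding-error level
print((s1 / s2).imag > 0)          # the new ratio stays in the upper half plane
```

The second check reflects the identity $\operatorname{Im}\frac{a\tau+b}{c\tau+d} = \frac{\operatorname{Im}\tau}{|c\tau+d|^2}$ when $ad-bc=1$.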
Functions on lattices:
Suppose we have a function $$F:\{\text{Lattices}\}\to\mathbb C.$$ First observe that multiplying a lattice by a non-zero scalar (i.e. $\lambda\Lambda$ for $\lambda\in\mathbb C^\times$) amounts to rotating and rescaling the lattice. So our function shouldn't do anything crazy to rescaled lattices.
In fact, since we really care about elliptic curves, and $\mathbb C/\Lambda\cong\mathbb C/\lambda\Lambda$ under the isomorphism $z\mapsto \lambda z$, $F$ should be completely invariant under such rescalings - i.e. we should insist that
$$F(\lambda \Lambda) = F(\Lambda).$$
However, if we define $F$ like this, we are forced to insist that $F$ has no poles. This is needlessly restrictive. So what we do instead is require that
$$F(\lambda\Lambda) = \lambda^{-k}F(\Lambda)$$
for some integer $k$; the quotient $F/G$ of two weight $k$ functions gives a fully invariant function, this time with poles allowed.
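A standard example of such a function (not mentioned above, so take it as an illustration) is the Eisenstein sum $G_k(\Lambda)=\sum_{\omega\in\Lambda\setminus\{0\}}\omega^{-k}$ for even $k\ge 4$. The sketch below truncates the sum to $|m|,|n|\le N$ and checks the scaling rule numerically; the function name `eisenstein`, the cutoff `N`, and the sample values are assumptions chosen for the demonstration.

```python
import cmath

def eisenstein(omega1, omega2, k=4, N=30):
    """Truncated lattice sum of w**(-k) over nonzero w = m*omega1 + n*omega2
    with |m|, |n| <= N. Only an approximation to the full Eisenstein sum."""
    total = 0j
    for m in range(-N, N + 1):
        for n in range(-N, N + 1):
            if m == 0 and n == 0:
                continue  # skip the origin
            total += (m * omega1 + n * omega2) ** (-k)
    return total

k = 4
tau1, tau2 = 0.3 + 1.1j, 1.0 + 0j   # sample basis, tau1/tau2 in the upper half plane
lam = 2.0 * cmath.exp(0.7j)         # a rotation combined with a rescaling

lhs = eisenstein(lam * tau1, lam * tau2, k)     # F(lambda * Lambda)
rhs = lam ** (-k) * eisenstein(tau1, tau2, k)   # lambda^{-k} * F(Lambda)
print(abs(lhs - rhs))  # difference should be at rounding-error level
```

Because scaling the basis scales every term of the truncated sum by the same factor $\lambda^{-k}$, the two sides agree up to floating-point rounding, independently of the cutoff.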
Where do modular forms come in?
If $\Lambda = \mathbb Z\tau\oplus\mathbb Z$ with $\tau\in\mathbb H$, define a function $f:\mathbb H\to\mathbb C$ by $f(\tau)=F(\Lambda)$. For a general lattice, we have
$$\begin{align}F(\mathbb Z\tau_1\oplus\mathbb Z\tau_2)&=F\left(\tau_2(\mathbb Z({\tau_1}/{\tau_2})\oplus\mathbb Z)\right)\\
&=\tau_2^{-k}f({\tau_1}/{\tau_2})
\end{align}$$
and in particular,
$$\begin{align}f(\tau) &= F(\mathbb Z\tau\oplus\mathbb Z) \\&=F(\mathbb Z(a\tau+b)\oplus\mathbb Z(c\tau+d)) &\text{by }\mathrm{SL}_2(\mathbb Z)\text{ invariance}\\&= (c\tau+d)^{-k} f\left(\frac{a\tau+b}{c\tau+d}\right).\end{align}$$
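The transformation law can also be checked numerically. The following is a rough sketch using a truncated Eisenstein sum as a stand-in for $f$; the function `G`, the cutoff `N`, the matrix, and the sample point are illustrative assumptions. Since an $\mathrm{SL}_2(\mathbb Z)$ change of basis shears the truncation square, the two sides agree only up to truncation error.

```python
def G(tau, k=4, N=120):
    """Truncated sum of (m*tau + n)**(-k) over (m, n) != (0, 0), |m|, |n| <= N.
    Approximates f(tau) = F(Z*tau + Z) for the Eisenstein lattice function."""
    total = 0j
    for m in range(-N, N + 1):
        for n in range(-N, N + 1):
            if m or n:
                total += (m * tau + n) ** (-k)
    return total

k = 4
a, b, c, d = 2, 1, 1, 1            # sample matrix with ad - bc = 1
tau = 0.3 + 1.1j                   # sample point in the upper half plane
gamma_tau = (a * tau + b) / (c * tau + d)

lhs = G(tau, k)                                  # f(tau)
rhs = (c * tau + d) ** (-k) * G(gamma_tau, k)    # (c*tau + d)^{-k} f(gamma tau)
rel_err = abs(lhs - rhs) / abs(lhs)
print(rel_err)  # small, and shrinking as N grows
```

Increasing `N` (or taking a larger even `k`, where the sum converges faster) should tighten the agreement.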
This answers your first two questions.
At this point, there's no reason to assume that condition (3) holds, and one can study such functions without assuming condition (3). However, imposing cusp conditions is a useful thing to do, as it ensures that the space of weight $k$ modular forms is finite dimensional.
To answer your fourth question, yes, and this is exactly the viewpoint taken in most research done on modular forms and their generalisations, where one considers automorphic representations.
Best Answer
It means that the map is biholomorphic. The map is the well-known Cayley transform. If you want groups, there is a generalization $\Phi\colon G\rightarrow {\rm Lie}(G)$ defined on certain algebraic groups and Lie groups; for this, see the interesting article by Kostant and Michor.