Here's an answer to your last question. In homogeneous coordinates, $\newcommand{\P}{\mathbb P} \newcommand{\F}{\mathbb F} \P^1(\F_q)$ can be written as the set of all pairs $[X_0:X_1]$ with $X_0, X_1 \in \F_q$ not both $0$. The point $[X_0:X_1]$ corresponds to the line through the origin $X_0 x + X_1 y = 0$ in $\mathbb{A}^2(\F_q)$. Since the equations $X_0 x + X_1y = 0$ and $\lambda X_0 x + \lambda X_1 y = 0$ define the same line for any $\lambda \neq 0$, in terms of our coordinates we have $[X_0:X_1] = [\lambda X_0 : \lambda X_1]$ for all $\lambda \in \F_q^\times$. (It is in this sense that the coordinates are homogeneous.)
Note that if $X_1 \neq 0$ we can write $[X_0:X_1] = [X_0/X_1 : 1]$ and the set
$$
\{[X_0:X_1] \in \P^1(\F_q) \mid X_1 \neq 0\} = \{[z:1] \mid z \in \F_q\}
$$
can be identified with $\F_q$. Thus we have the stratification
$$
\P^1(\F_q) = \{[z:1] \mid z \in \F_q\} \cup \{[1:0]\} = \F_q \cup \{\infty\} \, .
$$
An element of $\mathrm{PSL}_2(\F_q)$, represented by a matrix $\gamma = \begin{pmatrix} a & b\\ c & d \end{pmatrix} \in \mathrm{SL}_2(\F_q)$, then acts by multiplication as you stated:
$$
\begin{pmatrix} a & b\\ c & d \end{pmatrix}
\begin{pmatrix} X_0\\ X_1\end{pmatrix}
= \begin{pmatrix} aX_0 + bX_1\\ cX_0 + dX_1\end{pmatrix}
$$
or to emphasize the fact that we're still using homogeneous coordinates:
$$
\begin{pmatrix} a & b\\ c & d \end{pmatrix} [X_0 : X_1] = [aX_0 + bX_1 : cX_0 + dX_1] \, .
$$
But when $X_1 \neq 0$ we have $[X_0:X_1] = [z:1]$ where $z = X_0/X_1$, so this expression can be rewritten
\begin{align*}
\begin{pmatrix} a & b\\ c & d \end{pmatrix}
[X_0:X_1]
= [aX_0 + bX_1 : cX_0 + dX_1] = [a(X_0/X_1) + b : c(X_0/X_1) + d] = [az + b : cz + d] \, .
\end{align*}
If in addition we have $cz + d \neq 0$, then we can divide through by $cz + d$ to get
$$
[az + b : cz + d] = \left[\frac{az + b}{cz + d} : 1 \right]
$$
which, under the identification mentioned above, can be identified with $\frac{az+b}{cz+d}$, recovering your original definition.
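To make the computation above concrete, here is a minimal Python sketch of the action on $\P^1(\F_p)$ for a prime $p$ (so that $\F_p$ is just the integers mod $p$). The function names `normalize`, `act` and `mobius` are made up for illustration, and `pow(x, -1, p)` for the modular inverse assumes Python 3.8 or later.

```python
def normalize(X0, X1, p):
    """Canonical representative of [X0:X1] in P^1(F_p): (z, 1) or (1, 0)."""
    X0 %= p
    X1 %= p
    if X0 == 0 and X1 == 0:
        raise ValueError("[0:0] is not a point of P^1")
    if X1 != 0:
        # [X0:X1] = [X0/X1 : 1]
        return (X0 * pow(X1, -1, p) % p, 1)
    # X1 == 0 and X0 != 0: this is the point at infinity [1:0]
    return (1, 0)


def act(gamma, point, p):
    """Matrix action [X0:X1] -> [a X0 + b X1 : c X0 + d X1]."""
    (a, b), (c, d) = gamma
    X0, X1 = point
    return normalize(a * X0 + b * X1, c * X0 + d * X1, p)


def mobius(gamma, z, p):
    """(az + b)/(cz + d) in F_p, or None when cz + d = 0 (the image is infinity)."""
    (a, b), (c, d) = gamma
    den = (c * z + d) % p
    if den == 0:
        return None
    return (a * z + b) * pow(den, -1, p) % p


p = 7
gamma = ((2, 1), (1, 1))  # determinant 1

for z in range(p):
    m = mobius(gamma, z, p)
    expected = (1, 0) if m is None else (m, 1)
    assert act(gamma, (z, 1), p) == expected
```

With these choices, $z = 6$ is sent to $\infty$ because $cz + d = 7 = 0$ in $\F_7$, and $\infty = [1:0]$ is sent to $[a:c] = [2:1]$, i.e. to $2$.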
I may add something later about how locally $\P^k$ looks like $\mathbb{A}^k = \mathbb{F}_q^k$, whose elements are just $k$-tuples, but hopefully this at least answers your last question.
See here, under "Properties": the first sentence is what you're looking for.
For a reference with a proof, you may want to check out Grätzer's Universal Algebra, or Burris and Sankappanavar's A Course in Universal Algebra. If you want a proof I can write it up here; it's not too complicated.
Added: here's the proof.
It's easy to see that an algebra is subdirectly irreducible if and only if it has a minimum nontrivial congruence, i.e. a least congruence among those different from the equality relation. (Otherwise, the intersection of all nontrivial congruences is trivial, which means our algebra embeds into the product of its quotients by nontrivial congruences, and this is clearly a subdirect product.)
Now let $A$ be an algebra, and for $a\neq b\in A$, $\Psi(a,b)$ a maximal congruence not containing $(a,b)$ (it exists by Zorn's lemma), and $\Theta(a,b)$ the least congruence containing $(a,b)$. Then, finally put $R(a,b)= \Psi(a,b)\vee \Theta(a,b)$ ($\vee$ in the sense of congruences).
Notice that any congruence strictly containing $\Psi(a,b)$ also contains $\Theta(a,b)$: by maximality of $\Psi(a,b)$ it must contain $(a,b)$, and $\Theta(a,b)$ is the least congruence that does. It follows that any congruence strictly containing $\Psi(a,b)$ also contains $R(a,b)$, and $R(a,b)$ itself strictly contains $\Psi(a,b)$ since it contains $(a,b)$. Therefore, by the correspondence theorem, the image of $R(a,b)$ is the minimum nontrivial congruence of $A/\Psi(a,b)$, so $A/\Psi(a,b)$ is subdirectly irreducible.
It just remains to check that $\bigcap_{a\neq b}\Psi(a,b)$ is the trivial congruence (this only matters when $A$ is not itself subdirectly irreducible).
If this is true, then $A\to \prod_{a\neq b}A/\Psi(a,b)$ will do the trick (where the indexing set is the same as above).
But if $a\neq b$, then $(a,b)\notin \Psi(a,b)$, so $(a,b)\notin \bigcap_{c\neq d}\Psi(c,d)$. Hence the intersection is indeed trivial, and we are done.
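As a sanity check, the construction can be run by hand on a small example. The sketch below is only an illustration, not part of the proof. It uses the cyclic group $\mathbb{Z}/n$ and assumes the standard fact that congruences on a group correspond to normal subgroups $H$, with $(a,b)$ in the congruence if and only if $a-b\in H$, so $\Psi(a,b)$ depends only on $a-b$. The names `subgroups`, `psi` and `monolith` are made up for this example.

```python
from functools import reduce


def subgroups(n):
    """All subgroups of Z/n as frozensets, one for each divisor d of n."""
    return [frozenset(range(0, n, d)) for d in range(1, n + 1) if n % d == 0]


def psi(n, diff):
    """A maximal subgroup of Z/n not containing diff (diff nonzero mod n)."""
    candidates = [H for H in subgroups(n) if diff % n not in H]
    # Pick a candidate that no other candidate strictly contains.
    return next(H for H in candidates if not any(H < K for K in candidates))


def monolith(n, H):
    """Least subgroup strictly containing H, or None if there is no least one."""
    above = [K for K in subgroups(n) if H < K]
    least = reduce(frozenset.intersection, above)
    return least if least in above else None


n = 6
psis = {d: psi(n, d) for d in range(1, n)}

# Each quotient (Z/n)/Psi has a minimum nontrivial congruence ...
assert all(monolith(n, H) is not None for H in psis.values())
# ... and the Psi's intersect trivially, so Z/n embeds in the product.
assert reduce(frozenset.intersection, psis.values()) == frozenset({0})
```

For $n = 6$ the only subgroups that occur as some $\Psi$ are $\{0,2,4\}$ and $\{0,3\}$, which recovers the familiar embedding $\mathbb{Z}/6 \hookrightarrow \mathbb{Z}/2 \times \mathbb{Z}/3$.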
$(1)$ Let $A=\begin{pmatrix} a & b\\ c & d \end{pmatrix}$ be any matrix in $M_2(\Bbb C)$, and set $h_A(z)=\dfrac{az+b}{cz+d}$. One can check that $h'_A(z)=\dfrac{\det A}{(cz+d)^2}$, so $h_A$ is nonconstant exactly when $\det A\neq 0$, which motivates taking $A\in {\rm GL}(2,\Bbb C)$. Now, if we multiply $A$ by any nonzero scalar, i.e. $B=\alpha A$, then $h_A=h_B$, since in $\frac{\alpha az+\alpha b}{\alpha cz+\alpha d}$ the $\alpha$s cancel out. Conversely, $h_A=h_{\rm id}$ forces $A$ to be a multiple of the identity, so the map $A\mapsto h_A$ has kernel $\{\alpha\cdot 1 : \alpha\in\Bbb C^\times\}$, and we get that the group $G$ of Möbius transformations is isomorphic to ${\rm GL}(2,\Bbb C)/\{\alpha\cdot 1\}={\rm PGL}(2,\Bbb C)$.
$(2)$ We can do something more. Since $\det A\neq 0$ and $\det(\alpha A)=\alpha^2\det A$, we can assume that $\det A=1$ by rescaling $A$ with a square root $\alpha$ of $(\det A)^{-1}$, which always exists in $\Bbb C$. Thus we can take $A\in {\rm SL}(2,\Bbb C)$, and consider the still surjective map ${\rm SL}(2,\Bbb C)\to G$, $A\mapsto h_A$. You can check this has kernel $\{1,-1\}$ (the scalar matrices $\alpha\cdot 1$ with $\alpha^2=1$), so that $G$ is isomorphic to ${\rm SL}(2,\Bbb C)/\{\pm 1\}={\rm PSL}(2,\Bbb C)$, too.
As you can see, there isn't much mystery to this, really.
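If you want to see the rescaling numerically, here is a small Python sketch (the helper names `h`, `det`, `scale` and `to_sl2` are made up for illustration). It rescales a matrix in ${\rm GL}(2,\Bbb C)$ by a scalar whose square is the inverse of the determinant, and checks that the Möbius map does not change. Floating-point equality is only approximate, hence the tolerance.

```python
import cmath


def h(A, z):
    """Moebius map z -> (az + b)/(cz + d) attached to A = ((a, b), (c, d))."""
    (a, b), (c, d) = A
    return (a * z + b) / (c * z + d)


def det(A):
    (a, b), (c, d) = A
    return a * d - b * c


def scale(alpha, A):
    """Entrywise scalar multiple alpha * A."""
    return tuple(tuple(alpha * x for x in row) for row in A)


def to_sl2(A):
    """Rescale A by a square root of 1/det(A), so the result has determinant 1."""
    return scale(1 / cmath.sqrt(det(A)), A)


A = ((1, 2), (3, 4))  # det A = -2
B = to_sl2(A)
z = 0.5 + 1j

assert abs(det(B) - 1) < 1e-12                 # B lies in SL(2, C)
assert abs(h(B, z) - h(A, z)) < 1e-12          # same Moebius map as A
assert abs(h(scale(-1, B), z) - h(B, z)) < 1e-12  # B and -B give the same map
```

The last line illustrates why the kernel on ${\rm SL}(2,\Bbb C)$ is $\{\pm 1\}$: both square roots of $(\det A)^{-1}$ work, and they differ by a sign.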