There are a couple of ways of doing this.
1) Brute force. Changing notation, write $e_1 = i, e_2 = j, e_3 = k$. Then the rules above become $e_i e_j = - e_j e_i$ for $i\not = j$ and $e_i^2 = -1$. We want to show that $(e_i e_j)e_k = e_i(e_j e_k)$, and we can at least assume without loss of generality that $i\leq j \leq k$. Some further symmetry will reduce the number of cases you have to check, but it still requires some explicit computation.
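If you'd rather let a machine do the case checking, here is a sketch in Python (my own encoding: an element is a pair `(sign, symbol)`, and the table below just transcribes the rules $ij=k$, $ji=-k$, $i^2=-1$, etc.):

```python
from itertools import product

# Base multiplication table on {1, i, j, k}; each entry is (sign, symbol).
BASE = {
    ("1", "1"): (1, "1"), ("1", "i"): (1, "i"), ("1", "j"): (1, "j"), ("1", "k"): (1, "k"),
    ("i", "1"): (1, "i"), ("i", "i"): (-1, "1"), ("i", "j"): (1, "k"), ("i", "k"): (-1, "j"),
    ("j", "1"): (1, "j"), ("j", "i"): (-1, "k"), ("j", "j"): (-1, "1"), ("j", "k"): (1, "i"),
    ("k", "1"): (1, "k"), ("k", "i"): (1, "j"), ("k", "j"): (-1, "i"), ("k", "k"): (-1, "1"),
}

def mul(x, y):
    """Multiply two elements (sign, symbol) of Q using the table."""
    sx, bx = x
    sy, by = y
    s, b = BASE[(bx, by)]
    return (sx * sy * s, b)

# All eight elements of Q, and the associativity check over all 512 triples.
Q = [(s, b) for s in (1, -1) for b in "1ijk"]
assert all(mul(mul(x, y), z) == mul(x, mul(y, z))
           for x, y, z in product(Q, repeat=3))
```

This is exactly the brute-force check, just with the symmetry arguments traded for exhaustion over all $8^3$ triples.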
2) Introducing the Levi-Civita symbol $\epsilon_{ijk}$ and the Kronecker delta $\delta_{ij}$, we can rewrite the defining rules for multiplication in $Q$ as
\begin{align*}
e_i e_j = \epsilon_{ijk} e_k - \delta_{ij}
\end{align*}
(with implicit summation over $k$). Now just write out $e_i (e_j e_k)$ and $(e_i e_j) e_k$, churn through the calculus, and show that they agree. (Proving associativity for products involving the identity $1$ or the central element $-1$ is trivial.) The trick is noting that
\begin{align*}
\delta_{ij} e_j &= e_i &
\epsilon_{ijk} &= -\epsilon_{jik} &
\epsilon_{ijk}\epsilon_{iqr} &= \delta_{jq}\delta_{kr} - \delta_{jr}\delta_{kq}
\end{align*}
(with implicit summation throughout), which allows us to reduce such expressions formally.
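For instance, here is a sketch of the reduction for $i, j, k \in \{1,2,3\}$ (repeated indices summed; I'm using the cyclic symmetry $\epsilon_{abc} = \epsilon_{cab}$ to put the contracted index first before applying the $\epsilon\epsilon = \delta\delta - \delta\delta$ identity):
\begin{align*}
(e_i e_j) e_k &= \epsilon_{ijl}\, e_l e_k - \delta_{ij} e_k
= \epsilon_{ijl}\epsilon_{lkm}\, e_m - \epsilon_{ijk} - \delta_{ij} e_k
= \delta_{ik} e_j - \delta_{jk} e_i - \epsilon_{ijk} - \delta_{ij} e_k, \\
e_i (e_j e_k) &= \epsilon_{jkl}\, e_i e_l - \delta_{jk} e_i
= \epsilon_{jkl}\epsilon_{ilm}\, e_m - \epsilon_{ijk} - \delta_{jk} e_i
= \delta_{ik} e_j - \delta_{ij} e_k - \epsilon_{ijk} - \delta_{jk} e_i,
\end{align*}
and the two right-hand sides agree term by term.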
This method is probably overkill, but you'll do this sort of thing countless times if you take a quantum mechanics class (or have done it countless times if you have taken a quantum mechanics class), so you might as well do it now.
3) So, where does this group actually come from? There's a quaternion algebra $\mathbb{H}$, defined as an algebra over $\mathbb{R}$ with basis $1, e_1, e_2, e_3$ satisfying the same relations as for $Q$. (It's clear that $Q$ is associative iff $\mathbb{H}$ is, which would be more useful to us if we knew that $\mathbb{H}$ is associative.) The action of $\mathbb{H}$ on itself induces an embedding $\rho:Q \to SU(2)$. Unwinding this map gives an embedding
\begin{align*}
1 &\mapsto \begin{pmatrix}1 & 0 \\ 0 & 1\end{pmatrix} &
e_1 &\mapsto \begin{pmatrix} i & 0 \\ 0 & -i\end{pmatrix} &
e_2 &\mapsto \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} &
e_3 &\mapsto \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix}
\end{align*}
with $\rho(-x) = -\rho(x)$ for all $x$. (See also the Pauli matrices, which are just the same computation on the physics side of things.) The point is that associativity is clear, since $SU(2)$ is a group. I'm not going to slog through the computation here, but there's a similar faithful representation $Q \to SL_2(\mathbb{F}_3)$.
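The claimed relations among these matrices can be checked mechanically; a sketch in Python (the names `I2`, `E1`, `E2`, `E3` are mine, transcribing the matrices above with `1j` for $i$):

```python
# The images of 1, e_1, e_2, e_3 under rho, as tuples of rows.
I2 = ((1, 0), (0, 1))
E1 = ((1j, 0), (0, -1j))
E2 = ((0, 1), (-1, 0))
E3 = ((0, 1j), (1j, 0))

def mm(a, b):
    """2x2 matrix product."""
    return tuple(tuple(sum(a[r][t] * b[t][c] for t in range(2))
                       for c in range(2)) for r in range(2))

def neg(a):
    """Entrywise negation."""
    return tuple(tuple(-x for x in row) for row in a)

# The defining relations of Q hold in the image:
assert mm(E1, E1) == mm(E2, E2) == mm(E3, E3) == neg(I2)
assert mm(E1, E2) == E3 and mm(E2, E3) == E1 and mm(E3, E1) == E2
assert mm(E2, E1) == neg(E3)
```

Since matrix multiplication is associative, this closes the argument without any further case analysis.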
In this particular case, I don't think you can get away from doing some sort of computation, but it's good to know that this group $Q$ comes from somewhere.
Best Answer
Here are two ways to tell people about the quaternion group:
$\{\,1,i,j,k,-1,-i,-j,-k\,\}$ with $ij=k$, $jk=i$, $ki=j$, $ji=-k$, $kj=-i$, $ik=-j$, $i^2=j^2=k^2=-1$.
$\{\,1,a,a^2,a^3,b,ab,a^2b,a^3b\,\}$ with $a^4=1$, $b^2=a^2$, $ba=a^3b$.
You can see they describe the same group by, for example, using the relations in the first description to show that the elements can also be given as $\{\,1,i,i^2,i^3,j,ij,i^2j,i^3j\,\}$ with $i^4=1$, $j^2=i^2$, and $ji=i^3j$.
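If you want a quick mechanical version of that check, here is a sketch in Python (my own setup: a quaternion is a coefficient 4-tuple $(a,b,c,d)$ for $a + bi + cj + dk$, and `qmul` is the usual quaternion product):

```python
def qmul(p, q):
    """Product of quaternions p = a + bi + cj + dk and q = e + fi + gj + hk."""
    a, b, c, d = p
    e, f, g, h = q
    return (a*e - b*f - c*g - d*h,
            a*f + b*e + c*h - d*g,
            a*g - b*h + c*e + d*f,
            a*h + b*g - c*f + d*e)

def qpow(x, n):
    """n-th power of x, starting from 1."""
    out = (1, 0, 0, 0)
    for _ in range(n):
        out = qmul(out, x)
    return out

one, i, j = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0)

# With a = i and b = j, the relations of the second description hold:
a, b = i, j
assert qpow(a, 4) == one                  # a^4 = 1
assert qmul(b, b) == qpow(a, 2)           # b^2 = a^2
assert qmul(b, a) == qmul(qpow(a, 3), b)  # ba = a^3 b
```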
Now your challenge is to do the same thing with a way of listing the elements as 2-by-2 matrices.
As for the question of why use this matrix and not that matrix, my advice is to try both ways and see what happens (and feel free to report back here on your findings).