I found the answer in this book (in Section $6.4.14$, “Determinants, Ranks and Linear Equations”). I'd tried using a similar Laplace expansion myself but was missing the idea of using the largest dimension at which the minors are not all annihilated by the same non-zero element. I'll try to summarize the argument in somewhat less formal terms, omitting the tangential material included in the book.
Let $A$ be an $m\times n$ matrix over a commutative ring $R$. We want to find a condition for the system of equations $Ax=0$ with $x\in R^n$ to have a non-trivial solution. If $R$ is a field, the various definitions of the rank of $A$ coincide, including the column rank (the dimension of the column space), the row rank (the dimension of the row space) and the determinantal rank (the largest dimension of a non-vanishing minor). This is not the case for a general commutative ring. It turns out that for our present purposes a useful generalization of rank is the largest integer $k$ such that there is no non-zero element of $R$ that annihilates all minors of dimension $k$, with $k=0$ if there is no such integer.
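To see why a new notion is needed, here is a small example of my own (not from the book): over $R=\mathbb Z/6\mathbb Z$, take

$$A=\begin{pmatrix}2&0\\0&2\end{pmatrix}\;.$$

Here $\det A=4\ne0$, so the determinantal rank is $2$; but the non-zero element $3$ annihilates every entry (every minor of dimension $1$), so $k=0$, and indeed $Ax=0$ has the non-trivial solution $x=(3,3)^T$.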
We want to show that $Ax=0$ has a non-trivial solution if and only if $k\lt n$.
If $k=0$, there is a non-zero element $r\in R$ which annihilates all matrix elements (the minors of dimension $1$), so there is a non-trivial solution
$$A\begin{pmatrix}r\\\vdots\\r\end{pmatrix}=0\;.$$
Now assume $0\lt k\lt n$. If $m\lt n$, we can add rows of zeros to $A$ without changing $k$ or the solution set, so we can assume $k\lt n\le m$. There is some non-zero element $r\in R$ that annihilates all minors of dimension $k+1$, and there is a minor of dimension $k$ that isn't annihilated by $r$. Without loss of generality, assume that this is the minor of the first $k$ rows and columns. Now consider the matrix formed of the first $k+1$ rows and columns of $A$, and form a solution $x$ from the $(k+1)$-th column of its adjugate by multiplying it by $r$ and padding it with zeros. By construction, the first $k$ entries of $Ax$ are each $r$ times the determinant of a matrix with two equal rows, and thus vanish; the remaining entries are each $r$ times a minor of dimension $k+1$, and thus also vanish. But the $(k+1)$-th entry of $x$ is non-zero, being $r$ times the minor of the first $k$ rows and columns, which isn't annihilated by $r$. Thus we have constructed a non-trivial solution.
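To make the construction concrete, here is a quick sanity check in Python/SymPy on a toy example of my own over $\mathbb Z/6\mathbb Z$ (not from the book): for the matrix $A$ below, $r=1$ annihilates the only minor of dimension $2$ (namely $\det A=6\equiv 0$), while the leading minor of dimension $1$ (namely $2$) survives, so $k=1<n=2$.

```python
from sympy import Matrix

# Toy example over R = Z/6Z with k = 1 < n = 2: r = 1 annihilates
# the only 2-minor (det A = 6 = 0 mod 6), but not the leading 1-minor 2.
A = Matrix([[2, 0], [0, 3]])
r = 1

# The candidate solution is r times the (k+1)-th column of the adjugate
# of the leading (k+1) x (k+1) block of A (here that block is A itself).
x = (r * A.adjugate()[:, 1]).applyfunc(lambda e: e % 6)

print(x.T)                                 # Matrix([[0, 2]]): non-trivial
print((A * x).applyfunc(lambda e: e % 6))  # Matrix([[0], [0]]): Ax = 0
```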
In summary, if $k\lt n$, there is a non-trivial solution to $Ax=0$.
Now assume conversely that there is such a solution $x$. If $n\gt m$, there are no minors of dimension $n$, so $k\lt n$. Thus we can assume $n\le m$. The minors of dimension $n$ are the determinants of the matrices $B$ formed by choosing any $n$ rows of $A$. Since each row of $A$ times $x$ is $0$, we have $Bx=0$, and then multiplying by the adjugate of $B$ yields $(\det B)\,x=0$. Since the non-trivial solution $x$ has at least one non-zero entry, there is at least one non-zero element of $R$ that annihilates all minors of dimension $n$, and thus $k\lt n$.
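The adjugate step uses only the following identity, which holds over any commutative ring:

$$\operatorname{adj}(B)\,B=(\det B)\,I_n\quad\Longrightarrow\quad(\det B)\,x=\operatorname{adj}(B)\,(Bx)=0\;.$$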
Specializing to the case $m=n$ of square matrices, we can conclude:
A system of linear equations $Ax=0$ with a square $n\times n$ matrix $A$ over a commutative ring $R$ has a non-trivial solution if and only if its determinant (its only minor of dimension $n$) is annihilated by some non-zero element of $R$, that is, if its determinant is a zero divisor or zero.
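As a quick illustration of this corollary (my own example): over $R=\mathbb Z/6\mathbb Z$,

$$A=\begin{pmatrix}2&0\\0&1\end{pmatrix},\qquad\det A=2,\qquad 3\cdot 2=0,\qquad A\begin{pmatrix}3\\0\end{pmatrix}=\begin{pmatrix}6\\0\end{pmatrix}=0\;,$$

so the determinant is a zero divisor and $x=(3,0)^T$ is a non-trivial solution.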
The fraction ring (localization) $\rm\,S^{-1} R\,$ is, conceptually, the universal way of adjoining inverses of $\rm\,S\,$ to $\rm\,R.\,$ The simplest way to construct it is $\rm\,S^{-1} R = R[x_i]/(s_i x_i - 1).\,$ This allows one to exploit the universal properties of quotient rings and polynomial rings to quickly construct and derive the basic properties of localizations (avoiding the many tedious verifications always "left for the reader" in the more commonly presented pair approach). For details of this folklore see e.g. the exposition in section 11.1 of Rotman's Advanced Modern Algebra, or Voloch, Rings of fractions the hard way.
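For instance, in the simplest case of a single element $s$ (a sketch of my own, not spelled out in those references): in $\rm R[x]/(sx-1)$ the class of $x$ is by construction an inverse of $\rm s$, the fraction $\rm r/s^k$ corresponds to the class of $\rm r\,x^k$, and the universal property falls straight out of those of polynomial and quotient rings: any ring map $\varphi\colon R\to T$ with $\varphi(s)$ invertible extends uniquely via

$$R[x]/(sx-1)\,\longrightarrow\,T,\qquad x\,\longmapsto\,\varphi(s)^{-1}\;.$$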
Voloch's title ("the hard way") is likely a joke, since said presentation-based method is by far the easiest approach. In fact both Rotman's and Voloch's expositions can be simplified. Namely, the only non-obvious step in this approach is computing the kernel of $\rm\, R\to S^{-1} R,\,$ for which there is a nice trick:
$$\begin{aligned}
r &= (1-sx)\,f(x),\qquad n=\deg f\\
\Rightarrow\quad f(0) &= r &&\text{via coef } x^0\\
\Rightarrow\quad \bigl(1+sx+\cdots+(sx)^n\bigr)\,r &= \bigl(1-(sx)^{n+1}\bigr)\,f(x)\\
\Rightarrow\quad f(0)\,s^{n+1} &= 0 &&\text{via coef } x^{n+1}\\
\Rightarrow\quad r\,s^{n+1} &= 0
\end{aligned}$$
Therefore, if $\rm\,s\,$ is not a zero-divisor, then $\rm\,r = 0,\,$ so $\rm\, R\to S^{-1} R\,$ is an injection.
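A quick symbolic sanity check of the telescoping identity behind the third line (my addition, with $n=2$):

```python
from sympy import symbols, expand

s, x = symbols('s x')
n = 2  # stands in for deg f in the derivation above

# (1 + sx + ... + (sx)^n) * (1 - sx) telescopes to 1 - (sx)^(n+1),
# which is what turns r = (1 - sx) f(x) into the third line.
lhs = expand(sum((s*x)**i for i in range(n + 1)) * (1 - s*x))
print(lhs)  # -s**3*x**3 + 1, i.e. 1 - (s*x)**(n+1)
```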
For cultural background on universal ideas, see George Bergman's outstanding introduction An Invitation to General Algebra and Universal Constructions.
You might also find illuminating Paul Cohn's article Localization in general rings, a historical survey, as well as the other papers in that volume: Ranicki, A. (ed.), Noncommutative localization in algebra and topology, ICMS 2002.
Best Answer
[Disclaimer: I have to assume that you know a little about finite-dimensional vector spaces over a field. I don't see how you could approach this question otherwise.]
If $R$ is a field, then $M_n(R)$ is a vector space over $R$ under matrix addition and scalar multiplication. For each $A \in M_n(R)$, the mapping $L_A : M_n(R) \to M_n(R)$ such that $L_A(B) = AB$ for all $B$ is a linear transformation from $M_n(R)$ to itself. If $A$ is not a zero divisor, then $L_A$ is injective (because $L_A(B) = AB = 0$ only if $B = 0$, so $\ker(L_A) = \{0\}$). But $M_n(R)$ is finite-dimensional (with dimension $n^2$), so if $L_A$ is injective, it is also surjective, by the rank–nullity theorem. This implies that there is some $A' \in M_n(R)$ with $AA' = L_A(A') = I$ (where $I$ is the identity matrix). So we have that (i):
$$AA' = I$$
and that (ii):
$$L_A(A'A) = A(A'A) = (AA')A = IA = A = AI = L_A(I)$$
and so, since $L_A$ is injective, (ii) gives us (iii):
$$A'A = I$$
(i) and (iii) give us that $A$ is invertible with inverse $A'$.
The comments have dealt with things that can go wrong if $R$ is not assumed to be a field. E.g., if $R$ is only assumed to be an integral domain, $A$ could represent scalar multiplication by a non-zero scalar with no inverse.
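A concrete instance of that caveat: over $R=\mathbb{Z}$, take $A = 2I_n$. Then $AB = 2B = 0$ forces $B = 0$, so $A$ is not a zero divisor in $M_n(\mathbb{Z})$, yet its inverse would have to be $\tfrac{1}{2} I_n$, whose entries lie outside $\mathbb{Z}$, so $A$ is not invertible in $M_n(\mathbb{Z})$.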