So, let us suppose that $A$ is a square matrix, and that $B$ is a matrix such that $BA=I$. We want to show that $B$ is the unique left inverse of $A$ (that is, if $CA=I$, then $C=B$).
Note that a system $A\mathbf{x}=\mathbf{b}$ has at most one solution, namely $B\mathbf{b}$: if $A\mathbf{x}=\mathbf{b}$, then
$$\mathbf{x} = I\mathbf{x} = BA\mathbf{x} = B\mathbf{b}.$$
If $CA=I$, then again a system $A\mathbf{x}=\mathbf{b}$ has at most one solution, namely $C\mathbf{b}$. Thus, $B\mathbf{b}=C\mathbf{b}$ for any $\mathbf{b}$ for which the system has a solution.
If we can show that $A\mathbf{x}=\mathbf{e}_i$ has a solution for each $i$, where $\mathbf{e}_i$ is the $i$th standard basis vector ($1$ in the $i$th entry, $0$s elsewhere), this will show that $B=C$: the $i$th columns of $B$ and $C$ are $B\mathbf{e}_i$ and $C\mathbf{e}_i$, so the two matrices have the same columns.
The system $A\mathbf{x}=\mathbf{0}$ always has the solution $\mathbf{x}=\mathbf{0}$, and by the argument above it has no other (any solution must equal $B\mathbf{0}=\mathbf{0}$). So row reduction of $A$ produces no free variables: every column contains a pivot, and because $A$ is square, the reduced row-echelon form of $A$ is $I$. Consequently, performing row reduction on the augmented matrix $[A|\mathbf{e}_i]$ yields $[I|\mathbf{y}]$ for some $\mathbf{y}$, and $\mathbf{y}$ is the solution to $A\mathbf{x}=\mathbf{e}_i$. As noted above, this vector equals both $\mathbf{b}_i=B\mathbf{e}_i$ (the $i$th column of $B$) and $\mathbf{c}_i=C\mathbf{e}_i$ (the $i$th column of $C$), so the $i$th columns of $B$ and $C$ are equal for every $i$; thus $B=C$, and $A$ has a unique left inverse.
Now, let us suppose that $A$ is a square matrix and has a right inverse, $AB=I$. We want to show that $B$ is the unique right inverse of $A$. Taking transposes, we get $I = I^T = (AB)^T = B^TA^T$. By what was proven above, $B^T$ is the unique left inverse of $A^T$. If $AC=I$, then $C^TA^T=I^T = I$, so $C^T=B^T$, hence $C=B$. Thus, $B$ is the unique right inverse of $A$.
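If it helps to see the argument in action, here is a minimal sketch in Python that row-reduces $[A|I]$, which amounts to solving $A\mathbf{x}=\mathbf{e}_i$ for every $i$ at once, and then checks that the resulting matrix is both a left and a right inverse. The helper names `rref` and `matmul` and the sample matrix are my own choices for illustration, not part of the proof.

```python
from fractions import Fraction

def rref(rows):
    """Return the reduced row echelon form of a matrix (list of rows)."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_rows, n_cols = len(m), len(m[0])
    pivot_row = 0
    for col in range(n_cols):
        # Look for a row at or below pivot_row with a nonzero entry in this column.
        pr = next((r for r in range(pivot_row, n_rows) if m[r][col] != 0), None)
        if pr is None:
            continue  # no pivot in this column
        m[pivot_row], m[pr] = m[pr], m[pivot_row]
        # Scale the pivot row so that the pivot becomes 1.
        p = m[pivot_row][col]
        m[pivot_row] = [x / p for x in m[pivot_row]]
        # Clear the pivot column in every other row.
        for r in range(n_rows):
            if r != pivot_row and m[r][col] != 0:
                f = m[r][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[pivot_row])]
        pivot_row += 1
        if pivot_row == n_rows:
            break
    return m

def matmul(X, Y):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

# Sample matrix, chosen only for illustration.
A = [[2, 1], [5, 3]]
n = len(A)
I = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

# Row-reduce [A | I]; the right half then holds the solutions of A x = e_i.
aug = rref([row + I[i] for i, row in enumerate(A)])
left = [r[:n] for r in aug]   # should be the identity
B = [r[n:] for r in aug]      # candidate inverse

print(left == I, matmul(B, A) == I, matmul(A, B) == I)
```

Exact fractions are used so that the comparisons with $I$ are not disturbed by floating-point rounding.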
Gaussian elimination puts a matrix in row echelon form, while Gauss-Jordan elimination puts it in reduced row echelon form. For small systems (or by hand), it is usually more convenient to use Gauss-Jordan elimination and read off each variable directly. However, Gaussian elimination followed by back substitution generally takes fewer arithmetic operations, which matters when a computer is solving large systems. Also, Gaussian elimination is all you need to determine the rank of a matrix (the number of nonzero rows in its row echelon form); going to the trouble of reaching reduced row echelon form is not worth it if the rank is all you want.
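As a rough sketch of the rank remark, the following Python code stops at row echelon form (it never scales pivots to 1 or clears entries above them) and counts the nonzero rows. The function names `row_echelon` and `rank` and the sample matrix are assumptions of mine, chosen for illustration.

```python
from fractions import Fraction

def row_echelon(rows):
    """Forward elimination only: return a row echelon form of the matrix."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_rows, n_cols = len(m), len(m[0])
    pivot_row = 0
    for col in range(n_cols):
        if pivot_row == n_rows:
            break
        # Look for a row at or below pivot_row with a nonzero entry in this column.
        pr = next((r for r in range(pivot_row, n_rows) if m[r][col] != 0), None)
        if pr is None:
            continue  # no pivot in this column
        m[pivot_row], m[pr] = m[pr], m[pivot_row]
        # Eliminate the entries below the pivot; the pivot itself is left unscaled.
        for r in range(pivot_row + 1, n_rows):
            f = m[r][col] / m[pivot_row][col]
            m[r] = [a - f * b for a, b in zip(m[r], m[pivot_row])]
        pivot_row += 1
    return m

def rank(rows):
    """Rank = number of nonzero rows in a row echelon form."""
    return sum(1 for row in row_echelon(rows) if any(x != 0 for x in row))

# Sample matrix: the second row is twice the first, so the rank should drop.
M = [[1, 2, 3], [2, 4, 6], [1, 0, 1]]
print(row_echelon(M))
print(rank(M))
```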
EDIT:
Here are some abbreviations to start off with:
REF = "Row Echelon Form". RREF = "Reduced Row Echelon Form."
In your question, you say you reduce a matrix $A$ to a diagonal matrix where every nonzero value equals 1. For this to happen, you must perform row operations to "pivot" on each entry along the diagonal. Such row operations usually involve multiplying or dividing a row by a nonzero scalar, or adding or subtracting a scalar multiple of one row to or from another row. My interpretation of REF is doing row operations in a way that avoids dividing rows by their pivot values (which is what would make each pivot become 1). Once only the diagonal entries remain, if you go through each pivot (the numbers along the diagonal) and divide its row by that leading coefficient, you will end up in RREF. See these Khan Academy videos for worked examples.
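To make that last step concrete, here is a small illustration with numbers I picked arbitrarily: a diagonal matrix whose pivots are not yet 1, and the result of dividing each row by its pivot.

$$\begin{bmatrix} 2 & 0 & 0 \\ 0 & -3 & 0 \\ 0 & 0 & 5 \end{bmatrix} \xrightarrow{\;R_1 \to \frac{1}{2}R_1,\ \ R_2 \to -\frac{1}{3}R_2,\ \ R_3 \to \frac{1}{5}R_3\;} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

The matrix on the left is already in REF in the sense described above; only the one on the right is in RREF.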
In a square system $Ax=B$, there is exactly one solution $x$ for every right-hand side $B$ if and only if $A$ is invertible. Invertible matrices have several important properties. The most useful property for your question is that their RREF is the identity matrix (a matrix with only 1's down the diagonal and 0's everywhere else). If you row-reduce a square matrix and its RREF is not the identity matrix, then that matrix was non-invertible. Non-invertible matrices (also known as singular matrices) are not as helpful when trying to solve a system exactly: the system then has either no solution or infinitely many.
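For instance, take the following singular matrix (the entries are just an illustration I chose). Subtracting twice the first row from the second gives its RREF, which is not the identity:

$$\begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix} \xrightarrow{\;R_2 \to R_2 - 2R_1\;} \begin{bmatrix} 1 & 2 \\ 0 & 0 \end{bmatrix}$$

With the right-hand side $(1, 2)^T$, the system reduces to the single equation $x + 2y = 1$, which has infinitely many solutions. With the right-hand side $(1, 3)^T$, the same row operation turns the second equation into $0 = 1$, so there is no solution at all.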
Best Answer
Well yes! It is totally fine to use the method. Even if you have a matrix of size $3 \times 2$, the logic of the method remains the same: you must bring the matrix to a triangular (echelon) form, so that in the new, equivalent system of equations some of the variables have been eliminated.
The best way to see this is to try an example yourself.
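Here is one possible example to try, with a system I made up for illustration: three equations in two unknowns, $x + y = 3$, $x - y = 1$, $2x + y = 5$, so the coefficient matrix is $3 \times 2$. Eliminating below the first pivot and then below the second gives

$$\left[\begin{array}{cc|c} 1 & 1 & 3 \\ 1 & -1 & 1 \\ 2 & 1 & 5 \end{array}\right] \xrightarrow{\;R_2 \to R_2 - R_1,\ \ R_3 \to R_3 - 2R_1\;} \left[\begin{array}{cc|c} 1 & 1 & 3 \\ 0 & -2 & -2 \\ 0 & -1 & -1 \end{array}\right] \xrightarrow{\;R_3 \to R_3 - \frac{1}{2}R_2\;} \left[\begin{array}{cc|c} 1 & 1 & 3 \\ 0 & -2 & -2 \\ 0 & 0 & 0 \end{array}\right]$$

The second row says $-2y = -2$, so $y = 1$, and the first row then gives $x = 3 - y = 2$. The last row reads $0 = 0$, which means the third equation was consistent with the other two; had it reduced to something like $0 = 1$, the system would have had no solution.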