If we have a system of linear equations $Ax=b$, then in the case that $A$ is invertible it is easy to say that the solution is $x=A^{-1}b$. In the general case, whether or not $A$ is invertible, the solution, if it exists, has the form $x=A^{+}b+(I-A^{+}A)w$, where $A^{+}$ is called the pseudoinverse of $A$ and $w$ is a vector of free parameters. My question here is how to compute $A^{+}$ for any matrix $A$. What is the way of doing that?

It is easy to compute $A^{+}$ if the columns are linearly independent (so that $m\ge n$), namely $A^{+}=(A^{T}A)^{-1}A^{T}$, and also if the rows are linearly independent (so that $m\le n$), namely $A^{+}=A^{T}(AA^{T})^{-1}$, but I don't know how to compute $A^{+}$ if $A$ is a square non-invertible matrix, or if the columns or rows are not linearly independent; the formulas above do not work then. If anyone has an idea about computing the pseudoinverse, or if there is an easier method for finding the solution $x$ of the system of linear equations $Ax=b$, please help me, and please keep in mind that I want the general form of the solution, if it exists, not just a solution for a particular $b$.
[Math] How to compute the Pseudoinverse of any Matrix
linear algebra, pseudoinverse
Related Solutions
An interpretive reformulation follows.
Initial solution
Start with a matrix $\mathbf{A}\in \mathbb{C}^{m\times n}$ and a sequence of measurements $\left\{ b_{k} \right\}_{k=1}^{m}$ such that $b\notin\mathcal{R}(\mathbf{A})$, the column space of $\mathbf{A}$. What is the best solution vector $x\in\mathbb{C}^{n}$ for the linear system $$ \mathbf{A} x = b $$ with respect to the $2$-norm?
You resolved the matrix in terms of its SVD components $$ \mathbf{A} = \mathbf{U} \, \Sigma \, \mathbf{V}^{*}, $$ and used them to fashion the Moore-Penrose pseudoinverse $$ \mathbf{A}^{\dagger} = \mathbf{V} \, \Sigma^{\dagger} \, \mathbf{U}^{*}, $$ to pose the least squares solution $$ x_{LS} = \mathbf{A}^{\dagger} b. $$
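As a rough numerical check of this recipe (the matrix $\mathbf{A}$ and the data vector $b$ below are invented for the example, and everything is kept real), one can form $x_{LS}=\mathbf{A}^{\dagger}b$ with NumPy and compare it against a dedicated least-squares solver:

```python
import numpy as np

# Invented example: a tall, full-column-rank A and a data vector b outside its range.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([0.1, 0.9, 2.1, 2.9])

# Least-squares solution via the Moore-Penrose pseudoinverse.
x_ls = np.linalg.pinv(A) @ b

# Cross-check against NumPy's dedicated least-squares routine.
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
assert np.allclose(x_ls, x_lstsq)
print(x_ls)
```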
Weighting factors
Next, you want to stir in some weighting factors via the full rank matrix $\mathbf{W}\in\mathbb{C}^{m\times m}_{m}$. You ask how to solve $$ \mathbf{W}\mathbf{A} x = b. $$ The weighting factors will, in general, change the solution, but they do not affect existence or uniqueness. (For proofs, post another question.)
Relabel the system matrix $$ \mathbf{\hat{A}} = \mathbf{W} \mathbf{A} $$ and follow the successful path from before: compute an SVD $\mathbf{\hat{A}} = \mathbf{\hat{U}} \, \hat{\Sigma} \, \mathbf{\hat{V}}^{*}$, form $$ \mathbf{\hat{A}}^{\dagger} = \mathbf{\hat{V}} \, \hat{\Sigma}^{\dagger} \, \mathbf{\hat{U}}^{*}, $$ and pose the least squares solution $$ x_{LS} = \mathbf{\hat{A}}^{\dagger} b. $$
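A minimal sketch of this relabel-and-reuse step, with an invented full-rank diagonal weight matrix $\mathbf{W}$ (the data are made up for illustration):

```python
import numpy as np

# Invented example: overdetermined system A x = b with a full-rank diagonal weight matrix W.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([0.1, 0.9, 2.1, 2.9])
W = np.diag([1.0, 3.0, 3.0, 1.0])   # full rank, so existence/uniqueness is unchanged

# Relabel the system matrix and follow the same pseudoinverse path as before.
A_hat = W @ A
x_ls_weighted = np.linalg.pinv(A_hat) @ b   # least-squares solution of W A x = b

# The weights generally change the solution relative to the unweighted problem.
x_ls_unweighted = np.linalg.pinv(A) @ b
print(x_ls_weighted, x_ls_unweighted)
```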
This post may be of interest: multiplying a linear system by an invertible diagonal matrix
If you have more rows than columns, then all the rows can NOT be linearly independent; but that actually does NOT matter here. You have the right answer in your second paragraph: if you know that all $N$ columns of the coefficient matrix are linearly independent, and if you know that a solution exists, then that solution must be unique, since in this case $$\operatorname{rank}(\text{coefficient matrix})=\operatorname{rank}(\text{augmented matrix})=\text{number of variables}.$$
The extra equations in this case don't affect the number of solutions. They just happen to be extraneous, or redundant, equations. They are linearly dependent on other equations, but that in and of itself does NOT create linear dependency among variables or solutions. Here's a quick example: $$\left\{\begin{align} x&=5 \\ 2x&=10\end{align}\right.$$ You can see that in this case: $$\operatorname{rank}(\text{coefficient matrix})=\operatorname{rank}(\text{augmented matrix})=\text{number of variables}=1,$$ so the system has a unique solution, even though $N=1$ and $L=2$. The second equation is just a redundant one, a multiple of the first equation, and it can be eliminated (it will become a row of zeroes when you do Gauss-Jordan reduction).
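For what it's worth, here is a quick NumPy check of those rank counts, just restating the toy system above in code:

```python
import numpy as np

# Toy system from above: x = 5 and 2x = 10.
coefficient = np.array([[1.0],
                        [2.0]])
augmented = np.array([[1.0,  5.0],
                      [2.0, 10.0]])

# Both ranks equal the number of variables (1), so the solution is unique.
print(np.linalg.matrix_rank(coefficient))  # 1
print(np.linalg.matrix_rank(augmented))    # 1
```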
Also, regarding your statement:
… then the row rank of the matrix can be less than $N$.
The row and column ranks of a matrix are always equal. So if you already know that the column rank of a matrix is $N$, then its row rank can't possibly be less than $N$.
Best Answer
Let $A = U \Sigma V^T$ be an SVD of $A$, i.e., $U$ and $V$ are orthogonal and $\Sigma = \mathop{\rm diag}(\sigma_1, \sigma_2, \dots, \sigma_r, 0, \dots, 0)$ is real diagonal with $\sigma_k > 0$ for all $k=1,\dots,r$. Then
$$A^+ = V \Sigma^+ U^T,$$
with
$$\Sigma^+ = \mathop{\rm diag}(\sigma_1^{-1}, \sigma_2^{-1}, \dots, \sigma_r^{-1}, 0, \dots, 0).$$
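For concreteness, here is a minimal NumPy sketch of this construction (the matrix and vectors are made up for illustration, and everything is kept real). It builds $A^+$ from the SVD exactly as above and then forms the general solution $x = A^{+}b + (I - A^{+}A)w$ asked about in the question:

```python
import numpy as np

# Invented example: a square but singular A (rank 2), so neither full-rank formula applies.
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 1.0, 1.0]])

# SVD: A = U @ diag(s) @ Vt  (real case; use conjugate transposes for complex A).
U, s, Vt = np.linalg.svd(A)

# Invert only the singular values that are numerically nonzero.
tol = 1e-10 * s.max()
s_plus = np.where(s > tol, 1.0 / s, 0.0)

# A^+ = V Sigma^+ U^T, cross-checked against NumPy's pinv with a matching cutoff.
A_plus = Vt.T @ np.diag(s_plus) @ U.T
assert np.allclose(A_plus, np.linalg.pinv(A, rcond=1e-10))

# General solution of A x = b when the system is consistent:
#   x = A^+ b + (I - A^+ A) w,  with w an arbitrary vector of free parameters.
b = A @ np.array([1.0, 0.0, 1.0])   # consistent right-hand side by construction
w = np.array([0.3, -1.2, 0.5])      # arbitrary free parameters
x = A_plus @ b + (np.eye(3) - A_plus @ A) @ w
assert np.allclose(A @ x, b)
print(x)
```

Varying $w$ sweeps out every solution of the consistent system; taking $w = 0$ gives the minimum-norm solution $A^{+}b$.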