Thanks to everyone who kindly answered or commented on my question; it was helpful.
I found two ways to solve my problem.
1. The RV coefficient
Take each column of a matrix as an independent realization of a random vector. So, if I want to compare the matrices $A_1 \in \mathbb{R}^{n\times k}$ and $A_2 \in \mathbb{R}^{m\times k}$, with $m,n \in \mathbb{N}^+$, I turn the problem into measuring the dependence between two random vectors $\mathbf{a}_1 \in \mathbb{R}^n$ and $\mathbf{a}_2 \in \mathbb{R}^m$: the columns of $A_1$ and $A_2$ are $k$ independent realizations of these random vectors and are assumed to be centered.
The RV coefficient is defined as follows:
$$ RV(X,Y)=\frac{\operatorname{tr}(XX'YY')}{\sqrt{\operatorname{tr}\big((XX')^{2}\big)\operatorname{tr}\big((YY')^{2}\big)}} $$
Substituting $X = A_1'$ and $Y = A_2'$ then gives the linear dependence between the two matrices.
However, this coefficient only measures the linear dependence of two random vectors, so even if it equals zero, you can only conclude that the two vectors have no linear relationship with each other.
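For concreteness, here is a minimal NumPy sketch of this computation under the conventions above (columns are centered realizations; the function name `rv_coefficient` is just illustrative):

```python
import numpy as np

def rv_coefficient(A1, A2):
    """RV coefficient of A1 (n x k) and A2 (m x k), whose columns are
    k centered realizations of the two random vectors."""
    X, Y = A1.T, A2.T                # substitute X = A1', Y = A2'
    XXt, YYt = X @ X.T, Y @ Y.T      # k x k Gram matrices
    num = np.trace(XXt @ YYt)
    den = np.sqrt(np.trace(XXt @ XXt) * np.trace(YYt @ YYt))
    return num / den
```

Note that $XX'$ and $YY'$ are both $k \times k$, so the trace in the numerator is well defined even when $n \neq m$.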
2. The dCov coefficient
This coefficient can be applied to two matrices whose random vectors have different dimensions; only the number of realizations has to match, because everything is computed from the pairwise distances within each sample. The empirical distance covariance is defined as (here $n$ denotes the number of realizations, i.e. $k$ in the notation above):
$$ dCov_n^{2}(X,Y)=\frac{1}{n^{2}} \sum_{i,j=1}^{n} (d_{ij}^X-d_{i.}^{X}-d_{.j}^{X}+d_{..}^{X})(d_{ij}^Y-d_{i.}^{Y}-d_{.j}^{Y}+d_{..}^{Y}) $$
where $d_{ij}^{X}$ is the Euclidean distance between realizations $i$ and $j$ of $X$ (and likewise for $Y$), $d_{i.}= \frac{1}{n}\sum_{j=1}^{n}d_{ij}$, $d_{.j}= \frac{1}{n}\sum_{i=1}^{n}d_{ij}$, and $d_{..}= \frac{1}{n^2}\sum_{i,j=1}^{n}d_{ij}$.
The empirical distance correlation:
$$dCor_n^{2}(X,Y)=\frac{dCov_n^{2}(X,Y)}{\sqrt{dCov_n^{2}(X,X)dCov_n^{2}(Y,Y)}}$$
I used $dCor_n^{2}$ to measure the similarity, and it works better than the Euclidean distance even in the case where the matrices have the same size.
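A direct NumPy implementation of these formulas might look like the following sketch (samples are passed as rows, so for the matrices above you would call it on the transposes):

```python
import numpy as np

def _double_centered_dists(S):
    """n x n Euclidean distance matrix of the n row-samples of S,
    double-centered as in the definition above."""
    d = np.linalg.norm(S[:, None, :] - S[None, :, :], axis=-1)
    return d - d.mean(axis=1, keepdims=True) - d.mean(axis=0, keepdims=True) + d.mean()

def dcov2(X, Y):
    """Empirical squared distance covariance of paired samples X (n x p), Y (n x q)."""
    return (_double_centered_dists(X) * _double_centered_dists(Y)).mean()

def dcor2(X, Y):
    """Empirical squared distance correlation."""
    denom = np.sqrt(dcov2(X, X) * dcov2(Y, Y))
    return dcov2(X, Y) / denom if denom > 0 else 0.0
```

For the matrices above, `dcor2(A1.T, A2.T)` works even when $n \neq m$, since only the number of realizations has to match.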
References:
Josse, J. and Holmes, S. (2013). Measures of dependence between random vectors and tests of independence. Literature review. arXiv preprint arXiv:1307.7383. http://arxiv.org/abs/1307.7383
Székely, G. J., Rizzo, M. L., and Bakirov, N. K. (2007). Measuring and testing dependence by correlation of distances. The Annals of Statistics, 35(6), 2769–2794.
Every diagonalizable matrix $N$ on a finite-dimensional vector space $X$ is normal with respect to some inner product on $X$. Indeed, if $B=\{ b_1,b_2,\cdots,b_n \}$ is a basis for $X$ such that $Nb_k=\lambda_k b_k$, then you can define an inner product $\langle \cdot,\cdot\rangle_{B}$ by
$$
\langle \alpha_1b_1+\cdots+\alpha_n b_n,\beta_1 b_1+\cdots+\beta_n b_n\rangle_{B}=\sum_{j=1}^{n} \alpha_j\beta_j^*,
$$
and the adjoint $N^*$ of $N$ with respect to this inner product is given by
$$
N^*(\alpha_1 b_1+\cdots+\alpha_n b_n)=\lambda_1^{*}\alpha_1 b_1+\cdots+\lambda_n^{*}\alpha_n b_n,
$$
that is, $N^* b_k=\lambda_k^{*} b_k$.
So $N^*N=NN^*$, because $N$ and $N^*$ share eigenvectors. Furthermore, the basis of eigenvectors is orthonormal with respect to $\langle\cdot,\cdot\rangle_B$. So normality is neither more nor less general than diagonalizability; it is a matter of choosing the right inner product to make a diagonalizable $N$ normal.
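To make the construction concrete, here is a small NumPy check on a hypothetical $2\times 2$ example: if the columns of $P$ hold the eigenbasis $B$, then $\langle\cdot,\cdot\rangle_B$ corresponds to the Gram matrix $M=P^{-\mathsf H}P^{-1}$ in standard coordinates, and the adjoint becomes $N^{*}=M^{-1}N^{\mathsf H}M$:

```python
import numpy as np

# Diagonalizable but not normal in the standard inner product.
N = np.array([[1.0, 1.0],
              [0.0, 2.0]])
_, P = np.linalg.eig(N)            # columns of P form the eigenbasis B

Pinv = np.linalg.inv(P)
M = Pinv.conj().T @ Pinv           # Gram matrix that makes B orthonormal

# Adjoint of N with respect to the inner product encoded by M.
N_star = np.linalg.solve(M, N.conj().T @ M)

print(np.allclose(N_star @ N, N @ N_star))          # True: normal for <.,.>_B
print(np.allclose(N.conj().T @ N, N @ N.conj().T))  # False: not normal in the usual sense
```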
Best Answer
The Total Least Squares technique is what you need. If the points lie almost in a single plane (in any dimension), that method will reveal it and give you a measure of the noise.
But if the points fall into multiple planes, you will need more sophisticated methods, ranging from the SVD and low-rank approximation to other techniques.
In general, the problem falls into the category of point-cloud analysis.
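As an illustration of the single-plane case, here is a short NumPy sketch of a total-least-squares hyperplane fit via the SVD (`tls_plane_fit` is just an illustrative name):

```python
import numpy as np

def tls_plane_fit(points):
    """Total least squares fit of a hyperplane to points (n x d).

    The plane passes through the centroid; its normal is the right
    singular vector of the centered data with the smallest singular
    value, and the RMS out-of-plane distance measures the noise."""
    centroid = points.mean(axis=0)
    _, s, Vt = np.linalg.svd(points - centroid, full_matrices=False)
    normal = Vt[-1]                        # direction of least variance
    noise = s[-1] / np.sqrt(len(points))   # RMS distance to the fitted plane
    return centroid, normal, noise
```

A small `noise` relative to the leading singular values indicates that the points are nearly layered into one plane.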