[Math] How to project a vector onto a very large, non-orthogonal subspace

linear-algebra, na.numerical-analysis

I have a difficult problem.

I have a very large, non-orthogonal matrix $A$ and need to project the vector $y$ onto the subspace spanned by the columns of $A$. If $A$ were small, I would use Gram-Schmidt or just compute $A(A^TA)^{-1}A^Ty$. Unfortunately, $A$ is far too big for either to finish in a feasible amount of time. Is there a good trick for doing this projection?
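For concreteness, here is a minimal sketch of the small-matrix version in Python/NumPy (the sizes are made up; $A$ and $y$ are the names from the question). The projection $A(A^TA)^{-1}A^Ty$ can equivalently be computed as $A\hat{x}$, where $\hat{x}$ is the least-squares solution of $Ax \approx y$, which avoids forming $A^TA$:

```python
# Minimal sketch of the small-matrix projection, for reference only.
# Shapes are made up for illustration.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((200, 50))   # tall, non-orthogonal matrix
y = rng.standard_normal(200)

# Direct formula A (A^T A)^{-1} A^T y -- fine for small A, but it forms the
# normal equations, which squares the condition number of A.
proj_direct = A @ np.linalg.solve(A.T @ A, A.T @ y)

# Equivalent and more stable: the projection is A x*, where x* minimizes
# ||A x - y||_2.
x_star, *_ = np.linalg.lstsq(A, y, rcond=None)
proj_lstsq = A @ x_star

print(np.allclose(proj_direct, proj_lstsq))  # True (up to rounding)
```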

EDIT2: Some background info: I work in signal processing, with Fusion Frames in particular. For those unfamiliar, this is the problem of representing a signal as a union of subspaces drawn from an overcomplete dictionary, typically using a block-sparse solver (in my case, a program called SPGL1). The problem is that I need to project my signal vector onto this dictionary before performing the inversion. The dictionary is coherent, so under normal circumstances I would need the formula above or a method like Gram-Schmidt to get an orthonormal basis.

Right now, my working solution is to approximate this projection with a truncated SVD, using the $U$ matrix as an approximate orthonormal basis (a la PCA). But it would be nice if anyone knew of any iterative projection algorithms. I know this isn't a scientific computing site, but I figured there may be someone here who knows some linear algebra tricks.
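For reference, a rough sketch of that truncated-SVD approximation (Python/SciPy; the shapes and the rank $k$ are made-up placeholders, and `scipy.sparse.linalg.svds` is just one way to get the leading singular vectors):

```python
# Rough sketch of the truncated-SVD approximation described above.
# Shapes and rank k are placeholders, not values from the question.
import numpy as np
from scipy.sparse.linalg import svds

rng = np.random.default_rng(1)
A = rng.standard_normal((5000, 800))
y = rng.standard_normal(5000)

k = 100                      # number of singular vectors to keep
U, s, Vt = svds(A, k=k)      # U is 5000 x k with orthonormal columns

# Approximate projection onto range(A): project onto the span of U instead.
proj_approx = U @ (U.T @ y)
```

Since the columns of $U$ are orthonormal, $UU^Ty$ is an exact orthogonal projection onto their span; the approximation error comes entirely from truncating the SVD at rank $k$.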

Best Answer

Depending on how many rows you have, you might be able to use some results from statistics.

When I was doing Principal Component Analysis I ended up with a matrix $X_{75\times 375000}$, and I could still find a vector space that described these vectors (and their projections) well.

That's the whole point of Principal Component Analysis: when the amount of information is just too much, you can get rid of some redundancy and reduce the dimension (really, in my case it was a 75-dimensional space despite that huge matrix).
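For what it's worth, a small sketch of that kind of PCA computation (NumPy; shapes are shrunk and made up, and rows are treated as observations). With only 75 rows, the centered data lie in a subspace of dimension at most 75, so the thin SVD stays cheap no matter how many columns there are:

```python
# Small sketch of PCA on a short-and-wide matrix like the 75 x 375000 case.
# Shapes here are shrunk placeholders; rows are the observations.
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((75, 3750))      # 75 observations, many features

Xc = X - X.mean(axis=0)                  # center each feature

# Thin SVD: U is 75 x 75, Vt is 75 x 3750, so the cost scales only
# linearly with the number of columns.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

k = 10                                   # keep the top-k principal components
scores = U[:, :k] * s[:k]                # coordinates of each row in PC space
components = Vt[:k]                      # principal directions (k x 3750)
```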

Maybe these articles can give you the basic idea:

http://www.cs.otago.ac.nz/cosc453/student_tutorials/principal_components.pdf

http://www.stat.cmu.edu/~cshalizi/490/pca/pca-handout.pdf

It's a well-known algorithm, used mainly in computer vision (Eigenfaces). It may not be useful at all here, since you want the span of these vectors and this is an approximation method; still, I always consider it when gigantic vectors need to be analyzed.
