Can someone explain to me the difference between SVD of numpy and scipy for Multidimensional arrays (Tensors)?
import numpy as np
import scipy.linalg

X = np.random.randn(3, 3, 3)
S1 = np.linalg.svd(X)
S2 = scipy.linalg.svd(X)
S1 here is a tuple containing U of shape 3×3×3, Sigma of shape 3×3, and Vh of shape 3×3×3. But the S2 line throws an error saying 'expected matrix'. So I use reshape to unfold the tensor and compute the SVD using scipy.linalg as follows:
Xreshape = np.reshape(X, (9, 3))
S2 = scipy.linalg.svd(Xreshape)
Now S2 is a tuple containing U of shape 9×9, Sigma of length 3, and Vh of shape 3×3.
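For reference, a minimal sketch reproducing the shapes described above for the reshaped matrix (the seed is my addition, for reproducibility):

```python
import numpy as np
import scipy.linalg

np.random.seed(0)  # assumed seed, not in the original
X = np.random.randn(3, 3, 3)
Xreshape = np.reshape(X, (9, 3))

# SVD of a 9x3 matrix: full U is 9x9, s has one value per column, Vh is 3x3
U, s, Vh = scipy.linalg.svd(Xreshape)
print(U.shape, s.shape, Vh.shape)  # (9, 9) (3,) (3, 3)
```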
The elements of S1 and S2 are not the same. Could someone explain to me the theory behind it?
Best Answer
It's easy to explain what's going on for the scipy svd method: scipy.linalg.svd does not accept arrays of dimension 3 or higher as input. If you feed in a matrix, it yields the SVD of that matrix.
The real question, then, is what does the numpy svd method do with the input? The answer is explained in the documentation, but the explanation is a bit terse. In general (for $k \geq 3$), numpy interprets a shape $n_1 \times n_2 \times \cdots \times n_k$ array as an $n_1 \times \cdots \times n_{k-2}$ array of $n_{k-1} \times n_k$ matrices and it computes the svd of each of these matrices separately.
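This "stacked matrices" interpretation is easy to see from the output shapes. A small sketch (the particular array shape here is my own choice for illustration):

```python
import numpy as np

# A 2x5 "stack" of 4x3 matrices
A = np.random.randn(2, 5, 4, 3)
U, s, Vh = np.linalg.svd(A)

# numpy computes one SVD per trailing 4x3 matrix,
# so the leading (2, 5) axes are preserved in every output:
print(U.shape)   # (2, 5, 4, 4)
print(s.shape)   # (2, 5, 3)
print(Vh.shape)  # (2, 5, 3, 3)
```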
For your particular case, numpy interprets $X$ as a list of $3$ matrices. Let $X_i$ denote the slice `X[i,:,:]`. If `U,s,Vh` denote the components of the output of `numpy.linalg.svd(X)`, then they contain an SVD for each of the matrices $X_0$, $X_1$, and $X_2$. In particular, the matrix product `U[i,:,:] @ numpy.diag(s[i,:]) @ Vh[i,:,:]` should be (close to being) equal to $X_i$ for $i = 0,1,2$.
You should be able to verify this with `numpy.allclose(X[i,:,:], U[i,:,:] @ numpy.diag(s[i,:]) @ Vh[i,:,:])`, which should return `True`.
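A self-contained check of this claim (a minimal sketch; looping over all three slices is my addition):

```python
import numpy as np

X = np.random.randn(3, 3, 3)
U, s, Vh = np.linalg.svd(X)

# Reassemble each slice from its own SVD and compare to the original
ok = all(
    np.allclose(X[i, :, :], U[i, :, :] @ np.diag(s[i, :]) @ Vh[i, :, :])
    for i in range(3)
)
print(ok)  # True
```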
As the documentation notes, the product `U[i,:,:] @ numpy.diag(s[i,:])` can be computed more efficiently as `U[i,:,:] * s[i,None,:]`, but this is fairly confusing unless you're comfortable with numpy's broadcasting rules for array multiplication.
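A quick sketch confirming that the broadcast form matches the explicit diagonal-matrix product (checking only slice `i = 0` here, my own choice):

```python
import numpy as np

X = np.random.randn(3, 3, 3)
U, s, Vh = np.linalg.svd(X)

i = 0
a = U[i, :, :] @ np.diag(s[i, :])  # explicit diagonal matrix on the right
b = U[i, :, :] * s[i, None, :]     # broadcasting: column j scaled by s[i, j]
print(np.allclose(a, b))  # True
```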