Can someone explain to me the difference between SVD of numpy and scipy for Multidimensional arrays (Tensors)?
import numpy as np
import scipy.linalg

X = np.random.randn(3, 3, 3)
S1 = np.linalg.svd(X)
S2 = scipy.linalg.svd(X)
S1 here is a tuple containing U of shape 3×3×3, Sigma of shape 3×3, and Vh of shape 3×3×3. But the S2 line throws an error saying 'expected matrix'. So I use reshape to unfold the tensor and compute the SVD using scipy.linalg as follows:
Xreshape = np.reshape(X, (9, 3))
S2 = scipy.linalg.svd(Xreshape)
Now S2 is a tuple containing U of shape 9×9, Sigma of length 3, and Vh of shape 3×3.
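For reference, a minimal sketch reproducing the shapes described above for the reshaped matrix (the seed is my addition, for reproducibility):

```python
import numpy as np
import scipy.linalg

np.random.seed(0)  # assumed seed, not in the original
X = np.random.randn(3, 3, 3)
Xreshape = np.reshape(X, (9, 3))

# SVD of a 9x3 matrix: full U is 9x9, s has one value per column, Vh is 3x3
U, s, Vh = scipy.linalg.svd(Xreshape)
print(U.shape, s.shape, Vh.shape)  # (9, 9) (3,) (3, 3)
```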
The elements of S1 and S2 are not the same. Could someone explain to me the theory behind it?
Best Answer
It's easy to explain what's going on for the scipy svd method: scipy.linalg.svd does not accept arrays of dimension 3 or higher as input. If you feed in a matrix, it yields the SVD of that matrix.
The real question, then, is what does the numpy svd method do with the input? The answer is explained in the documentation, but the explanation is a bit terse. In general (for $k \geq 3$), numpy interprets a shape $n_1 \times n_2 \times \cdots \times n_k$ array as an $n_1 \times \cdots \times n_{k-2}$ array of $n_{k-1} \times n_k$ matrices and it computes the svd of each of these matrices separately.
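This "stacked matrices" interpretation is easy to see from the output shapes. A small sketch (the particular array shape here is my own choice for illustration):

```python
import numpy as np

# A 2x5 "stack" of 4x3 matrices
A = np.random.randn(2, 5, 4, 3)
U, s, Vh = np.linalg.svd(A)

# numpy computes one SVD per trailing 4x3 matrix,
# so the leading (2, 5) axes are preserved in every output:
print(U.shape)   # (2, 5, 4, 4)
print(s.shape)   # (2, 5, 3)
print(Vh.shape)  # (2, 5, 3, 3)
```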
For your particular case, numpy interprets $X$ as a list of $3$ matrices. Let $X_i$ denote the slice `X[i,:,:]`. If `U,s,Vh` denote the components of the output of `numpy.linalg.svd(X)`, then they contain an SVD for each of the matrices $X_0$, $X_1$, and $X_2$. In particular, the matrix product `U[i,:,:] @ numpy.diag(s[i,:]) @ Vh[i,:,:]` should be (close to being) equal to $X_i$ for $i = 0,1,2$.
You should be able to verify this with `numpy.allclose(X[i,:,:], U[i,:,:] @ numpy.diag(s[i,:]) @ Vh[i,:,:])`, which should return `True`.
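A self-contained check of this claim (a minimal sketch; looping over all three slices is my addition):

```python
import numpy as np

X = np.random.randn(3, 3, 3)
U, s, Vh = np.linalg.svd(X)

# Reassemble each slice from its own SVD and compare to the original
ok = all(
    np.allclose(X[i, :, :], U[i, :, :] @ np.diag(s[i, :]) @ Vh[i, :, :])
    for i in range(3)
)
print(ok)  # True
```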
As the documentation notes, the product `U[i,:,:] @ numpy.diag(s[i,:])` can be computed more efficiently as `U[i,:,:] * s[i,None,:]`, but this is fairly confusing unless you're comfortable with numpy's broadcasting rules for array multiplication.
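A quick sketch confirming that the broadcast form matches the explicit diagonal-matrix product (checking only slice `i = 0` here, my own choice):

```python
import numpy as np

X = np.random.randn(3, 3, 3)
U, s, Vh = np.linalg.svd(X)

i = 0
a = U[i, :, :] @ np.diag(s[i, :])  # explicit diagonal matrix on the right
b = U[i, :, :] * s[i, None, :]     # broadcasting: column j scaled by s[i, j]
print(np.allclose(a, b))  # True
```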