MATLAB: When using NNMF, how are the new start and best start values calculated in MATLAB 8.1 (R2013a)?

Statistics and Machine Learning Toolbox

When using the NNMF function I can define a number of replicates with new random values for W and H.

– How are the new start values defined? Are they random values or do they depend on the start values defined in the step before?

– How are the best start values calculated?

Best Answer

The start values are defined on lines 245-254 in nnmf.m (R2013a) :

         if( ~isempty(w0) && iter ==1 ) 
             whtry{3} = w0;
         else
             whtry{3} = rand(S,n,k); 
         end
         if( ~isempty(h0) && iter ==1 )
             whtry{4} = h0;
         else
             whtry{4} = rand(S,k,m); 
         end

If the user does not pass the initial estimates w0 and h0, they are generated at random from the random stream S. If the user passes initial estimates, they are used only for the first replicate (iter == 1); every later replicate starts from fresh random values, which do not depend on the start values or results of the previous replicate.

By default the 'replicates' parameter is set to 1. In that case, NNMF attempts only one factorization. Note that iter in the code above counts replicates, not the steps of the factorization algorithm within a replicate.
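The replicate behaviour described above can be emulated by hand. The following is a minimal sketch, not the toolbox's own code: it assumes nnmf from the Statistics Toolbox is on the path, and the test matrix A, the rank k, and the names reps, bestW, bestH and bestD are made up for illustration. Each pass draws fresh random start values, runs a single factorization from them, and keeps the result with the smallest residual D.

```matlab
% Sketch only: emulate replicates with an explicit loop (assumes nnmf is available).
rng(0);                          % fix the seed so the run is repeatable
A = abs(randn(20,10));           % hypothetical nonnegative test matrix
k = 3;                           % hypothetical rank
[n,m] = size(A);
reps = 5;                        % hypothetical number of replicates

bestD = Inf;
for r = 1:reps
    w0 = rand(n,k);              % fresh random start, independent of earlier passes
    h0 = rand(k,m);
    [W,H,D] = nnmf(A,k,'w0',w0,'h0',h0);   % one factorization from this start
    if D < bestD                 % keep the factorization with the smallest residual
        bestD = D;
        bestW = W;
        bestH = H;
    end
end
```

In practice a similar effect is obtained with nnmf(A,k,'replicates',reps); the explicit loop only makes the start values visible.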

Note also the special case on lines 128-136. Below, a and k are the first two inputs to NNMF, and [n,m] = size(a).

         if isempty(w0) && isempty(h0)
             if k==m
                 % a = a*eye(k) is already an exact factorization
                 w0 = a;
                 h0 = eye(k);
             elseif k==n
                 % likewise a = eye(k)*a
                 w0 = eye(k);
                 h0 = a;
             end
         end
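The special case works because multiplying by an identity matrix reproduces a exactly, so these start values already give a zero residual. A quick check in plain MATLAB, using a small made-up matrix (no toolbox functions needed; resKM and resKN are hypothetical names):

```matlab
a = [1 2 3; 4 5 6; 7 8 9; 10 11 12];   % hypothetical 4-by-3 nonnegative input
[n,m] = size(a);
k = m;                                  % requested rank equals the number of columns

w0 = a;                                 % start values from the k==m branch
h0 = eye(k);
resKM = norm(a - w0*h0,'fro');          % residual of the trivial factorization

% The k==n branch is the mirror image: identity on the left, data on the right.
b = a';                                 % 3-by-4, so k equals the number of rows
resKN = norm(b - eye(k)*b,'fro');
```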

The best factorization is the one with the smallest residual among all attempted replicates. According to the documentation, the factors W and H are chosen to minimize the root-mean-squared residual D between A and W*H, where A is N-by-M:

D = norm(A-W*H,'fro')/sqrt(N*M)

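For reference, a root-mean-squared residual can be computed straight from its definition. The sketch below uses made-up factors W and H (chosen by hand, not output of nnmf) and shows that the square root of the mean squared entry of A-W*H agrees with the Frobenius norm divided by sqrt(N*M):

```matlab
A = [4 2; 1 3; 5 6];                    % hypothetical 3-by-2 data matrix
W = [1 0; 0 1; 1 1];                    % hypothetical factors, chosen by hand
H = [3 2; 1 4];
[N,M] = size(A);

R = A - W*H;                            % residual matrix
dRms = sqrt(sum(R(:).^2)/(N*M));        % RMS straight from the definition
dFro = norm(R,'fro')/sqrt(N*M);         % same quantity via the Frobenius norm
```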
Related Solutions

MATLAB: Nnmf function usage problem

Most Statistics Toolbox functions aren't written to operate on integer data types. Try nnmf(double(resim),100), which converts the data to double precision before factorizing. You may be able to use single() instead if double() needs too much memory.
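A minimal sketch of the conversion, using a made-up uint8 matrix in place of resim (the real image and the rank 100 from the question are not reproduced here; Xd, Xs and ratio are hypothetical names):

```matlab
resim = uint8([0 64 128; 255 32 16]);   % hypothetical stand-in for the image data
Xd = double(resim);                     % double precision, 8 bytes per element
Xs = single(resim);                     % single precision, 4 bytes per element

infoD = whos('Xd');                     % inspect the memory used by each copy
infoS = whos('Xs');
ratio = infoD.bytes / infoS.bytes;      % double takes twice the memory of single
```

Xd can then be passed to nnmf in place of the integer matrix; whether Xs works as well depends on the function accepting single input.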

MATLAB: Error using * MTIMES is not fully supported for integer classes. At least one input must be scalar

You use

for W0 = rand (m,r);

A for loop iterates over the columns of whatever is on the right-hand side, so W0 is set to each m-by-1 column of rand(m,r) in turn, and the loop body runs r times.

However, you never use W0 in your code, so the overall effect is the same as a loop that simply runs r times.
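The column-wise behaviour of for is easy to confirm with a small sketch (the sizes and the names count and colSize are made up for illustration):

```matlab
m = 3; r = 4;                 % hypothetical sizes
count = 0;
for W0 = rand(m,r)            % W0 takes each column of the m-by-r matrix in turn
    count = count + 1;        % count the number of passes
    colSize = size(W0);       % every pass sees an m-by-1 column
end
```

After the loop, count equals r and colSize is [m 1].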

On the next line you have

H0 = h0.* (V*w0')./((w0'*w0)*h0 + (10^-9));

This makes use of a bunch of variables that we as readers do not know the definition of. Your variable w0 there is not the same as W0 because MATLAB is case sensitive.

We know that V is either 2- or 3-dimensional and has an integer data type, because it was read in by imread(). You use it in a * operation with w0. From context, w0 is intended to be a vector or a 2D array. If w0 is not a scalar, then you are multiplying an integer array by a vector or array. But the * operation is not fully implemented for integer data classes; if one operand is an integer class, the only permitted combinations are:

integer scalar * integer scalar
integer scalar * integer array
integer scalar * double scalar
integer scalar * double array
integer array * integer scalar
integer array * double scalar
double scalar * integer scalar
double scalar * integer array
double array * integer scalar

Put another way, when one side of the * is a non-scalar integer array, the other side must be a scalar integer or a scalar double; a non-scalar integer array cannot be multiplied by a non-scalar array of any kind.

The easiest fix:

V = double( imread('circuit.tif') );
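The rule can be checked on a small made-up uint8 matrix (a stand-in for the image, so no image file is needed). This is only a sketch of the behaviour described above, with hypothetical variable names:

```matlab
V = uint8([1 2; 3 4]);        % hypothetical stand-in for the imread() result
w = [0.5 0.5; 1 1];           % hypothetical non-scalar double factor

scaled = V * 2;               % integer array * double scalar: permitted

try
    bad = V * w;              % integer array * double array: not permitted
    failed = false;
catch err
    failed = true;            % MTIMES raises an error for this combination
end

fixed = double(V) * w;        % converting to double first makes * work
```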
