MATLAB: When using NNMF, how are the new start and best start values calculated in MATLAB 8.1 (R2013a)?

Statistics and Machine Learning Toolbox

When using the NNMF function I can define a number of replicates with new random values for W and H.
– How are the new start values defined? Are they random values or do they depend on the start values defined in the step before?
– How are the best start values calculated?

Best Answer

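The start values are defined on lines 245-254 in nnmf.m (R2013a):
% whtry{3} holds the start value for W, whtry{4} the start value for H;
% S is the random number stream passed to rand.
if( ~isempty(w0) && iter ==1 )
    whtry{3} = w0;
else
    whtry{3} = rand(S,n,k);
end
if( ~isempty(h0) && iter ==1 )
    whtry{4} = h0;
else
    whtry{4} = rand(S,k,m);
end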
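If the user does not pass the initial estimates w0 and h0, they are generated at random. If the user passes initial estimates, they are used only for the first replicate (iter == 1 in the code above); every later replicate starts from fresh random values drawn with rand, which do not depend on the start values of the previous replicate. Note that w0 and h0 are handled separately, so if only one of them is passed, the other is still generated at random.
By default the 'replicates' parameter is set to 1. In that case, NNMF attempts only one factorization, that is, a single replicate is run.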
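The effect of 'replicates' together with user-supplied start values can be tried out with a short script. This is only a sketch: the matrix A, its size, the rank k, and the seed are arbitrary choices made for illustration and are not taken from nnmf.m.

```matlab
rng(0);                          % make the random start values repeatable
A  = rand(20,10);                % nonnegative test data, n = 20, m = 10
k  = 3;                          % requested rank (arbitrary for this sketch)
W0 = rand(20,k);                 % user-supplied start value for W
H0 = rand(k,10);                 % user-supplied start value for H

% Default 'replicates' = 1: only the supplied W0 and H0 are tried.
[~,~,D1] = nnmf(A,k,'w0',W0,'h0',H0);

% Five replicates: the first starts from W0 and H0, the remaining four
% start from fresh random matrices.
[~,~,D5] = nnmf(A,k,'w0',W0,'h0',H0,'replicates',5);

fprintf('1 replicate: D = %.6f, 5 replicates: D = %.6f\n', D1, D5);
```

Since the first of the five replicates starts from the same W0 and H0 as the single-replicate call, D5 should not be larger than D1, assuming the algorithm is deterministic for given start values.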
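Note also the special case on lines 128-136, which applies when the user passes neither w0 nor h0: if k equals m or n, an exact factorization is known (a = a*eye(k) or a = eye(k)*a) and is used as the start value for the first replicate. Below, a and k are the first two inputs to NNMF, and [n,m] = size(a).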
if isempty(w0) && isempty(h0)
    if k==m
        w0 = a;
        h0 = eye(k);
    elseif k==n
        w0 = eye(k);
        h0 = a;
    end
end
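The special case can be observed from the outside by asking for k equal to the number of columns of A. The following is a minimal sketch with an arbitrary 6-by-3 test matrix; it assumes that the start values w0 = a, h0 = eye(k) are already an exact factorization, so the returned residual should be zero up to rounding.

```matlab
rng(1);                      % repeatable test data
A = rand(6,3);               % n = 6, m = 3
k = 3;                       % k == m, so the special case applies

[W,H,D] = nnmf(A,k);         % neither w0 nor h0 is supplied

% W*H should reproduce A up to rounding; H is expected to be close to
% an identity matrix, possibly with its rows reordered.
fprintf('D = %.3g, ||A - W*H||_F = %.3g\n', D, norm(A - W*H,'fro'));
disp(H)
```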
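Among the attempted replicates, the factorization with the smallest residual is returned. Quoting the doc:
"The factors W and H are chosen to minimize the root-mean-squared residual D between A and W*H: D = sqrt(norm(A-W*H,'fro')/(N*M))"
Note that, as printed, this expression is not a root mean square of the entries of A-W*H. The root-mean-squared residual is norm(A-W*H,'fro')/sqrt(N*M), which equals sqrt(norm(A-W*H,'fro')^2/(N*M)).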
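The residual can also be recomputed by hand from the returned factors. The sketch below assumes that D is the root mean square of the entries of A-W*H; the test matrix, rank, and seed are arbitrary illustration values.

```matlab
rng(2);                              % repeatable test data
A = rand(15,8);                      % nonnegative test matrix
k = 2;                               % requested rank (arbitrary)

[W,H,D] = nnmf(A,k);                 % D is the residual reported by nnmf

[n,m] = size(A);
R     = A - W*H;                     % elementwise residual matrix
Drms  = sqrt(sum(R(:).^2)/(n*m));    % root mean square of all entries of R
Dfro  = norm(R,'fro')/sqrt(n*m);     % same quantity via the Frobenius norm

fprintf('nnmf D = %.12f\nRMS    = %.12f\nfro    = %.12f\n', D, Drms, Dfro);
```

If the assumption holds, all three printed values agree up to rounding.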