Improving the consistency of the NNMF function
I'm attempting to use non-negative matrix factorization on a matrix containing spectral information (A). Whenever I run the nnmf function, the output matrices W and H usually differ from those of previous runs. I have found that this is stated in the help documentation for the nnmf function:
"Because the root-mean-squared residual D may have local minima, repeated factorizations may yield different W and H."
However, as a result of this, I find it difficult to use this method to say anything scientifically meaningful, as it introduces considerable bias on my part (I can effectively run the function repeatedly until I arrive at a result that fits my narrative).
My question: how can I get the nnmf function to return W and H matrices with higher reproducibility thereby improving my confidence in the method? I've tried tweaking the input options by decreasing the tolerances, increasing the number of replicates in the initial run, and increasing the number of iterations, all with little effect.
My code is currently very similar to what is written in the help documentation and looks like this:
numcom = 2; % The rank. My datasets can typically be described by very low-rank approximations
opt = statset('MaxIter', 10, 'Display', 'final');
[W0,H0] = nnmf(A, numcom, 'replicates', 10, 'options', opt, 'algorithm', 'mult'); % Get starting values
opt = statset('MaxIter', 1000, 'Display', 'final');
[W,H] = nnmf(A, numcom, 'w0', W0, 'h0', H0, 'options', opt, 'algorithm', 'als'); % Refine from W0 and H0
Of course, I can set the random number generator to default before running the function every time:
rng('default')
But that kind of defeats the purpose ;)
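One way to check whether two runs really disagree, or merely return the same components in a different order or scale, is to compare the rows of H by cosine similarity. The following is only a minimal sketch: A here is a synthetic rank-2 stand-in (not the spectral data from the question), and the seeds 11 and 22 and the replicate count are arbitrary assumptions.

```matlab
% Synthetic stand-in for the spectral matrix A (assumption, not real data)
rng(1);
A = rand(40, 2) * rand(2, 60);
numcom = 2;
opt = statset('MaxIter', 1000);

% Two independent factorizations started from different seeds
rng(11);
[~, H1] = nnmf(A, numcom, 'replicates', 20, 'options', opt);
rng(22);
[~, H2] = nnmf(A, numcom, 'replicates', 20, 'options', opt);

% Scale every row of H to unit length so the comparison ignores scaling
H1n = H1 ./ sqrt(sum(H1.^2, 2));
H2n = H2 ./ sqrt(sum(H2.^2, 2));

% Cosine similarity between every component of run 1 and every component of run 2
C = H1n * H2n';

% Best-matching component in run 2 for each component of run 1
bestMatch = max(C, [], 2);
disp(bestMatch)
```

Values of bestMatch close to 1 would suggest that each component of the first run has a near-identical counterpart in the second, even if the ordering differs.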
Answers (1)
Jakub
on 19 Aug 2019
In my experience, using only the 'als' algorithm with many replicates usually gives a better estimate. So something like this:
opt = statset('Maxiter',100,'Display','final','useparallel',true);
[coeff,score] = nnmf(A, numcom,'replicates',1e6,'options',opt);
1 Comment
Guy Reading
on 12 Nov 2019
Agreed. With enough replicates, the space will hopefully be explored adequately, so that the global minimum is found and returned repeatably each time. How many is enough? That depends on how large your input space (m) is...
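A rough way to judge "how many is enough" is to repeat the factorization from several seeds for each replicate count and watch how much the residual D (the third output of nnmf) varies. The sketch below assumes a synthetic rank-2 matrix in place of the real data; the replicate counts and the number of seeds are arbitrary choices for illustration.

```matlab
% Synthetic stand-in for A (assumption, not the poster's spectra)
rng(1);
A = rand(40, 2) * rand(2, 60);
numcom = 2;
opt = statset('MaxIter', 100);

repCounts = [1 5 25];   % replicate counts to try (arbitrary)
nSeeds = 5;             % independent repeats per replicate count (arbitrary)
D = zeros(numel(repCounts), nSeeds);

for i = 1:numel(repCounts)
    for s = 1:nSeeds
        rng(s);  % a different, recorded seed for each repeat
        [~, ~, D(i, s)] = nnmf(A, numcom, 'replicates', repCounts(i), 'options', opt);
    end
end

% Spread of the residual across seeds, one value per replicate count
spread = max(D, [], 2) - min(D, [], 2);
disp(table(repCounts(:), spread, 'VariableNames', {'Replicates', 'SpreadOfD'}))
```

If the spread stops shrinking as the replicate count grows, adding more replicates is probably not buying extra reproducibility for that dataset.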