How do I get the 'spectralcluster' function to work with out of memory data?

I have a large dataset of observations that I would like to cluster using the spectralcluster function. However, when I do so, I get an out-of-memory error:
[~,V_temp,D_temp] = spectralcluster(ChosenPatch{2,1}.data,5);
Error using <=
Requested 128424x128424 (138.2GB) array exceeds maximum array size preference (94.6GB). This might cause MATLAB to become unresponsive.
Error in internal.stats.spectraleigs>setSubnormalNumsToZero (line 196)
isTooSmall = abs(L) <= N*eps(max(L,[],'all')) & (abs(L)>0);
Error in internal.stats.spectraleigs>shimalik (line 141)
L = setSubnormalNumsToZero(L);
Error in internal.stats.spectraleigs (line 82)
[V,D,L] = shimalik(S,k,sv);
Error in spectralcluster (line 176)
[Vnonan,D] = internal.stats.spectraleigs(X,k,lapNorm);
I then tried using the distributed function, but got a different error:
dt=distributed(ChosenPatch{2,1}.data);
Starting parallel pool (parpool) using the 'local' profile ...
Connected to the parallel pool (number of workers: 12).
spmd
[~,V_temp,D_temp] = spectralcluster(dt,5);
end
Worker 1:
Warning: Converting input data to double.
> In pdist2 (line 253)
In ExhaustiveSearcher/knnsearch (line 235)
In knnsearch (line 154)
In internal.stats.similarity (line 211)
In spectralcluster (line 167)
In spmdlang.remoteBlockExecutionPlain (line 49)
In spmdlang.remoteBlockExecution (line 15)
Workers 2 to 12:
(same warning and stack trace as Worker 1)
Error detected on workers 1 2 3 4 5 6 7 8 9 10 11 12.
Caused by:
Error using pdist2 (line 376)
X and Y inputs to PDIST2MEX must both be double, or both be single.
Finally, I tried a tall array, but that failed as well:
tt=tall(ChosenPatch{2,1}.data);
Starting parallel pool (parpool) using the 'local' profile ...
Connected to the parallel pool (number of workers: 12).
>> [~,V_temp,D_temp] = spectralcluster(tt,5);
Error using tall/validateattributes
validateattributes does not support tall arrays.
Error in spectralcluster (line 126)
validateattributes(X,{'single','double'},{'2d','real','nonempty'},'','X');
Is there any other solution I could try?
Thank you,
Guy
  7 Comments
Walter Roberson on 24 Mar 2023
kdtree is the general name for such structures, and it is implemented by knnsearch from the Statistics and Machine Learning Toolbox. But you may need to use different options for your spectralcluster call.
Guy Nir on 24 Mar 2023
I see, I misunderstood you earlier. You mean using knnsearch instead of spectral clustering. knnsearch uses the 'kdtree' method for only four distance metrics, including 'euclidean'. However, at least for spectral clustering, I've noticed that 'spearman' or 'seuclidean' works better for my data. Unfortunately, those metrics require the 'exhaustive' approach. But I could still try.
Is there a way that you think spectral clustering could work?
Thank you,
Guy
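
Editorial sketch (not from either poster): one way to combine the two ideas in this exchange is to build the k-nearest-neighbor similarity graph by hand with knnsearch, keep it sparse, and compute only a few eigenvectors with eigs. This is not how spectralcluster works internally. The symmetric normalization, the Gaussian kernel, the neighborhood size, and the kernel scale below are all assumptions chosen for illustration, and the random matrix is a hypothetical stand-in for ChosenPatch{2,1}.data. The error trace in the question suggests the failure occurs in an element-wise comparison on the Laplacian, so keeping every intermediate sparse may avoid it, though that has not been verified on data of this size.

```matlab
% Sketch only: a sparse kNN-graph spectral embedding built by hand.
% This is NOT the internal algorithm of spectralcluster.
rng(0);                                % reproducible stand-in data
X = [randn(300,4); randn(300,4) + 6];  % hypothetical stand-in for ChosenPatch{2,1}.data
X = double(X);                         % knnsearch needs single or double input
n = size(X, 1);
k = 2;                                 % number of clusters (5 in the question)
numNbrs = 10;                          % assumed neighborhood size

% 'kdtree' supports only euclidean, cityblock, chebychev and minkowski.
% For 'spearman' or 'seuclidean', set 'NSMethod' to 'exhaustive' instead.
[idx, dist] = knnsearch(X, X, 'K', numNbrs + 1, ...
    'NSMethod', 'kdtree', 'Distance', 'euclidean');
idx  = idx(:, 2:end);                  % drop column 1: each point matched to itself
dist = dist(:, 2:end);                 % (assumes no duplicate rows in X)

% Gaussian similarity on the kNN edges only, so S has about n*numNbrs nonzeros.
sigma = median(dist(:));               % assumed kernel scale
rows  = repmat((1:n)', 1, numNbrs);
S = sparse(rows(:), idx(:), exp(-(dist(:) ./ sigma).^2), n, n);
S = max(S, S');                        % symmetrize the graph

% Symmetric normalized Laplacian, kept sparse throughout.
d    = full(sum(S, 2));
Dinv = spdiags(1 ./ sqrt(d), 0, n, n);
Lsym = speye(n) - Dinv * S * Dinv;
Lsym = (Lsym + Lsym') / 2;             % remove round-off asymmetry

[V, D] = eigs(Lsym, k, 'smallestreal');  % k eigenpairs with smallest eigenvalues
labels = kmeans(V, k, 'Replicates', 5);  % cluster the rows of the embedding
```

With the real data, X would be replaced by the actual observation matrix and k set to 5. The 'exhaustive' search needed for 'spearman' or 'seuclidean' is slower than a kd-tree, but it still returns only numNbrs neighbors per point, so the similarity matrix stays sparse either way.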


Answers (0)

Release

R2021b