Problem running a cvpartition with a tall array
MATLAB's documentation indicates that the cvpartition function supports tall arrays, as long as it uses a stratified holdout partition. Therefore, this should work when "group" is an Mx1 double column vector pulled out of a datastore:
myPartition = cvpartition(group, 'Holdout',.25);
A = gather(test(myPartition));
It yields the proper logical vector if I load the "group" array into memory. But as a tall array, I instead get this error:
Error using internal.stats.bigdata.cvpartitionTallImpl (line 92)
P is too small to have a non-empty test set.
The gather operation is not the issue here; this is the first command applied to the tall array after it is created.
I think I've tracked down the cause of that error to this bit of code in the cvpartitionInMemoryImpl class:
if (isempty(cv.Group) && floor(cv.N *T) == 0) ||...
(~isempty(cv.Group) && floor(length(cv.Group) * T) == 0)
error(message('stats:cvpartition:PTooSmall'));
end
Here T is (or at least is supposed to be) the 0.25 holdout value.
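For what it's worth, the arithmetic in that check is easy to reproduce in isolation. The sketch below is only an illustration, assuming T is exactly 0.25 and the element count is an ordinary in-memory scalar; the variable names N and tooSmall are mine, not from the MathWorks source. Under those assumptions, the condition can only be true for fewer than 4 elements:

```matlab
% Illustration only: reproduce the floor(count * T) == 0 test for small counts.
T = 0.25;                       % assumed holdout fraction
N = 0:8;                        % hypothetical element counts
tooSmall = floor(N * T) == 0;   % true where the check would raise PTooSmall
disp(N(tooSmall))               % counts for which the test set would be empty
```

So if the error fires on a very large tall array, presumably either the count or the T value seen by the check is not what I expect.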
Is there a way around this error? I'm working with some very, very large data files and would like to take advantage of the tall array functionality wherever possible.