Building tall table from tall arrays generates error
clear
dataFile = 'data.csv';
ds = tabularTextDatastore(dataFile, FileExtensions='.csv');
ds.ReadVariableNames = true;
ds.Delimiter = ',';
ds.SelectedVariableNames = ["hash", "count"];
ds.SelectedFormats = {'%s', '%f'};
data = tall(ds);
[g, THash] = findgroups(data.hash);
TCount = splitapply(@(x) {x}, data.count, g);
%% This works but cannot use it because actual data file is far larger than memory
hash = gather(THash);
count = gather(TCount);
T1 = table(hash, count);
%% This is the intended code but doesn't work
TT = table(THash,TCount);
write(fullfile(pwd,'data'),TT,FileType="parquet");
Answers (1)
Oguz Kaan Hancioglu
on 15 Mar 2023
Your code doesn't work because "gather(TCount)" returns a cell array in which each element holds a double array. You are therefore trying to write a whole double array into a single cell. Instead, you can compute the length of each array stored in the cell array. I hope this solves your problem.
%% This works but cannot use it because actual data file is far larger than memory
hash = gather(THash);
count = gather(TCount);
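% Size vector of the array stored in each cell
cellsz = cellfun(@size, count, 'UniformOutput', false);
% Keep only the first dimension (number of rows) of each array
newCount = cellfun(@(x) x(1), cellsz, 'UniformOutput', false);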
T1 = table(hash, newCount);
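To make the point about the cell contents concrete, here is a small in-memory sketch of what the grouping step produces. It is only an illustration: it does not use tall arrays and does not address the out-of-memory case. The sample values and the names uniqueHash, groupedCount and groupSize are made up, and cellfun(@numel, ...) is swapped in as a shorter way to get the per-group lengths as a numeric column instead of a cell array.

```matlab
% Hypothetical sample data standing in for the "hash" and "count" columns
hash  = ["a"; "b"; "a"; "c"; "b"; "a"];
count = [1; 2; 3; 4; 5; 6];

% Group rows by hash; uniqueHash lists each distinct hash once (sorted)
[g, uniqueHash] = findgroups(hash);

% Same call as in the question: each cell holds all counts of one group,
% so the cells contain vectors of different lengths
groupedCount = splitapply(@(x) {x}, count, g);

% Number of elements in each group's vector, as a numeric column
groupSize = cellfun(@numel, groupedCount);   % [3; 2; 1]

% Both variables now have one scalar per row
T = table(uniqueHash, groupSize);
```

With groupSize as a plain numeric column, every table variable holds one scalar per row, which is the shape the answer above is aiming for.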