Memory Preallocation, Dataset Array

ARS on 3 May 2012
Hi All,
How can I preallocate memory for a dataset array that should have 69 rows and 740 columns? There is another dataset array of the same size in my workspace. Can I do something like NewArray = dataset(size(OldArray)), i.e. use the size of an already built dataset array?
Regards,
AMD.

Accepted Answer

Daniel Shub on 3 May 2012
The dataset class is basically a container holding pointers to other variables/memory locations. Even if you can preallocate the dataset array, I am not sure it will improve performance by much.
  8 Comments
Daniel Shub on 3 May 2012
"What does the above error mean" and "how to convert my dataset array to a cell array" are new questions. I will not attempt to answer them in a comment on an answer to an unrelated question. I have never used the dataset class, so I have little insight into converting it to other classes, or even what it is good for. You will be much better off asking these as new questions.
ARS on 3 May 2012
Sorry for the short code. I will post the other questions as new independent questions.


More Answers (4)

Oleg Komarov on 3 May 2012
One way:
n = 300;
names = arrayfun(@(x) sprintf('C%d',x), 1:n, 'UniformOutput', false);
dataset([{zeros(30,n)},names])
Or
dataset(zeros(size(OldArray)));
  2 Comments
ARS on 3 May 2012
Hi Oleg,
I tried ABC = dataset(zeros(size(oldarray)));
It gets the rows right (69) but shows only one column, i.e. 69x1, while size(oldarray) is 69x740.
Why is that?
Oleg Komarov on 3 May 2012
The first solution creates 300 columns (you can set 'n' to be anything) and the second solution is slightly different. Depends what you need to do.
I am not sure but the second might be more efficient.
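The difference between the two forms can be sketched as follows. This is a minimal illustration, assuming every variable in OldArray is numeric; the stand-in OldArray and the names C1..C740 are made up for the example. A matrix passed on its own is stored as a single variable (hence 69x1), while a matrix followed by one name per column is split into separate variables.

```matlab
% Stand-in for the existing array (hypothetical data; names C1..C740 are illustrative).
names = arrayfun(@(x) sprintf('C%d', x), 1:740, 'UniformOutput', false);
OldArray = dataset([{rand(69, 740)}, names]);

% One matrix passed alone becomes a single variable: the dataset is 69-by-1.
oneVar = dataset(zeros(size(OldArray)));

% A matrix followed by one name per column is split into separate variables.
[nRows, nVars] = size(OldArray);
varNames = OldArray.Properties.VarNames;   % 1-by-nVars cell array of strings
NewArray = dataset([{zeros(nRows, nVars)}, varNames]);

disp(size(oneVar))     % rows and variables of the single-variable version
disp(size(NewArray))   % rows and variables of the preallocated version
```

If OldArray holds mixed types, this all-zeros approach does not apply directly, since each variable would need its own placeholder of the right type.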



jeff wu on 3 May 2012
NewArray = zeros(size(OldArray))
  1 Comment
ARS on 3 May 2012
No Jeff, that creates a numeric matrix of the required size. I am talking about creating/preallocating a dataset array.



ARS on 3 May 2012
Hi All,
For the information of the MATLAB community: when I used a dataset array in my loops, which ran for 44340 iterations, the task completed in 4927 seconds, a rate of about 9 iterations per second. That is roughly 82 minutes per run (my whole day was spent running that script 4 times).
After I replaced the slow dataset array with a cell array, the same task completed in
Elapsed time is 872.837510 seconds. That is about 14.5 minutes, down from 82 minutes.
This was amazing, with no other changes to the code.
Life is easy with cell arrays.
Regards,
AMD

Peter Perkins on 3 May 2012
Ahmad, there are several things going on in this thread. Let me try to answer them one by one.
A dataset array can hold just about any type of variable, so just specifying what size you want is not sufficient to create one. As you noted, Oleg's suggestion will create a dataset with a single variable, and Jeff's suggestion will create a numeric array. You need to create a dataset array. I can't say exactly what you need to do to preallocate, because you have not provided any specific information about types. In the simplest case, if you want the new array to have the same data types in each variable as the old array, all you need to do to preallocate is
dsNew = dsOld;
and then just overwrite the existing values with the new ones. There's no particular reason why you would need to start with zeros, or empty strings or whatever, unless you would only be overwriting some elements. But there are ways to do that too. If the new array is to contain different types of data than the old array, there are ways to do that too.
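The copy-then-overwrite idea can be sketched like this. The dataset dsOld and its variables Score and Label are hypothetical, chosen only to show one numeric and one string variable; this is an illustration under those assumptions, not a description of the actual data in the question.

```matlab
% Hypothetical existing dataset with one numeric and one string variable.
dsOld = dataset({[1; 2; 3], 'Score'}, {{'a'; 'b'; 'c'}, 'Label'});

% "Preallocate" by copying: same size, variable names, and types.
dsNew = dsOld;

% Overwrite the copied values with the new ones.
for i = 1:size(dsNew, 1)
    dsNew.Score(i) = 10 * i;       % element-wise overwrite of a numeric variable
end
dsNew.Label = {'x'; 'y'; 'z'};     % whole-variable overwrite in one assignment

disp(dsNew)
```

Where possible, the whole-variable form is preferable to the element-wise loop, since scalar access into a dataset is the slow path.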
You must be running a version of MATLAB older than R2012a, which is why dataset2cell is not found.
As for your comments about dataset vs. cell:
Elsewhere you've asked a question or two about manipulating data in a dataset array. One of them involved a nested loop with some string comparison. Not sure if your comments here are related to that or not, but it is often possible to avoid the kind of loop you had there by appropriate use of vectorized operations. So in that case, strcmp. Yes, dataset can be much slower than cell for scalar access, but it is often possible to write code that is both faster to write and to run, and easier to read, by using vectorized operations. In this case, I can't say what that would be without more information.
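As a rough illustration of what a vectorized comparison might look like, the sketch below uses made-up cell arrays of strings (namesA, namesB, and target are hypothetical). With a dataset array, the string variable would be pulled out once with dot subscripting and then compared in a single call instead of inside a nested loop.

```matlab
% Hypothetical strings, e.g. pulled once from a dataset variable via ds.Name.
namesA = {'IBM'; 'AAPL'; 'MSFT'; 'IBM'};
target = 'IBM';

% Compare every element against one string in a single call.
isMatch = strcmp(namesA, target);        % logical column, one entry per row

% Compare every element against a whole second list at once.
namesB = {'MSFT'; 'GOOG'; 'IBM'};
[tf, loc] = ismember(namesA, namesB);    % tf: found or not, loc: position in namesB

disp(isMatch')
disp(loc')
```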
I can't tell what you are doing with your data, but you may ultimately find that a cell array is not the best way to store it. dataset provides a convenient way to wrap mixed types of data into one container, and still be able to access individual variables as their "native" type using dot subscripting. For example, with a dataset array it's easy to say mean(data.Var1). From what I can tell from your other questions, you have strings and numbers. You can put all that in a cell array, but you'll likely end up disappointed if you are storing scalar numeric data that way, because cell arrays are too general and lack, for example, the ability to do simple math. That will also not scale well, again because cell arrays are too general -- internally a cell array will store each scalar value as a separate MATLAB array. On the other hand, if what you have is all string data, then yes, a cell array is the right container.
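A small sketch of that difference, using made-up values: numbers stored one per cell must be converted before any math, while a dataset variable comes back as an ordinary numeric column. The names vals, x, and data are illustrative only.

```matlab
% Scalars stored one per cell: each cell holds its own separate MATLAB array.
vals = {1.5; 2.5; 4.0};
% mean(vals) would error, because math is not defined for cell arrays.
mCell = mean(cell2mat(vals));    % convert to a numeric column first

% The same numbers as a dataset variable keep their native numeric type.
x = [1.5; 2.5; 4.0];
data = dataset({x, 'Var1'});
mData = mean(data.Var1);         % dot subscripting returns a plain double column

disp([mCell mData])
```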
Hope this helps.
  2 Comments
ARS on 3 May 2012
Thanks Peter, that was very helpful.
I am using MATLAB R2011b.
Yes, I am doing string comparisons between two dataset containers through a for loop (with a nested for loop) that has 69 if-elseif conditions in the nested loop.
For string comparisons (strings from two different datasets), dataset arrays take a lot of time, while placing my data in cell arrays just for the string comparison is MUCH faster.
For my purpose, I can use a cell array in this computation to save time, and soon afterwards I export it back to Excel for further work.
Thank you for your comments; I will seek your guidance in the future as well.
Regards,
AMD.
avantika on 29 Aug 2013
Hi!
I am trying to convert a dataset to a cell array of strings so that I can use the unique command in MATLAB R2009b. However, when I use the command C = dataset2cell(ds3);
I get the following error message:
??? Undefined variable "dataset2cell" or class "dataset2cell".
Is there any solution to this problem in MATLAB R2009b?
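One possible workaround, sketched under assumptions: since dataset2cell is not available before R2012a, a single string variable can be extracted with dot subscripting and passed to unique directly. The dataset ds3 below and its variable name Name are hypothetical stand-ins; the real variable names in ds3 are not given in the thread.

```matlab
% Hypothetical dataset with a string variable called Name.
ds3 = dataset({{'b'; 'a'; 'b'}, 'Name'});

% Dot subscripting returns the variable itself, here a cell array of strings.
names = ds3.Name;

% unique works directly on a cell array of strings.
u = unique(names);

disp(u)
```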
