Using unique function on cell array

Hello,
This follows up on a previous post about adding two cells together: http://www.mathworks.com/matlabcentral/answers/263633-combining-to-two-cells
The reason I did this is that I couldn't use the unique function unless the input was a cell array of strings, so I changed the one column. However, I can't apply this to the whole table, and I want to use X=unique(X,'stable'). In the attached picture, it would remove the second of the two highlighted rows. The unique function doesn't work because there is a mix of cell types.
ALTERNATIVE:
Focus on three columns: one is a mixture of string/number, one is a number and one is a date (the first three columns in the sample Excel sheet).
For the string/number column I want to use something similar to num2str(). Following my previous question, I would probably create another column using the for loop
for n = 1:length(p)
A(n) = {[num2str(p{n,1}),k{n,1}]};
end
and then for the number column I would use the suggestion from dpb using cell2mat. For dates, unique works fine.
Putting them together, I would find the unique indices.
Thanks, Stephan
9 Comments
Matt J
on 13 Jan 2016
Your attachment didn't make it. Make sure you confirm the file selection.
Stephan Richtering
on 13 Jan 2016
Adam
on 13 Jan 2016
Is there a reason why column 1 contains strings? Are there any values in there that aren't numeric? If not then just convert column 1 to doubles or singles and use a double/single array for the whole lot.
Otherwise I guess you will have to convert the numeric elements of your cell array to strings in order to use the 'unique' function.
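A minimal sketch of that conversion, assuming a mixed column like the one in the question. The data and the variable names col, isNum and colStr are made up for illustration. Note that num2str prints non-integer values with limited precision, and that a numeric 123 and the string '123' end up as the same key, which may or may not be what you want.

```matlab
% Hypothetical mixed column: some cells hold char, others hold numbers.
col = {'13,14'; 123; '123'; 200; 200};

% Convert only the numeric entries to char; char entries stay untouched.
isNum = cellfun(@isnumeric, col);
colStr = col;
colStr(isNum) = cellfun(@num2str, col(isNum), 'UniformOutput', false);

% Every cell is now a char vector, so unique accepts the column.
[~, ia] = unique(colStr, 'stable');   % ia is [1; 2; 4]

% Index back into the original column to keep the original types.
result = col(ia);
```

Using ia on the original cell array rather than on colStr keeps the numbers as numbers in the result.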
dpb
on 13 Jan 2016
Outline the goal again more succinctly and what the actual data are before you go munging around willy-nilly changing numerics to string representations, etc. With the data just as an image can't do anything with it, but why are there multiple values in some cells and not others--is that real or a fignewton of the conversion to character instead of numeric. If those are some sort of a compound component ID or somesuch, then indeed you can't use a numeric representation and unique as you've got what would be arrays versus single values to do the comparison over.
OTOH, if the idea is to remove the rows that have duplicate values in the first column, then simply use the alternate return from unique -- see the Answer for that solution.
There's nothing keeping you from using unique on numeric data albeit there's always the issue of floating point comparison for noninteger values.
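To illustrate the floating point caveat, here is a small sketch with made-up values. It assumes a release that ships uniquetol (R2015a or later), and the tolerance 1e-12 is an arbitrary choice for the example, not a recommendation.

```matlab
% 0.1+0.2 is not bit-identical to 0.3 in double precision,
% so unique treats them as two different values.
x = [0.3; 0.1 + 0.2; 0.5];
nExact = numel(unique(x));          % 3

% uniquetol merges values that agree within a relative tolerance.
nTol = numel(uniquetol(x, 1e-12));  % 2
```

As far as I know uniquetol has no 'stable' option, so if the original order matters the indices it returns would have to be sorted afterwards.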
I'm unclear why unique does not work here:
Name = {'Fred';'Betty';'Betty';'Bob';'George';'Jane'};
[C,ia,ic] = unique(Name,'stable');
Name(ia)
yields
'Fred'
'Betty'
'Bob'
'George'
'Jane'
as desired.
Stephan Richtering
on 14 Jan 2016
It looks like poor data design is making things more complicated too.
In Excel everything is stuck in one table... but MATLAB is not Excel. Regardless of this many beginners stick numeric (or mixed) data into cell arrays, without realizing that they should stick to keeping data in the simplest array possible to minimize processing complications: this means numeric data in numeric arrays, and strings in cell arrays (or char).
If you search this forum you will find lots of beginners attempting to manipulate numeric values inside cell arrays. The usual solution is to remove the values from the cell array and perform the desired operation. The optimal solution is that they should not have been in cell arrays in the first place.
Perhaps the data structure should be revised to reflect the data types that it contains, and the flow of the algorithm.
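As one possible way to revise the structure, a table keeps each column in its own natural type, and unique then works on whole rows. This is only a sketch with made-up data. The variable names id and val are hypothetical, and it assumes a release with table support (R2013b or later).

```matlab
% Strings stay in a cell array of char, numbers in a numeric vector.
id  = {'13,14'; '13,14'; '123'; '123'};
val = [10700; 0; 200; 200];
T = table(id, val);

% For a table, unique compares entire rows; 'stable' keeps first occurrences
% in their original order.
[Tu, ia] = unique(T, 'stable');   % ia is [1; 2; 3], the duplicate 4th row is dropped
```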
Stephan Richtering
on 14 Jan 2016
Stephan Richtering
on 14 Jan 2016
Edited: Stephan Richtering
on 14 Jan 2016
Answers (1)
dpb
on 13 Jan 2016
>> ccc % a sample cell array similar to shown...
ccc =
'13,14' [10700]
'13,14' [ 0]
'123' [ 200]
'123' [ 200]
>> [~,ia]=unique(cell2mat(ccc(:,2)),'stable') % get the unique indices from the 2nd column
ia =
1
2
3
>> ccc(ia,:) % show the result
ans =
'13,14' [10700]
'13,14' [ 0]
'123' [ 200]
>>
To pare the table simply reassign --
ccc=ccc(ia,:);
4 Comments
Stephan Richtering
on 13 Jan 2016
dpb
on 13 Jan 2016
Well, as described in my earlier comment we need to know the precise data structure and what's real vs what's a figment of your having converted from the original form to try to make something work rather than being the actual data form.
Specifically, again, what is the deal on the first column as shown and what is the real underlying problem to be solved? Is it the redundant values as shown in the one column above, the duplicate ID in the first column (which, if so, there would seem to be as far as the amount of data shown an issue in the first two rows as well) or what, precisely? We can't solve a problem that isn't formulated.
Also, attach a short section of an actual data file, not the image. It doesn't have to be large in either dimension to illustrate with but must represent the various constraints and conditions that are to be handled. Posting the desired result along with it always is a plus.
Stephan Richtering
on 14 Jan 2016
dpb
on 14 Jan 2016
I believe internally for that operation Excel does the comparison to each column individually behind the scenes and then combines those logical results. If your data really are so ill-formed as you say and you can't (or won't???) clean it up in the process of importing it to make it more manageable, then I'd posit the above is the only option you've left yourself.
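A rough sketch of that column-by-column idea in MATLAB, using the same sample cell array as in the Answer. It is one possible approach, not a claim about what Excel actually does: each column is reduced to integer group labels with the third output of unique, and the labels are then compared row-wise.

```matlab
% Sample data as in the Answer: char IDs in column 1, numbers in column 2.
ccc = {'13,14', 10700; '13,14', 0; '123', 200; '123', 200};

% Label each column separately; equal entries get equal integer labels.
[~, ~, ic1] = unique(ccc(:,1), 'stable');            % cell array of char
[~, ~, ic2] = unique(cell2mat(ccc(:,2)), 'stable');  % numeric column

% Two rows are duplicates only if their labels match in every column.
[~, ia] = unique([ic1, ic2], 'rows', 'stable');      % ia is [1; 2; 3]

% Keep the first occurrence of each distinct row.
ccc = ccc(ia, :);
```

More columns can be handled the same way by adding one label vector per column. A date column would need its own unique call on whatever type the dates are stored as.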