Identify duplicate values in an array and replace with NaN

Hello: I have an array of dimension 40,000x3, as shown below. The third column often contains duplicate values. I need to identify the duplicate values and replace them with NaN. Kindly help.
2.39300000000000 0 6.16800000000000
2.38720000000000 0 6.16800000000000
2.38480000000000 0 6.16800000000000
2.37380000000000 0 6.16800000000000
2.37410000000000 0 6.16800000000000
2.37020000000000 0 6.16800000000000
2.36880000000000 0 6.16800000000000
2.36350000000000 0 6.16800000000000

Accepted Answer

Mrutyunjaya Hiremath on 7 Sep 2023
If you want to replace only the duplicates with 'NaN' and keep one occurrence of each value intact, here is the code:
% Sample array (Replace this with your 40,000x3 array)
array = [
2.3930, 0, 6.1680;
2.3872, 0, 6.1680;
2.3848, 0, 6.1780;
2.3738, 0, 6.1680;
2.3741, 0, 6.1690;
2.3702, 0, 6.1780;
2.3688, 0, 6.1690;
2.3635, 0, 6.1780;
];
% Extract the third column
third_col = array(:, 3);
% Find unique values and, for each row, the index (ic) of its unique value
[unique_vals, ~, ic] = unique(third_col);
% Count the occurrence of each unique value
counts = accumarray(ic, 1);
% Identify values that occur more than once (duplicates)
duplicate_vals = unique_vals(counts > 1);
% Replace only duplicates with NaN, keep one occurrence of each value
for val = duplicate_vals'
    idx = find(third_col == val);
    third_col(idx(2:end)) = NaN; % Keep the first occurrence, replace the rest with NaN
end
% Update the third column in the original array
array(:, 3) = third_col;
% Display the updated array
disp(array);
    2.3930         0    6.1680
    2.3872         0       NaN
    2.3848         0    6.1780
    2.3738         0       NaN
    2.3741         0    6.1690
    2.3702         0       NaN
    2.3688         0       NaN
    2.3635         0       NaN
For each duplicate value, the code finds its indices in the third column using 'find', keeps the first occurrence (idx(1)), and replaces the rest (idx(2:end)) with NaN. This leaves exactly one instance of each value in the third column; only the repeats become NaN.
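To check that the loop really leaves one instance of each value, a few assertions can be run on the result. This is only a sketch and not part of the original answer; the names original_col, cleaned_col and kept are assumptions introduced here for illustration.

```matlab
% Third column of the sample array used in the answer above
original_col = [6.1680; 6.1680; 6.1780; 6.1680; 6.1690; 6.1780; 6.1690; 6.1780];

% Apply the same keep-first-occurrence rule as the answer
cleaned_col = original_col;
[unique_vals, ~, ic] = unique(cleaned_col);
counts = accumarray(ic, 1);
for val = unique_vals(counts > 1)'
    idx = find(cleaned_col == val);
    cleaned_col(idx(2:end)) = NaN;
end

% Values that survived (hypothetical helper variable)
kept = cleaned_col(~isnan(cleaned_col));

% Check 1: every surviving value appears exactly once
assert(numel(kept) == numel(unique(kept)));

% Check 2: no distinct value from the original column was lost
assert(isequal(sort(kept), unique(original_col)));

% Check 3: the number of NaNs equals the number of repeated entries
assert(nnz(isnan(cleaned_col)) == numel(original_col) - numel(unique(original_col)));
```

If any assertion fails on the real 40,000x3 data, the most likely cause is that the column already contained NaN values before the replacement, which these checks do not account for.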

More Answers (1)

Dyuman Joshi on 7 Sep 2023
Edited: Dyuman Joshi on 7 Sep 2023
Here's a much faster and simpler approach that avoids the loop:
array = [2.3930, 0, 6.1680;
2.3872, 0, 6.1680;
2.3848, 0, 6.1780;
2.3738, 0, 6.1680;
2.3741, 0, 6.1690;
2.3702, 0, 6.1780;
2.3688, 0, 6.1690;
2.3635, 0, 6.1780];
%Get the unique values and the indices of their first occurrence,
%in the order they appear in the array
[val,first_idx] = unique(array(:,3),'stable')
val = 3×1
    6.1680
    6.1780
    6.1690
first_idx = 3×1
     1
     3
     5
%Convert all the values of column 3 to NaN
array(:,3) = NaN;
%Re-assign the values according to the indices
array(first_idx,3) = val
array = 8×3
    2.3930         0    6.1680
    2.3872         0       NaN
    2.3848         0    6.1780
    2.3738         0       NaN
    2.3741         0    6.1690
    2.3702         0       NaN
    2.3688         0       NaN
    2.3635         0       NaN
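Since the question also asks to identify the duplicates, the same first_idx output can be turned into a logical mask that marks the repeated rows. The following is a minimal sketch and not part of the original answer; the names col and is_duplicate are assumptions introduced for illustration.

```matlab
array = [2.3930, 0, 6.1680;
         2.3872, 0, 6.1680;
         2.3848, 0, 6.1780;
         2.3738, 0, 6.1680;
         2.3741, 0, 6.1690;
         2.3702, 0, 6.1780;
         2.3688, 0, 6.1690;
         2.3635, 0, 6.1780];

col = 3;  % column to de-duplicate (assumed name, not from the answer)

% Indices of the first occurrence of each distinct value, in order of appearance
[~, first_idx] = unique(array(:, col), 'stable');

% Start by flagging every row, then clear the flag on the first occurrences
is_duplicate = true(size(array, 1), 1);
is_duplicate(first_idx) = false;

% Overwrite only the flagged rows with NaN
array(is_duplicate, col) = NaN;
```

Here is_duplicate can be kept for later use, for example find(is_duplicate) lists the affected row numbers. Note that unique treats NaN values as distinct from one another, so NaNs already present in the column are not flagged as duplicates.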
