Remove duplicate rows in CSV file

Hello dear MathWorkers,
I have a dataset consisting of approximately 4 million records, and I want to remove the duplicated rows (records). Can anyone help me with a way to do this? I am using MATLAB 2018a. Thanks in advance.
  7 Comments
madhan ravi on 24 Jul 2019
Mohammed: Alex's solution should have solved your problem.
mohammad Alsajri on 25 Jul 2019
Thanks for the help, guys.


Accepted Answer

Alex Mcaulley on 23 Jul 2019
Since all of the data is numeric, you can use:
data = xlsread('kdd.xlsx');
datanew = unique(data,'rows');
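
As a minimal sketch of what unique(...,'rows') does, the example below uses a small made-up matrix in place of the real spreadsheet (the values are purely illustrative). It also shows the 'stable' option, which is an addition here and not part of the answer above: by default unique returns the remaining rows sorted, whereas 'stable' keeps the first occurrence of each row in its original position.

```matlab
% Hypothetical 5-by-3 matrix standing in for the real dataset.
% Rows 1 and 3 are identical, and so are rows 2 and 5.
data = [7 8 9;
        1 2 3;
        7 8 9;
        4 5 6;
        1 2 3];

% Default behaviour: duplicate rows removed, remaining rows sorted.
sortedUnique = unique(data, 'rows');

% 'stable' keeps the first occurrence of each row in its original order.
% keptIdx holds the row indices of data that were retained.
[stableUnique, keptIdx] = unique(data, 'rows', 'stable');

disp(sortedUnique);
disp(stableUnique);
disp(keptIdx);
```

If the data sits in a purely numeric CSV file rather than a spreadsheet, csvread (available in R2018a) could presumably supply the matrix instead of xlsread; the file name would be whatever the real file is called.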
  2 Comments
Shameer Parmar on 23 Jul 2019
This is not working, because none of the data is similar. I don't find any duplicate entries in the sheet provided by Mohammad Alsajri.
Using your command, 'data' and 'datanew' both come out exactly the same.
Alex Mcaulley on 23 Jul 2019
This code works!
I guess the Excel file provided by Mohammad is just a small portion of the dataset (4 million rows).
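
One way to check whether a given file contains duplicates at all is to compare the row counts before and after calling unique. The snippet below is only a sketch on made-up matrices (the variable names numRemoved, sample, and noDupes are illustrative): when no rows repeat, the counts match and the output equals the input up to row ordering.

```matlab
% Hypothetical sample in which rows 1 and 3 are identical.
data = [1 2;
        3 4;
        1 2;
        5 6];
datanew = unique(data, 'rows');

% Number of rows that were dropped as duplicates.
numRemoved = size(data, 1) - size(datanew, 1);
fprintf('%d duplicate row(s) removed\n', numRemoved);

% Hypothetical sample with no repeated rows: the row count is unchanged.
sample = [1 2;
          3 4;
          5 6];
noDupes = size(unique(sample, 'rows'), 1) == size(sample, 1);
```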
