Find locations of repeated values?

Question

Jacqueline on 15 Jul 2013

Direct link to this question

https://it.mathworks.com/matlabcentral/answers/82088-find-locations-of-repeated-values

I have a function that takes a data set and checks whether any value repeats for more than 300 seconds:

function FindRepetition(TruckVariableName)

setpref('Internet','SMTP_Server','lamb.corning.com');

data1 = (TruckVariableName);
x = length(TruckVariableName);
data = reshape(data1, 1, x); 
datarep = ~diff(data) & data(2:x) ~= 0; % logical: 1 = same as previous sample, 0 = different; repeated zeros are excluded
% datarep(k) is true when data(k+1) equals data(k) and data(k+1) is nonzero.
% diff(data) has x-1 elements, so it is combined with data(2:x), which also
% has x-1 elements; & requires both operands to have the same size.
datarepstr = num2str(datarep); %convert to string
s = regexprep(datarepstr,' ',''); %remove spaces
[startindex,runs] = regexp(s,'1+','start','match'); %find all runs and the point where they start
l = cellfun('length',runs); %find the length of each run
y = l > 300;
if any(y) %if any run is longer than 5 minutes, display message
  %sendmail('johnsonlj2@corning.com', '2011 KENWORTH ISX15','A data fault has been detected - Prolonged data repetition');
  disp('--An error has occurred - Prolonged data repetition.');
  disp('Errors occurred at'); 
end
end

I want to find WHERE those repeated values start in the data. I tried disp(find(y)), but that returns positions within y (the list of runs), not positions in the original data set. Does anyone know how I can find the locations in data1 where the data repeats for more than 300 seconds?

2 Comments

Cedric on 15 Jul 2013

Edited: Cedric on 15 Jul 2013

Could you provide a sample dataset or the content of this TruckVariableName that you pass to your function?

Jacqueline on 15 Jul 2013

One of my variables is engine speed, and the data is collected for over 95,000 seconds. A chunk of the data may look like this...

1055.25 777.25 771.75 1112.375 1151.375 1447 1447 1447 1447 1447 1447 1447 1447 668.625 803.75 850.25 693.625 1069.375 868.5 985.875 1085.875 1148 1065.625 978.25 885.75 723.125 638.125 678.5 807.5 692.75 814.875

See how 1447 is repeated? Say that was repeating for more than 300 seconds. My script uses ~diff(data) to replace the non-repeating numbers with 0s and the repeating numbers with 1s. Then it finds where the 1s repeat for more than 300 seconds. When I use find(y), though, it returns locations that don't correspond to the original data set.

Answer 1

Cedric on 15 Jul 2013

Direct link to this answer

https://it.mathworks.com/matlabcentral/answers/82088-find-locations-of-repeated-values#answer_91801

Edited: Cedric on 15 Jul 2013

I think you can use two approaches. I'll illustrate with a simple example. Say we have the following data:

>> data = [7 8 8 8 8 6 6 7 8 7 7 7] ;

and we want to get blocks of repeating values with at least 3 elements.

1. Based on your REGEXP method, you would look for the start positions of runs of 1s that are at least a given length.

 >> rep = ~diff(data)                            % Add other components if needed.
 rep =
     0     1     1     1     0     1     0     0     0     1     1
 >> repStr = sprintf('%d', rep)
 repStr =
     01110100011
 >> start = regexp(repStr, '1{2,}', 'start')     % 3 similar values -> 2 
 start =                                         % repetitions.
     2    10

2. Without conversion to string and REGEXP:

 >> buffer = [true, diff(data)~=0]
 buffer =
     1     1     0     0     0     1     0     1     1     1     0     0
 >> groupStart = find(buffer)
 groupStart =
     1     2     6     8     9    10
 >> groupId = cumsum(buffer)
 groupId =
     1     2     2     2     2     3     3     4     5     6     6     6
 >> groupSize = accumarray(groupId.', ones(size(groupId))).'
 groupSize =
     1     4     2     1     1     3
 >> start = groupStart(groupSize > 2)
 start =
     2    10

EDIT: note that the 2nd method is more than 5 times faster than the 1st on large datasets.
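As a rough sketch of how the 2nd method could be adapted to the original question, the snippet below also drops runs of zeros and reports the length of each long run. The sample data, the names minLen, isNewGroup, groupValue, runStart and runLen, and the threshold of 4 are assumptions made for illustration (the question uses 300 and one sample per second), and the data is assumed to be non-empty. It computes run lengths with diff on the start indices rather than with cumsum and accumarray, which should give the same sizes.

```matlab
% Hypothetical sample: a nonzero value held for 5 samples, zeros held for 4.
data   = [3 5 5 5 5 5 0 0 0 0 2 2 9];
minLen = 4;                                % assumed threshold, in samples

data       = data(:).';                    % force a row vector
isNewGroup = [true, diff(data) ~= 0];      % true where a new run begins
groupStart = find(isNewGroup);             % index in data where each run starts
groupSize  = diff([groupStart, numel(data) + 1]);  % number of samples in each run
groupValue = data(groupStart);             % value held during each run

% Keep runs that are long enough and whose value is not zero.
keep     = groupSize >= minLen & groupValue ~= 0;
runStart = groupStart(keep)                % start index of each kept run, in data
runLen   = groupSize(keep)                 % number of samples in each kept run
```

Here runStart is 2 and runLen is 5, so data(2:6) is the block of repeated 5s. Note that groupSize counts samples, whereas the run lengths in the question count repetitions, so a run of N equal samples corresponds to N-1 ones in datarep.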

3 Comments

Cedric on 15 Jul 2013

Edited: Cedric on 15 Jul 2013

In your command window, type

doc sprintf

then, in the SPRINTF documentation, look up formatSpec, which describes all the format conversion specifiers. %d is for integer, which means that elements of rep are interpreted as integers and converted to string as such.
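To make the difference concrete, here is a small sketch comparing sprintf('%d', ...) with the num2str-plus-space-removal route from the question. The rep vector and the variable names are made up for illustration.

```matlab
rep = logical([0 1 1 1 0 1]);               % hypothetical repetition flags

viaSprintf = sprintf('%d', rep);            % one digit per element, no separators
viaNum2str = num2str(double(rep));          % same digits, separated by spaces
stripped   = strrep(viaNum2str, ' ', '');   % remove the spaces afterwards

sameResult = isequal(viaSprintf, stripped)  % both routes give the same string
```

With sprintf the k-th character of the string corresponds directly to rep(k), so no cleanup step is needed before calling regexp.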

Jacqueline on 15 Jul 2013

Thank you!

Answer 2

Muthu Annamalai on 15 Jul 2013

Direct link to this answer

https://it.mathworks.com/matlabcentral/answers/82088-find-locations-of-repeated-values#answer_91795

Judging from the code and its comments, you are looking for the variable startindex:

[startindex,runs] = regexp(s,'1+','start','match'); %find all runs and the point where they start

So just add this to your return value from the function, and you should be all set.

1 Comment

Jacqueline on 15 Jul 2013

That finds the starting point of every run of one or more 1s in a string of 1s and 0s. The length of that string is different from the length of my original data, which is where I need to find the locations of the repeating values.
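On the index mismatch raised here: the string has one character fewer than the data because character k records whether data(k) and data(k+1) are equal. That suggests a run starting at string position k also starts at data(k), so the positions may be usable directly once short runs are filtered out. A minimal sketch under that reading follows; the sample data, the threshold minRep, and the names isLong, dataStart and dataEnd are assumptions for illustration, and the question uses a threshold of 300.

```matlab
data   = [4 9 9 9 9 2 2 7];                 % hypothetical sample
minRep = 2;                                 % assumed threshold on repetitions

rep = ~diff(data) & data(2:end) ~= 0;       % rep(k): data(k) == data(k+1), nonzero
s   = sprintf('%d', rep);                   % one character per element of rep
[startindex, runs] = regexp(s, '1+', 'start', 'match');
runLen = cellfun('length', runs);           % number of repetitions in each run

isLong    = runLen > minRep;                % runs exceeding the threshold
dataStart = startindex(isLong)              % index in data where the value first appears
dataEnd   = dataStart + runLen(isLong)      % index in data of the last repeated sample
```

For this sample dataStart is 2 and dataEnd is 5, and data(2:5) is the block of 9s. In the original function the equivalent would be startindex(y) rather than find(y), since find(y) only numbers the runs.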
