Find locations of repeated values?
I have a function that takes a data set and checks whether any value repeats for more than 300 seconds:
function FindRepetition(TruckVariableName)
setpref('Internet','SMTP_Server','lamb.corning.com');
data1 = (TruckVariableName);
x = length(TruckVariableName);
data = reshape(data1, 1, x);
datarep = ~diff(data) & data(2:x) ~= 0; %binary data -- 1 means repeats, 0 means different, excludes repetitive zeros
%If the difference between consecutive points is zero, and the later point
%isn't itself zero, return true. data(2:x) has x-1 elements, the same length
%as diff(data); the dimensions must match or & cannot be used.
datarepstr = num2str(datarep); %convert to string
s = regexprep(datarepstr,' ',''); %remove spaces
[startindex,runs] = regexp(s,'1+','start','match'); %find all runs and the point where they start
l = cellfun('length',runs); %find the length of each run
y = l > 300;
if any(y) %if any run is longer than 5 minutes, display message
%sendmail('johnsonlj2@corning.com', '2011 KENWORTH ISX15','A data fault has been detected - Prolonged data repetition');
disp('--An error has occurred - Prolonged data repetition.');
disp('Errors occurred at');
end
end
I want to find WHERE those repeated values start in the data. I tried disp(find(y)), but that returns positions within y (the list of runs), not positions in the original data set. Does anyone know how I can find the locations in data1 where the data repeats for more than 300 seconds?
Accepted Answer
Cedric
on 15 Jul 2013
Edited: Cedric
on 15 Jul 2013
I think that you can use two approaches. I'll illustrate with a simple example: say we have the following data
>> data = [7 8 8 8 8 6 6 7 8 7 7 7] ;
and we want to get blocks of repeating values with at least 3 elements.
1. Based on your REGEXP method, you would look for the start positions of runs of 1's that are at least a given length.
>> rep = ~diff(data) % Add other components if needed.
rep =
0 1 1 1 0 1 0 0 0 1 1
>> repStr = sprintf('%d', rep)
repStr =
01110100011
>> start = regexp(repStr, '1{2,}', 'start') % 3 similar values -> 2 repetitions.
start =
2 10
2. Without conversion to string and REGEXP:
>> buffer = [true, diff(data)~=0]
buffer =
1 1 0 0 0 1 0 1 1 1 0 0
>> groupStart = find(buffer)
groupStart =
1 2 6 8 9 10
>> groupId = cumsum(buffer)
groupId =
1 2 2 2 2 3 3 4 5 6 6 6
>> groupSize = accumarray(groupId.', ones(size(groupId))).'
groupSize =
1 4 2 1 1 3
>> start = groupStart(groupSize > 2)
start =
2 10
EDIT: note that the 2nd method is more than 5 times faster than the 1st on large datasets.
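As a rough sketch of how the 2nd method might be adapted to the question, the snippet below adds a minimum run length and skips runs of zeros, as the question's code does. The data values, the variable names and the threshold `minLength` are illustrative assumptions, not part of the original answer. Assuming one sample per second, as the question's code implies, the threshold would be on the order of 300 samples rather than 3.

```matlab
% Sketch only: example data and threshold are made up for illustration.
data      = [7 8 8 8 8 0 0 0 8 7 7 7];
minLength = 3;                               % hypothetical minimum run length (samples)

isNewGroup = [true, diff(data) ~= 0];        % true where a new run of equal values begins
groupStart = find(isNewGroup);               % index in data of the first sample of each run
groupSize  = accumarray(cumsum(isNewGroup).', 1).';   % number of samples in each run

% Keep runs that are long enough and whose repeated value is not zero.
keep      = groupSize >= minLength & data(groupStart) ~= 0;
runStart  = groupStart(keep);                % positions in the original data
runLength = groupSize(keep);
```

Here `runStart` indexes directly into the original data, so `data(runStart)` gives the value that repeats in each reported run.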
3 Comments
Cedric
on 15 Jul 2013
Edited: Cedric
on 15 Jul 2013
In your command window, type
doc sprintf
then, in the SPRINTF documentation, look up formatSpec, which describes all the format conversion specifiers. %d is for integer, which means that elements of rep are interpreted as integers and converted to string as such.
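To make the effect of %d concrete, here is a small comparison, offered as an illustration rather than as part of the original comment. It assumes a short example vector `rep` and shows that SPRINTF with '%d' yields a compact string directly, whereas the NUM2STR route from the question needs the spaces removed afterwards.

```matlab
% Illustration only: rep is an arbitrary example vector.
rep = [0 1 1 1 0 1];

viaSprintf = sprintf('%d', rep);             % each element printed as an integer, no separators
viaNum2str = strrep(num2str(rep), ' ', '');  % num2str pads with spaces, so strip them

sameResult = strcmp(viaSprintf, viaNum2str); % both routes give the same character string
```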
More Answers (1)
Muthu Annamalai
on 15 Jul 2013
Guessing from reading the code, and the comments in the code itself, you are looking for the variable, startindex
[startindex,runs] = regexp(s,'1+','start','match'); %find all runs and the point where they start
So just add this to your return value from the function, and you should be all set.
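One point worth spelling out: startindex holds the start of every run, not only the long ones, so it would probably need to be filtered with the same logical mask the question builds from the run lengths. A minimal sketch, assuming small made-up data and a hypothetical threshold `minRepeats` in place of the question's 300:

```matlab
% Sketch only: data and threshold are illustrative, not from the original post.
data       = [5 5 5 5 0 0 0 3 3 9];
minRepeats = 2;                              % hypothetical; the question uses 300

datarep = ~diff(data) & data(2:end) ~= 0;    % 1 where a non-zero value repeats
s = sprintf('%d', datarep);                  % character string of 0's and 1's
[startindex, runs] = regexp(s, '1+', 'start', 'match');
runLength = cellfun('length', runs);         % length of each run of 1's

% A run starting at position k in datarep begins at data(k).
longRunStart = startindex(runLength > minRepeats);
```

Because datarep(k) compares data(k) with data(k+1), the values in `longRunStart` can be used as indices into the original data without any offset.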