Filter data into different Phases using multiple conditions.

Hi all,
I have got data which looks like C=[0,3 0,8 0,3 1,3 0,6 1,1 3,1 5,1 3,1 0,5]. I want to split this data into 2 Phases. Phase 1 will start when >1 (condition1) AND 3 spots after this point it must be > 3(condition2). Phase 1 will end when <1. D should eventually look like [0 0 0 0 0 1 1 1 1 0].
Condition 1 can easily be tested by D=C>1. But I can't figure out how to expand the filter by condition 2? How can I achieve this?
Many Thanks, Daan

 Risposta accettata

arich82
arich82 il 23 Set 2015
Modificato: arich82 il 24 Set 2015
I think you can achieve this by a variation on run length encoding. First, encode the data based on condition 1, then use condition 2 to modify the decoding. (Note that I'm interpretting condition 2 to mean '**still** greater than 1' for three spots; let me know if this is correct).
% data
C=[0.3 0.8 0.3 1.3 0.6 1.1 3.1 5.1 3.1 0.5];
% make data nontrivial
C = [C, C]
% apply condition 1
L = C > 1
% get index of the start of each phase change
%
% note: the first index is always the start of the first phase, so
% we begin the mask with true
mask = [true, logical(diff(L))]
idx = find(mask)
% compute the run-length of each phase
%
% note: an 'extra' index is appended to the end of the data,
% (essentially a 'phantom' phase starting past the end of the data)
% in order to get the correct length of the last phase
rl = diff([idx, numel(C)+1])
% extract the values associated with the run-length encoding
v = L(mask)
% apply condition 2
% i.e. require all '1' phases to also have a run-length > 3
v = v & (rl > 3)
% decode the rle
D = v(cumsum(mask))
The input C and output D are (printed columnwise for easier comparison)
>> [num2str(C(:)), repmat(' ', numel(C), 1), num2str(D(:))]
ans =
0.3 0
0.8 0
0.3 0
1.3 0
0.6 0
1.1 1
3.1 1
5.1 1
3.1 1
0.5 0
0.3 0
0.8 0
0.3 0
1.3 0
0.6 0
1.1 1
3.1 1
5.1 1
3.1 1
0.5 0
Please accept this answer if it helps, or leave a note in the comments if I've missed something.

7 Commenti

Daan
Daan il 24 Set 2015
Modificato: Daan il 24 Set 2015
Thank you a lot! I like the idea of spotting rows of 1's. There is only one problem. sometimes there are like big rows of values between 1 and 2 which will also count as phase 1 but actually are phase 2. To visualize it I made a small drawing. So spotting rows of 1's isn't going to work. Have you got another solution for my problem? *EDIT I have inserted a plot of the data in my original question.
I don't understand what you mean by "spotting rows of 1's"; the code above doesn't do anything like that.
If there are "big rows of values between 1 and 2", then these are certainly greater than one (and so are the values "3 spots after"), so by your stated definitions, these must be phase one from what I've understood; if you want to them to be phase two, you need to change your definition, or clarify.
Perhaps if you posted the data (either attach a .mat, .txt, or even a .fig), it would be helpful. (I can't extract the data from either picture you posted.)
I'll check back periodically today.
My mistake. Somehow I misread the definition of Phase 1 to mean "three consecutive greater-than-ones" instead of "greater-than-one followed by greather-than-three, three spots later".
[edit]
% so,
[0.5, 1.1, 1.5, 2.5, 5, 5, 0.9]
% would give
[0, 1, 1, 1, 1, 1, 0]
% but
[0.5, 1.1, 1.1, 1.1, 1.1, 1.1, 0.9]
% would give
[0, 0, 0, 0, 0, 0, 0]
Is this the correct interpretation?
What about [0.5, 1.1, 0, 0, 3.5, 5, 0.9]? Does the value have to stay above one before it reaches three, or can it dip and recover?
Assuming the values can't dip and recover, here is a very, very ugly hack to fix the code in my original post. Simply replace the very last line of the previous code with the following:
% fix phase 1
thresh1 = 1;
thresh2 = 3;
nspots = 3;
for k = 1:numel(v)-1
if v(k) == false
% do nothing
else
% check candidate phase 1 to see if thresh2 is ever met
ind = find(C(idx(k):idx(k+1)) > thresh2, 1, 'first');
if isempty(ind)
% thresh2 failed; set phase to 0
v(k) = false;
else
% thresh2 passed;
% walk starting index forward to nspots before ind (if needed)
ind = max(0, (ind - 1) - nspots);
% update starting index
idx(k) = idx(k) + ind;
% adjust run-length of both current and previous phase
rl(k - 1) = rl(k - 1) + ind;
rl(k) = rl(k) - ind;
end % if ind
end % if v
end % for k
% decode rle
mask2 = false(1, numel(C));
mask2(1) = true;
mask2(1 + cumsum(rl(1:end-1))) = true;
D = v(cumsum(mask2));
disp([num2str(C(:)), repmat(' ', numel(C), 1), num2str(D(:))]);
I might try to clean it up (or post something better, if I'm feeling clever), but this should at least get you working.
Let me know if there are any concerns.
[editted to fix typo mentioned in the comment below]
Incidentally, a very straightforward for-loop might make more sense. I think the entire process is captured in the following short script:
thresh1 = 1;
thresh2 = 3;
nspots = 3;
n = numel(C);
D = false(size(C));
is_phase1 = false;
for k = 1:n-nspots
if C(k) > thresh1
% passed thresh1,
% so check condition 2
if is_phase1
% already in a segment of phase1,
% so no further checking is necessary
D(k) = true;
elseif all(C(k:k+nspots) > thresh1) & (C(k+nspots) > thresh2)
% start of new phase1 segment
is_phase1 = true;
D(k) = true;
else
% not phase_1
% do nothing
end
else
% doesn't pass thresh1,
% so definitely not phase 1
is_phase1 = false;
end
end % for k
D(n - nspots + 1:n) = (C(n - nspots + 1:n) > thresh1) & is_phase1;
disp([num2str(C(:)), repmat(' ', numel(C), 1), num2str(D(:))]);
I don't know what the performance differences might be, but I can't imagine this loop is too terrible if you put it in a function and let the JIT accelerator work it's magic...
Thank you so much. There is one typo in your last script (nposts = 3 should be nspots = 3). But it works great! Thank you for all the help!
I corrected the typo.
I'm glad it helps.
I still might play around with it over the weekend...

Accedi per commentare.

Più risposte (1)

The indices three positions after C is > 1 can be found using
ind = find(C > 1) + 3;
Ensure that the indices are not larger than the number of elements in C
ind = ind(ind < numel(C));
Start of phase 1
i1 = ind(find(C(ind) > 1, 1, 'first'))
Start of phase 2
ind2 = find(C < 1);
i2 = ind2(find(ind2 - i1 > 0, 1, 'first'));
phase1 = zeros(size(C));
phase1(i1:i2) = 1;

3 Commenti

Thanks for your help! This works for my example C, but my real dataset is a much longer array with alternating phase 1 - phase 2 - phase 1 and so on. This script works to find the first phase 1 but can't find the rest of the phase 1's. How to change the script to get al the fase 1's?
i = 1;
while numel(C) > 1
%insert algorithm from above
phase1{i} = C(i1:i2);
i = i + 1;
if i2 == numel(C), C = []; else, C = C(i2+1:end); end
end
Thanks for helping me. The following script gives me the error but I can't figure out what is the problem?
??? Cell contents assignment to a non-cell array object.
Error in ==> DetermineMeansRecoveryPushFase at 112 phase1{i} = C(i1:i2);*
while numel(C) > 1
ind = find(C> 1) + 3;
ind = ind(ind < numel(C));
i1 = ind(find(C(ind) > 3, 1, 'first'));
ind2 = find(C < 1);
i2 = ind2(find(ind2 - i1 > 0, 1, 'first'));
phase1 = zeros(size(C));
phase1(i1:i2) = 1;
phase1{i} = C(i1:i2);
i = i + 1;
if i2 == numel(C), C = []; else, C = C(i2+1:end); end
end

Accedi per commentare.

Prodotti

Richiesto:

il 23 Set 2015

Modificato:

il 25 Set 2015

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by