How can I replace outliers with the median of previous observations?

Hello, I have some outliers in a 206-by-174 data matrix. I want to replace them with the median of the 5 previous observations using a loop.
How can I do that?
[EDITED, copied from Answer section, Jan]
To be clearer: outliers are observations of stationary series whose absolute deviations from the median exceed six times the interquartile range. I want to replace them with the median of the preceding five observations. Thanks.

7 Comments

Jan
Jan on 19 Jul 2012
Edited: Jan on 19 Jul 2012
If you post what you have done so far, inserting the required changes would be easier. Please explain what "previous" means exactly when you operate on a matrix. And specify what you mean by "outlier": do you already have a method to detect them? If not, how are they recognized?
[t, n] = size(NUM);
X = median(NUM);              % median of each column
X1 = repmat(X, t, 1);         % replicate the column medians over all t rows
NUM1 = NUM - X1;              % deviation of each observation from its column median
NUM1 = abs(NUM1);             % absolute deviation from the column median
Y = zeros(1, n);
for j = 1:n
    Y(j) = iqr(NUM(:, j));    % interquartile range (Q3 - Q1) of column j
end
Y1 = repmat(Y, t, 1);
NUM2 = 6 * Y1;                % threshold: six times the IQR
outliers = NUM1 - NUM2;       % positive where the absolute deviation exceeds 6*IQR
[x, w] = find(outliers > 0);  % x is the row and w the column of each outlier
I now want to replace them with the median of the 5 preceding observations.
You just need to write something like this; there is no need for find:
Y(Y==outliers) = X;
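A minimal sketch of the idea in that comment, i.e. replacing flagged entries through a logical mask instead of row and column indices from find. The matrix A, the per-column thresholds in limit, and the names mask, colMed and B are hypothetical and chosen only for illustration; here the flagged entries are replaced by the column median, not by the median of preceding rows.

```matlab
% Hypothetical small example: entries above a per-column threshold are flagged.
A      = [1 20; 2 21; 99 22; 4 23];
limit  = [10 30];                            % assumed per-column thresholds
mask   = A > repmat(limit, size(A, 1), 1);   % logical matrix, true where flagged

B       = A;                                 % work on a copy, keep A unchanged
colMed  = repmat(median(A, 1), size(A, 1), 1);
B(mask) = colMed(mask);                      % replace flagged entries, no find needed
```

Only A(3,1) exceeds its threshold, so only that entry is overwritten.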
Jan
Jan on 19 Jul 2012
Edited: Jan on 19 Jul 2012
Please post clarifications of the question by editing the question. This is the place where readers expect all necessary information.
Let me ask you again: What does "preceding" mean when you process a matrix? The 5 rows before, the 5 columns before, 5 other matrices processed before? Do you have the indices of the outliers already, or is this part of the question?
The more time we waste with guessing, the less time is left for answering.
Preceding means the 5 rows before. Yes, I use the find command to get the row and column of the outliers; I don't know if there is a better way.
So I will repeat: I have data with 206*174 observations. Rows are time observations and columns are variables. I want to find the outliers, defined as observations whose absolute deviation from the median is greater than 6 times the interquartile range of the respective variable's series.
After that, I want to replace each outlier with the median of the previous 5 rows. Thanks.
% Remove outliers as in the paper of Stock and Watson 2005 (NUM is the data)
[t, n] = size(NUM);           % t time observations, n variables
X = median(NUM);              % median of each column of NUM
X1 = repmat(X, t, 1);         % replicate the column medians over all t rows
NUM1 = abs(NUM - X1);         % absolute deviation from the column median
Y = zeros(1, n);
for j = 1:n
    Y(j) = iqr(NUM(:, j));    % interquartile range (Q3 - Q1) of column j
end
Y1 = repmat(Y, t, 1);
NUM2 = 6 * Y1;                % threshold: six times the IQR
outliers = NUM1 - NUM2;       % positive where the absolute deviation exceeds 6*IQR
v = ones(t, n);
v(outliers > 0) = 0;          % v is 0 at outliers and 1 elsewhere
% Note that some problems arise for very smooth series, so these
% columns are left out of the outlier replacement
v(:, [39; 84; 86; 92; 95]) = 1;
[x, w] = find(v == 0);        % x is the row and w the column of each outlier
% Start from a copy of the data; the post used data_st here, which is
% assumed to hold the same data as NUM
NUM1 = NUM;
for j = 1:n
    for i = 2:t               % row 1 has no preceding observations
        if v(i, j) == 0
            % median of up to five preceding rows of the same column
            NUM1(i, j) = median(NUM(max(1, i-5):i-1, j));
        end
    end
end
disp('Done')

Answers (1)

Miro
Miro on 19 Jul 2012
Edited: Miro on 19 Jul 2012
Something like this should work:
yourthreshold = 10;
Data(Data > yourthreshold) = median(median(Data));
This replaces all values greater than 10 with the median of the column medians of Data.
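To get closer to the rule described in the question (a per-column threshold of six times the interquartile range, and replacement by the median of the five preceding rows), the same masking idea could be sketched roughly as follows. This is only a sketch under assumptions: the example matrix NUM and the names isOut, window and CLEAN are made up for illustration, an outlier with fewer than five preceding rows uses whatever rows exist, and an outlier in the first row is left unchanged.

```matlab
% Hypothetical example data: rows are time points, columns are variables.
NUM = [(1:12).', 2 * (1:12).'];
NUM(8, 1) = 1000;                          % plant one obvious outlier

[t, n] = size(NUM);
med    = median(NUM, 1);                   % 1-by-n column medians
absDev = abs(NUM - repmat(med, t, 1));     % absolute deviation from the median
thresh = 6 * iqr(NUM, 1);                  % 1-by-n thresholds (six times the IQR)
isOut  = absDev > repmat(thresh, t, 1);    % t-by-n logical mask of outliers

window = 5;                                % number of preceding rows to use
CLEAN  = NUM;                              % cleaned copy of the data
for j = 1:n
    for i = find(isOut(:, j)).'            % rows flagged in column j
        first = max(1, i - window);
        if first < i                       % at least one preceding row exists
            % Taking the window from CLEAN means that an earlier outlier
            % has already been replaced before it enters a later median.
            CLEAN(i, j) = median(CLEAN(first:i-1, j));
        end
    end
end
```

In this toy data only NUM(8,1) is flagged, and it is replaced by the median of rows 3 to 7 of the first column. Whether the window should be taken from the cleaned values (as here) or from the raw data is a design choice that the thread does not settle.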

Asked: 19 Jul 2012
