Get zeroes in a binned data set

anton fernando on 14 Sep 2014
Edited: Guillaume on 24 Sep 2014
I am trying to sort a large data set, stored as a matrix, into bins. However, I get zeros in my binned data and I don't know what to do. My program is a bit long. When I print one binned data set, zeros appear at seemingly random places in it. I would be grateful if anyone could help.
clc;
clear all;
ncdisp('at.nc')
T = ncread('at.nc','T',[1 1], [Inf Inf], [1 1]);
year= ncread('at.nc','year',[1],[Inf], [1]);
month= ncread('at.nc','month',[1], [Inf], [1]);
alt= ncread('at.nc','altitude',[1], [Inf], [1]);
day= ncread('at.nc','day',[1],[Inf], [1]);
flag= ncread('at.nc','quality flag',[1 1], [Inf Inf], [1 1]);
for i=1:32025
x(i)=year(i);
d(i)=day(i);
m(i)=month(i);
end
for i=1:32025
if ((m(i)==1 )&& (d(i)>=0.5) && (d(i)<=15.25))
L(i)=1;
elseif d(i)>=15.26 && d(i)<=30.5 && m(i)==1
L(i)=2;
elseif (d(i)==31 && m(i)==1) || (d(i)<=14.75 && m(i)==2)
L(i)=3;
elseif (d(i)>=14.76 && m(i)==2) || (d(i)<=1.5 && m(i)==3)
L(i)=4;
elseif (d(i)>=16.5 && d(i)==31 && m(i)==3)
L(i)=5;
elseif (d(i)>=16.5 && m(i)==3) || (d(i)<=31 && m(i)==3)
L(i)=6;
elseif (d(i)==1 && m(i)==4) || (d(i)<=15.75 && m(i)==4)
L(i)=7;
elseif (d(i)>=15.75 && m(i)==4) || (d(i)==30 && m(i)==4)
L(i)=8;
elseif (d(i)>=1 && m(i)==5) || (d(i)<=16.75 && m(i)==5)
L(i)=9;
elseif (d(i)>=16.25 && m(i)==5) || (d(i)==31 && m(i)==5)
L(i)=10;
elseif (d(i)==1 && m(i)==6) || (d(i)<=15.75 && m(i)==6)
L(i)=11;
elseif (d(i)>=15.25 && m(i)==6) || (d(i)==30 && m(i)==6)
L(i)=12;
elseif (d(i)==1 && m(i)==7) || (d(i)<=16.75 && m(i)==7)
L(i)=13;
elseif (d(i)>=16.25 && m(i)==7) || (d(i)==30 && m(i)==7)
L(i)=14;
elseif (d(i)==1 && m(i)==8) || (d(i)<=15.75 && m(i)==8)
L(i)=15;
elseif (d(i)>=15.75 && m(i)==8) || (d(i)==31 && m(i)==8)
L(i)=16;
elseif (d(i)>=0.5 && m(i)==9) || (d(i)<=15.75 && m(i)==9)
L(i)=17;
elseif (d(i)==30 && m(i)==9) || (d(i)<=15.75 && m(i)==9)
L(i)=18;
elseif (d(i)==1 && m(i)==10) || (d(i)<=15.75 && m(i)==10)
L(i)=19;
elseif (d(i)<=31 && m(i)==10) || (d(i)>=15.75 && m(i)==10)
L(i)=20;
elseif (d(i)==1 && m(i)==11) || (d(i)<=15.75 && m(i)==11)
L(i)=21;
elseif (d(i)==30 && m(i)==11) || (d(i)>=15.75 && m(i)==11)
L(i)=22;
elseif (d(i)==1 && m(i)==12) || (d(i)<=15.75 && m(i)==12)
L(i)=23;
elseif (d(i)==31 && m(i)==12) || (d(i)>=15.75 && m(i)==12)
L(i)=24;
else L(i)
end
end
n=0;
for i=1:120
for m=1:24
w4(i,m)=1;
w5(i,m)=1;
w6(i,m)=1;
w7(i,m)=1;
w8(i,m)=1;
w9(i,m)=1;
w10(i,m)=1;
w11(i,m)=1;
w12(i,m)=1;
w13(i,m)=1;
end
end
for j=1:32025
if (x(j)==2004)
for i=1:120
if (flag(i,j)<=2)&& ((T(i,j))>0)&&(~isnan(T(i,j)))
E4(i,L(j),w4(i,L(j)))= T(i,j);
E4(i,L(j),w4(i,L(j)));
w4(i,L(j))= w4(i,L(j))+1;
end
end
elseif (x(j)==2005)
if (flag(i,j)<=2)&& ((T(i,j))>0)&&(~isnan(T(i,j)))
for i=1:120
E5(i,L(j),w5(i,L(j)))= T(i,j)
w5(i,L(j))= w5(i,L(j))+1;
end
end
elseif (x(j)==2006)
if(flag(i,j)<=2) && ((T(i,j))>0)&&(~isnan(T(i,j)))
for i=1:120
E6(i,L(j),w6(i,L(j)))= T(i,j);
w6(i,L(j))= w6(i,L(j))+1;
end
end
elseif (x(j)==2007)
if(flag(i,j)<=2)&& ((T(i,j))>0)&&(~isnan(T(i,j)))
for i=1:120
E7(i,L(j),w7(i,L(j)))= T(i,j);
w7(i,L(j))= w7(i,L(j))+1;
end
end
elseif (x(j)==2008)
if(flag(i,j)<=2)&& ((T(i,j))>0)&&(~isnan(T(i,j)))
for i=1:120
E8(i,L(j),w8(i,L(j)))= T(i,j);
w8(i,L(j))= w8(i,L(j))+1;
end
end
elseif (x(j)==2009)
if(flag(i,j)<=2)&& ((T(i,j))>0)&&(~isnan(T(i,j)))
for i=1:120
E9(i,L(j),w9(i,L(j)))= T(i,j);
w9(i,L(j))= w9(i,L(j))+1;
end
end
elseif (x(j)==2010)
if (flag(i,j)<=2)&& ((T(i,j))>0)&&(~isnan(T(i,j)))
for i=1:120
E10(i,L(j),w10(i,L(j)))= T(i,j);
w10(i,L(j))= w10(i,L(j))+1;
end
end
elseif (x(j)==2011)
if(flag(i,j)<=2)&& ((T(i,j))>0)&&(~isnan(T(i,j)))
for i=1:120
E11(i,L(j),w11(i,L(j)))= T(i,j);
w11(i,L(j))= w11(i,L(j))+1;
end
end
elseif (x(j)==2012)
if(flag(i,j)<=2)&& ((T(i,j))>0)&&(~isnan(T(i,j)))
for i=1:120
E12(i,L(j),w12(i,L(j)))= T(i,j);
w12(i,L(j))= w12(i,L(j))+1;
end
end
elseif (x(j)==2013)
if(flag(i,j)<=2)&& ((T(i,j))>0)&&(~isnan(T(i,j)))
for i=1:120
E13(i,L(j),w13(i,L(j)))= T(i,j);
w13(i,L(j))= w13(i,L(j))+1;
end
end
end
end
I printed E4. The results I get are attached to the question.

Accepted Answer

Guillaume on 15 Sep 2014
I don't really know the answer to your problem, mostly because your program is hard to follow: with all these elseif branches it's difficult to see where the bug, if any, could be. Possibly some of your elseif conditions are wrong, such as:
elseif (d(i)>=16.5 && d(i)==31 && m(i)==3)
If d == 31, it's obviously >= 16.5, so the d(i)>=16.5 test serves no purpose.
elseif (d(i)==1 && m(i)==4) || (d(i)<=15.75 && m(i)==4)
Here the second part (after the ||) is always true whenever the first part is, so the first part serves no purpose. There may be more elseif conditions like that.
So, I think your first task should be to simplify your program. Most of what you're doing can be achieved with a lot less code. For example:
  • Copying arrays: I'm not sure why you copy year, day, and month to new arrays with less descriptive names, but you don't need a loop to do that;
x = year; % or x = year(1:32025) if year has more elements
works just as well.
Your L calculation looks like it's partitioning the year into several periods and finding which period a particular month/day combination falls in. You're basically finding which bin of a histogram a particular date falls in. The 2nd output of histc tells you that. You just need to transform your month/day pair into a single number. This is easily done with datenum and datevec, e.g.:
dvdaymonth = [zeros(32025, 1) m' d']; %assuming m and d are row vector. Don't transpose if column
%dvdaymonth is a datevector where each row is year month day. year is always 0
dndaymonth = datenum(dvdaymonth); %transform into a single number
dvthresholds = [
0 0 0
0 1 15.25
0 1 30
0 2 14.75
... and so on
0 12 31]; %again, each row is year, month, day.
dnthresholds = datenum(dvthresholds);
[~, L] = histc(dndaymonth, dnthresholds);
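As a minimal sketch of the histc approach above, using made-up thresholds and dates (none of these values come from the question's data), the second output of histc gives the bin index of each date:

```matlab
% Toy thresholds: each row is [year month day], with the year fixed at 0.
dvthresholds = [0 1 1; 0 1 16; 0 2 1; 0 2 15; 0 3 1];
dnthresholds = datenum(dvthresholds);   % increasing serial date numbers

% Toy dates as month/day column vectors (hypothetical values).
m = [1; 1; 2; 2];
d = [10; 20; 3; 20];
dndaymonth = datenum([zeros(numel(m), 1) m d]);

% Bin k holds dates with dnthresholds(k) <= date < dnthresholds(k+1).
% Dates that fall outside the thresholds get index 0.
[~, L] = histc(dndaymonth, dnthresholds);
disp(L.')   % 1 2 3 4
```

Newer MATLAB releases document histc as not recommended in favor of histcounts; the bin-index idea is the same.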
You use loops to create matrices of ones; use the ones function instead:
w4 = ones(120, 24); % same size as the 120-by-24 matrix built by your loops
...
I'm not sure what you're doing next in the code; it looks like you're building a histogram. Maybe explain what you're trying to achieve and we'll tell you how to simplify it.
With simpler code, it'll be a lot easier to find where it's going wrong.
  10 Comments
anton fernando on 24 Sep 2014
Thank you again. Everything you said worked perfectly. As I explained earlier, my data are organized by year. Now that I have assigned an index to each bin according to the month and day, as you suggested, I want to separate the data by year and take the average of each bin. I have data for 10 years, and each year has 48 bins, so altogether there are 48*10 bins. I need to calculate the average of the values in each bin. Thanks for the help.
Guillaume on 24 Sep 2014
Edited: Guillaume on 24 Sep 2014
If you want to take the year into account, you just have to add it to the datenum calculation:
dvdate = [double(year) double(month) double(day)]; % convert before concatenating so fractional days survive
For the thresholds, either you manually define them for all the years (a bit tedious), or you define it as:
dvthresholds = [
1 15.25
1 30.5
2 14.75
... %same as before
];
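years = double(min(year)):double(max(year));
thresholdyears = repmat(years, size(dvthresholds, 1), 1); % one column per year
% repeat the month/day rows once per year so both blocks have the same number of rows
dvthresholds = [thresholdyears(:) repmat(dvthresholds, numel(years), 1)];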
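One way to check the per-year threshold construction is to run it on toy values. The sketch below assumes two month/day thresholds and two years (hypothetical numbers, not the question's data) and confirms that the expanded table has one row per year and threshold, in increasing date order:

```matlab
% Hypothetical inputs: a few sample years and two month/day thresholds.
year = [2004; 2004; 2005];
mdthresholds = [1 1; 1 16];            % each row is [month day]

% One column per year, one row per threshold.
years = double(min(year)):double(max(year));
thresholdyears = repmat(years, size(mdthresholds, 1), 1);

% Column-major (:) lists every threshold of the first year, then the next
% year, so the month/day block is repeated once per year to match.
dvthresholds = [thresholdyears(:) repmat(mdthresholds, numel(years), 1)];
dnthresholds = datenum(dvthresholds);  % edges for histc, must be increasing

disp(dvthresholds)
```

With these inputs dvthresholds has four rows: the two thresholds for 2004 followed by the two for 2005.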
Once you've got your bin indices L, to calculate the average of T per bin:
Taverage = accumarray(L, double(T), [], @mean);
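accumarray expects one subscript per value, so when T is a matrix (altitude by profile, as in the question) while L has one entry per profile, the bin index has to be expanded to every element of T first. A minimal sketch under that assumption, with made-up numbers and hypothetical helper names (altIdx, profIdx, binIdx, valid):

```matlab
% Toy data: 2 altitudes x 5 profiles; L gives the bin of each profile.
T = [10 20 30 40 50;
      1  2  3  4 NaN];
L = [1 1 2 2 2];

[nAlt, nProf] = size(T);
[altIdx, profIdx] = ndgrid(1:nAlt, 1:nProf);
binIdx = L(profIdx);                   % bin index for every element of T

% Skip NaN values and profiles that fell outside all bins (index 0).
valid = binIdx > 0 & ~isnan(T);

% Rows of subs are [altitude bin]; empty cells are filled with NaN, not 0.
subs = [altIdx(valid) binIdx(valid)];
Taverage = accumarray(subs, T(valid), [nAlt max(L)], @mean, NaN);
disp(Taverage)
```

Here Taverage is altitude by bin: the first row is [15 40] and the second is [1.5 3.5], since the NaN is excluded from the mean.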

More Answers (1)

Image Analyst on 15 Sep 2014
What are the bin centers or edges? Is there a value that you know for a fact should have gone into a bin yet the bin is still zero? For example bin #123 covers values from 5460 to 5600 (or whatever) and you know for a fact that you have a data value of 5500, which should have got counted in bin #123 but bin #123 is zero?
  5 Comments
Image Analyst on 15 Sep 2014
OK, look at your PDF. It shows bins 14 and up are all zeros. The bins are counts, correct? Like a histogram, right? What number in your data did you expect to be logged into bin #14?
anton fernando on 15 Sep 2014
Edited: anton fernando on 15 Sep 2014
The PDF shows the contents of the E4(volume, bin number, data) array. I am not supposed to get zeros in the data, because the condition in my code discards any value that is zero.
You can see in my code that I have divided the number of days in the year (365) by 24 and assigned a number to each 15.25-day period. So if the date is 2004/Feb/3, the bin number is 3, and the data should go to E4(volume, 3, data). In the same way I checked the date of each data point and assigned it to the large arrays E4, E5, etc. E4 holds the data for 2004. It is a 3-dimensional array, E4(volume, bin number, data). I am not an expert in MATLAB and I appreciate your patience. If something I explain is not clear, let me know and I will do my best. Thank you for taking the time to look at this.
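One possible source of such zeros, not confirmed anywhere in this thread, is how MATLAB grows arrays: when an indexed assignment extends an array, every element that was not explicitly assigned is filled with 0. If the bins in E4(volume, bin number, data) receive different numbers of values, the shorter ones end up zero-padded. A minimal sketch with made-up values:

```matlab
% Grow a 3-D array the way E4 is filled: E(volume, bin, k) = value.
E = [];
E(1, 1, 1) = 250;   % bin 1 receives three values
E(1, 1, 2) = 260;
E(1, 1, 3) = 270;
E(1, 2, 1) = 240;   % bin 2 receives only one value

disp(size(E))               % 1 2 3
disp(squeeze(E(1, 2, :)).') % 240 0 0 : the last two entries are padding

% A per-bin counter tells how many entries are real data.
count = [3 1];
realBin2 = squeeze(E(1, 2, 1:count(2))).';
disp(realBin2)              % 240
```

In the question's code the counters w4, w5, ... minus one play the role of count here; preallocating with NaN instead of relying on implicit growth is another way to tell padding from data.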
