How to compare data series?

Suppose I run five time-domain simulations with five different coefficient values (say beta = [0.2, 0.3, 0.4, 0.5, 0.6]). For each beta value, I obtain a complete set of results as time series (say, for one beta value, results = [speed, angle, wind force, heat]). The goal is to identify the effect of beta on each simulated parameter.
I can plot each result for each beta against time to compare them qualitatively. The problem is that minute deviations cannot be seen clearly in this type of plot.
I read about Dynamic Time Warping (DTW), but it is a bit difficult to wrap my head around. Is there any other (preferably simpler) method that one can use to analyse this type of time series data?

Accepted Answer

William Rose on 10 May 2024

1 vote

It is hard to say without having the actual data.
If you restrict your analysis to one variable at a time, then you may want to plot "deviation from the mean" as a function of time for the different values of beta, where "the mean" is the mean at each instant over all values of beta examined. This could reveal features that might be hard to see otherwise.
If you want to analyze the effects of beta on all four variables simultaneously, you may think of your system as evolving in a four-dimensional space over time (time would be a fifth dimension). Since the four variables have different units, you may want to remove the mean from each and normalize each variable by its standard deviation, in order to have a dimensionless "z-score" for each variable as a function of time.
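As a rough illustration of the z-score idea, here is a minimal sketch. The data are made up for the example (the array name results, the column order, and the signal shapes are all assumptions, not the poster's data), and only base MATLAB functions are used.

```matlab
% Hypothetical data for ONE simulation run: rows are time samples,
% columns are speed, angle, wind force, heat (each with different units).
t = (0:0.1:20)';
results = [10 + 2*sin(t), 0.1*cos(t), 500 + 50*sin(2*t), 300 + t];

% z-score each column: subtract its mean, divide by its standard deviation
mu    = mean(results, 1);       % 1x4 row of column means
sigma = std(results, 0, 1);     % 1x4 row of column standard deviations
z     = (results - mu) ./ sigma;   % implicit expansion (R2016b or later)
```

After this, plot(t, z) shows all four variables on one dimensionless axis. When comparing runs with different beta, it may be preferable to compute mu and sigma once from all runs combined, so that differences between runs are not normalized away; that choice is a suggestion here, not something stated in the answer.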

6 Comments

Thank you for the response, @William Rose! While I like the sound of the second approach, I think the first one is better for my case, and it will be rather easier to implement from the sound of it.
I'm not sure how to go about calculating the deviation from the mean. However, I have attached a sample data file, and a simple code I have.
load('sampleData.mat')
whos
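% Mean and standard deviation of each series (one column per series)
p_mean = mean([p1(:), p2(:), p3(:), p4(:), p5(:)], 'omitnan');
p_std = std([p1(:), p2(:), p3(:), p4(:), p5(:)], 'omitnan');
figure
bar([p_mean; p_std])
% Only two bar groups are plotted, so only two tick labels are needed
xticklabels({'Mean', 'Std Dev'})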
legend('x_1', 'x_2', 'x_3', 'x_4', 'x_5')
Could you please explain a bit more about what you meant with the first approach?
load('sampleData');
p=[p1;p2;p3;p4;p5];
% pzm = p with zero mean,
% i.e. the mean value of pzm across the five traces is zero at each instant
pzm=p-mean(p);
% plot results
figure
subplot(211)
plot(time,p1,'-r',time,p2,'-g',time,p3,'-b',time,p4,'-c',time,p5,'-m')
title('Raw p(t)');
legend('\beta=1','2','3','4','5')
subplot(212)
plot(time,pzm(1,:),'-r',time,pzm(2,:),'-g',time,pzm(3,:),'-b',...
time,pzm(4,:),'-c',time,pzm(5,:),'-m')
legend('\beta=1','2','3','4','5')
xlabel('Time'); title('p_{zm}(t)')
Now a few things are evident which were not obvious to me in the initial post:
1. The frequency of oscillation starts out the same, but then seems to be inversely related to the value of beta. In other words, the oscillations are fastest for beta=1 and slowest for beta=5, with the intermediate traces falling in between. (The legend labels 1 to 5 are the trace numbers p1 to p5, which correspond to beta = 0.2 to 0.6.)
2. The moving-average mean value is inversely related to beta.
You can do more analyses and plotting to demonstrate points 1 and 2.
load('sampleData');
beta=[.2,.3,.4,.5,.6];
p=[p1;p2;p3;p4;p5];
pzm=p-mean(p);
fs=(length(time)-1)/(time(end)-time(1)); % sampling rate
% Find peaks in each trace:
% For p1, p2: Find peak heights and locations, to make an illustrative plot.
% For p3, p4, p5: Find locs only, since only locs are needed to compute instantaneous freq.
[pks1,locs1]=findpeaks(p1,fs);
[pks2,locs2]=findpeaks(p2,fs);
[~,locs3]=findpeaks(p3,fs);
[~,locs4]=findpeaks(p4,fs);
[~,locs5]=findpeaks(p5,fs);
% compute instantaneous frequency
instFreq={1./diff(locs1); 1./diff(locs2); 1./diff(locs3); 1./diff(locs4); 1./diff(locs5)};
% Next: time associated with each estimate of instFreq
tInstFreq={locs1(2:end);locs2(2:end);locs3(2:end);locs4(2:end);locs5(2:end)};
% plot results
figure
subplot(311)
plot(time,p1,'-r',time,p2,'-g',time,p3,'-b',time,p4,'-c',time,p5,'-m')
title('Raw p(t)');
for i=1:5, legstr{i}=sprintf('b=%.1f',beta(i)); end
legend(legstr)
subplot(312)
plot(time,p1,'-r',locs1,pks1,'r*',time,p2,'-b',locs2,pks2,'bx')
legend('p1','p1 peaks','p2','p2 peaks')
%plot(time,pzm(1,:),'-r',time,pzm(2,:),'-g',time,pzm(3,:),'-b',time,pzm(4,:),'-c',time,pzm(5,:),'-m')
%legend('\beta=1','2','3','4','5')
title('p(t) with peaks')
subplot(313)
plot(tInstFreq{1},instFreq{1},'-r.',tInstFreq{2},instFreq{2},'-g.',tInstFreq{3},instFreq{3},'-b.',...
tInstFreq{4},instFreq{4},'-c.',tInstFreq{5},instFreq{5},'-m.')
legend(legstr)
title('Instantaneous Frequency'); xlabel('Time');
The middle plot above shows that findpeaks() is working as we hope it will. I have defined instantaneous frequency as the reciprocal of the time between successive peaks. The plot of instantaneous frequency versus time confirms what I said in my earlier post: instFreq is initially the same for all values of beta, but then instFreq diverges, with instFreq being higher when beta is smaller. The plot also shows that instFreq oscillates slowly.
With appropriate smoothing, you will be able to show how the mean value of p(t) (mean over approximately one cycle) is different for different values of beta.
Jake on 10 May 2024
@William Rose, this is very nice, and I can understand the differences of the approach. One question though: in the middle plot (p(t) with peaks vs Time), you have chosen p1 and p2 and not p1,p2,p3,... (all). Was there a specific reason for this, or did you simply choose 2 to convey that findpeaks() works in this context?
I'm not sure if I understood what you meant by the last sentence ("With appropriate smoothing, you will be able to show how the mean value of p(t) (mean over approximately one cycle) is different for different values of beta.") though.
Regardless, I'm very thankful, and will accept the answer :)
Here are plots which show more about how p(t) is affected by the value of beta.
load('sampleData');
beta=[.2,.3,.4,.5,.6];
% Compute smoothed versions of p
ps=[smooth(p1,220),smooth(p2,220),smooth(p3,220),smooth(p4,220),smooth(p5,220)];
% Compute pzm=p_zeromean and smoothed version of pzm
p=[p1;p2;p3;p4;p5];
pzm=p-mean(p);
pzms=[smooth(pzm(1,:),220),smooth(pzm(2,:),220),smooth(pzm(3,:),220),...
smooth(pzm(4,:),220),smooth(pzm(5,:),220)];
% plot results
figure
subplot(211)
plot(time,p1,'-r',time,p2,'-g',time,p3,'-b',time,p4,'-c',time,p5,'-m')
for i=1:5, legstr{i}=sprintf('b=%.1f',beta(i)); end
legend(legstr,Location='southwest'); title('Raw p(t)')
subplot(212)
plot(time,ps(:,1),'-r',time,ps(:,2),'-g',time,ps(:,3),'-b',time,ps(:,4),'-c',time,ps(:,5),'-m')
legend(legstr,Location='southwest'); title('Smoothed p(t)'); xlabel('Time')
figure
subplot(211)
plot(time,pzm(1,:),'-r',time,pzm(2,:),'-g',time,pzm(3,:),'-b',time,pzm(4,:),'-c',time,pzm(5,:),'-m')
legend(legstr,Location='southwest'); title('p_{zm}(t)');
subplot(212)
plot(time,pzms(:,1),'-r',time,pzms(:,2),'-g',time,pzms(:,3),'-b',time,pzms(:,4),'-c',time,pzms(:,5),'-m')
legend(legstr,Location='southwest'); title('Smoothed p_{zm}(t)'); xlabel('Time');
The code above uses smooth() with a width of 220 points. I chose this width because it is about two cycles long, so it does a moving average over approximately two cycles of data. The third plot in the previous post showed that the mean frequency (mean across all times and across all five values of beta) is in the ballpark of 0.09, which means the duration of one cycle is about 11, and two cycles is about 22. The sampling rate is 10, so that is about 220 points per two cycles. This is only an approximate value. You could try to get fancier, for example by taking the mean value between successive peaks on each separate trace.
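The window-width arithmetic above can be written out as a short sketch. The numbers 0.09 and 10 are the approximate values quoted in the post; the test signal is invented for illustration, and movmean (base MATLAB) is swapped in for smooth() so the example does not depend on a toolbox.

```matlab
% Values quoted in the post (all approximate)
fMean   = 0.09;  % mean oscillation frequency, cycles per unit time
fs      = 10;    % sampling rate, samples per unit time
nCycles = 2;     % number of cycles the moving average should span

period = 1/fMean;                    % about 11.1 time units per cycle
nPts   = round(nCycles*period*fs);   % 222 samples, rounded to 220 in the post

% Hypothetical test signal: constant offset of 1 plus an oscillation at fMean
t  = 0:1/fs:200;
x  = 1 + sin(2*pi*fMean*t);
xs = movmean(x, nPts);   % stand-in for smooth(); window shrinks at the edges
```

Away from the ends, xs stays close to the offset of 1 because the window covers almost exactly two cycles, so the oscillation averages out.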
The top figure shows that the smoothed p(t) traces are together initially, then diverge, with smoothed p(t) being higher when beta is greater. The bottom figure shows the same thing, but the differences are more obvious than in the top figure, because the bottom figure shows the zero-mean version of p. In both figures, the smoothing is not perfect, because the width of the smoothing window does not exactly equal the oscillation period, which varies over time and from trace to trace. The smoothed signals are less smooth as time approaches 200, because the width of the smoothing window decreases at the edge.
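One possible way to avoid the fixed-window mismatch described above is to average between successive peaks, so each average covers exactly one cycle of that trace. The following is only a sketch on an invented signal (the drift and frequency are assumptions); findpeaks() requires the Signal Processing Toolbox, as in the earlier code.

```matlab
% Hypothetical test signal: an oscillation riding on a slow upward drift
fs   = 10;
time = 0:1/fs:200;
p    = 0.01*time + sin(2*pi*0.09*time);

% Peak locations as sample indices (no fs argument, so locs are indices)
[~, idx] = findpeaks(p);

% Mean of p over each peak-to-peak cycle, and the mean time of that cycle
nCyc    = numel(idx) - 1;
cycMean = zeros(1, nCyc);
cycTime = zeros(1, nCyc);
for k = 1:nCyc
    seg = idx(k):(idx(k+1) - 1);   % one full cycle, next peak excluded
    cycMean(k) = mean(p(seg));
    cycTime(k) = mean(time(seg));
end
```

Plotting cycMean against cycTime for each trace gives one point per cycle. For this test signal the points follow the underlying drift of 0.01*time closely.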
You wrote: "this is very nice, and I can understand the differences of the approach. One question though: in the middle plot (p(t) with peaks vs Time), you have chosen p1 and p2 and not p1,p2,p3,... (all). Was there a specific reason for this, or did you simply choose 2 to convey that findpeaks() works in this context?"
Yes I only showed p1, p2 to show that findpeaks() is working in a reasonable way. If the data were not so smooth, then findpeaks would probably find spurious peaks.
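If the data were noisy, one option would be to constrain findpeaks() with its 'MinPeakProminence' option. The sketch below uses an invented noisy sine wave; the noise level and the threshold of 0.5 are assumptions chosen for this example and would need tuning for real data.

```matlab
rng(0);                                        % fixed seed for repeatability
fs = 10;
t  = 0:1/fs:200;
x  = sin(2*pi*0.09*t) + 0.05*randn(size(t));   % hypothetical noisy trace

% Without constraints, noise creates many small local maxima
[~, locsRaw] = findpeaks(x, fs);

% Keep only peaks that stand out from their surroundings by at least 0.5
[pks, locs] = findpeaks(x, fs, 'MinPeakProminence', 0.5);

fprintf('%d raw peaks, %d after the prominence threshold\n', ...
    numel(locsRaw), numel(locs));
```

The 'MinPeakDistance' option is another way to suppress closely spaced spurious peaks when the approximate period is known.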
And you wrote: "I'm not sure if I understood what you meant by the last sentence ("With appropriate smoothing, you will be able to show how the mean value of p(t) (mean over approximately one cycle) is different for different values of beta.") though."
See my recent comment, which demonstrates smoothing by finding the moving average. I ended up using a moving average width of approximately two cycles, rather than one cycle, which I had originally suggested.


Asked: 10 May 2024

Edited: 23 Jun 2024
