Transforming graphs horizontally and vertically to properly compare

I have several files of data which, when plotted, look like the image presented. My problem is that I need them all to be laid on top of each other, with the beginning of the rise of the first sinusoidal wave matched up in x and y. This is so that I can create an average waveform to reduce noise in the readings. Currently, my code imports all 30 of my text files into an array and plots them using the following loops:
for K = 1 : 30 % open and read the text files
    S{K} = readtable(files(K));
    Sr{K} = rmmissing(S{K}); % remove NaN values read from the text files
end
for K = 1 : 30 % plot each individual graph for comparison
    plot(Sr{K}.(1), Sr{K}.(2), 'displayname', files(K));
    hold on
end
I know how to transform them all individually, but cannot figure out how to make it all happen at once. I attempted to use a matchFeatures command, but could not figure out precisely how to do what I wished.
The goal is for the first graph to look similar to the second, where the initial rise is matched, allowing the peaks to be easily compared.
Any help would be appreciated, thank you!

2 Comments

hello
One idea could be to isolate the first peak and use it for the x / y repositioning.
Could you share some data?
I believe this is not too complicated.


Accepted Answer

hello
So I decided to apply an x shift to your data, based on the x locations of the 3 major peaks.
You can use the regular findpeaks, but I like the simplicity and speed of peakseek (FEX: PeakSeek - File Exchange - MATLAB Central),
attached for your convenience.
I also like natsortfiles, because it sorts file and directory names correctly (in natural order), which the regular dir is not capable of.
And this is the result so far:
code:
fileDir = pwd; % current directory (or specify which one is the working directory)
S = dir(fullfile(fileDir,'Free*.txt')); % get list of data files in directory
S = natsortfiles(S); % sort file names into natural order, see:
% (https://fr.mathworks.com/matlabcentral/fileexchange/47434-natural-order-filename-sort)
% init
ysum = 0;
% plot
figure
hold on
for k = 1:numel(S)
    filename = S(k).name; % filenames are sorted
    out = readmatrix(fullfile(fileDir, filename));
    x = out(:,1);
    dx = mean(diff(x),'omitnan'); % sample spacing
    y = out(:,2);
    % take the first n major peaks
    n = 3;
    [locs, ~] = peakseek(abs(y),5,3);
    locs = locs(1:n)';
    alllocs(:,k) = locs;
    if k>1
        xshift = mean(alllocs(:,k) - alllocs(:,1)); % in samples, relative to the first file
    else
        xshift = 0;
        xref = x; % keep this as the "good" x array for the final plot
    end
    legstr{k} = filename(1:length(filename)-4); % file name without extension
    plot(x-xshift*dx,y);
    % sum y data for the mean curve
    ysum = ysum + y;
end
ymean = ysum/k; % mean curve
legstr{k+1} = 'mean';
plot(xref,ymean,'k','linewidth',2.5);
xlim([min(xref) max(xref)]);
legend(legstr);
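For readers who wonder why natural-order sorting matters here, this is a minimal sketch that does not need the File Exchange function. The file names are hypothetical (only the 'Free*.txt' pattern comes from the code above), and the numeric sort is just a simple stand-in for illustration, not how natsortfiles works internally.

```matlab
% Hypothetical file names matching the 'Free*.txt' pattern used above
names = {'Free1.txt', 'Free2.txt', 'Free10.txt'};

% Plain character-by-character sorting puts 'Free10.txt' before 'Free2.txt'
lexOrder = sort(names);

% Simple alternative: read the integer after 'Free' and sort by that number
nums = cellfun(@(s) sscanf(s, 'Free%d.txt'), names);
[~, idx] = sort(nums);
natOrder = names(idx);

disp(lexOrder)
disp(natOrder)
```

Since the first file is used as the alignment reference, the file order decides which trace all the others are shifted towards.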

7 Comments

Slightly less robust IMHO, but it works too in your case: we can also simply use min or max (without peakseek) to locate either the dominant positive or negative peak.
The results are the same.
Here using min to pick the isolated negative peak:
fileDir = pwd; % current directory (or specify which one is the working directory)
S = dir(fullfile(fileDir,'Free*.txt')); % get list of data files in directory
S = natsortfiles(S); % sort file names into natural order, see:
% (https://fr.mathworks.com/matlabcentral/fileexchange/47434-natural-order-filename-sort)
% init
ysum = 0;
% plot
figure
hold on
for k = 1:numel(S)
    filename = S(k).name; % filenames are sorted
    out = readmatrix(fullfile(fileDir, filename));
    x = out(:,1);
    dx = mean(diff(x),'omitnan'); % sample spacing
    y = out(:,2);
    % take the (single) max positive or negative peak loc
    [~, locs] = min(y);
    alllocs(:,k) = locs;
    if k>1
        xshift = (alllocs(:,k) - alllocs(:,1)); % in samples, always an integer here
    else
        xshift = 0;
        xref = x; % keep this as the "good" x array for the final plot
    end
    legstr{k} = filename(1:length(filename)-4); % file name without extension
    plot(x-xshift*dx,y);
    % sum y data for the mean curve
    ysum = ysum + y;
end
ymean = ysum/k; % mean curve
legstr{k+1} = 'mean';
plot(xref,ymean,'k','linewidth',2.5);
xlim([min(xref) max(xref)]);
legend(legstr);
It is also potentially less accurate, because with only one isolated peak location the x shift correction is an integer number of samples.
With the first method I averaged the x shift over 3 peak locations, so the shift can be a multiple of 1/3 of a sample => accuracy (shift resolution) is up to 3 times better.
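To make the 1/3-sample remark concrete, here is a small worked example. The peak indices are made up for illustration and are not taken from the data files.

```matlab
% Hypothetical peak indices (in samples) for the reference trace and one other trace
refLocs = [100; 200; 300];
locs    = [101; 201; 300];   % two peaks are one sample late, one is unchanged

xshiftSingle = locs(1) - refLocs(1);   % single-peak estimate: always an integer
xshiftMean   = mean(locs - refLocs);   % three-peak average: a multiple of 1/3

fprintf('single peak: %g samples, mean of 3 peaks: %.4f samples\n', ...
    xshiftSingle, xshiftMean);
```

The averaged estimate has a finer step size; whether it is actually closer to the true shift depends on how noisy the individual peak locations are.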
[Edit: fix typo, not in code]
Nice answer from @Mathieu NOE. I always learn from his posts.
You suggested aligning the traces by shifting the corner of the initial upstroke, horizontally and vertically. The corner of the upstroke is not trivial to identify. @Mathieu NOE smartly recognized that the x-coordinate of first minimum or maximum seems to work well (for the data you have provided), and it is easy to find. Therefore I follow his example and use the first minimum as the reference point through which all traces must pass. I also adjust the traces vertically, which he does not do, I think.
@Mathieu NOE's mean calculation has an issue: it is the mean of the y-values without taking into account the horizontal shifts that have occurred. You can see in his plot that there is a problem with the mean, because the plotted mean (thick black line) is higher than all but 1 trace at t=0.002, and the plotted mean is lower than all the traces at t=0.0025. The real mean can't behave this way. It also shows a one-period roughly sinusoidal oscillation around this time, that is not present in the original data traces. I address this by making an array of shifted traces that is padded with NaNs on one end or the other of each trace, as needed.
My code is simpler than @Mathieu NOE's at the beginning, but also less general. I notice that the data files are all 501 elements long, and the x-vector is the same for all, and the last element of x and y is NaN, for every file. Therefore I exclude the last elements of x and y when loading the data.
The plot below shows that the mean is better behaved now, and the corners of the initial upstroke are now better aligned vertically.
Un-comment the line at the end, to zoom in on the first minimum in the plot. You will see that all traces go through the same minimum point at the same time.
S = dir('Free*.txt');
M = length(S); % number of files
yMin=zeros(1,M); locMin=zeros(1,M); shift=zeros(1,M);
x=[]; y0=[]; y=[];
for k = 1:M
    out = readmatrix(S(k).name);
    x0 = out(1:end-1,1); % drop the last element, which is NaN in every file
    y0(:,k) = out(1:end-1,2);
    [yMin(k),locMin(k)] = min(y0(:,k));
    shift(k) = locMin(k)-locMin(1); % shift relative to the first trace, in samples
end
N = length(x0); % number of samples in each recording
dx = mean(diff(x0),'omitnan');
shiftMin = min(shift);
shiftMax = max(shift);
% x=original x0 plus values on each end for the shifted traces
x = [x0(1)+[-shiftMax:-1]'*dx; x0; x0(end)+[1:-shiftMin]'*dx];
figure;
for k = 1:M
    % The next line does a horizontal shift by adding NaNs at the ends
    % and a vertical shift by subtracting yMin(k)-yMin(1).
    y = [y,[nan(shiftMax-shift(k),1); y0(:,k)-(yMin(k)-yMin(1)); nan(shift(k)-shiftMin,1)]];
    plot(x,y(:,k))
    hold on
end
yAvg = mean(y,2); % NaN wherever at least one shifted trace has no sample
plot(x,yAvg,'-k',LineWidth=2)
% xlim([.003,.004]); ylim([-10,-7]); % zoom into the 1st minimum
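The effect of the NaN padding can be checked on synthetic data. The sketch below uses two made-up sinusoids, the second delayed by a known number of samples, instead of the poster's files. It compares the mean taken without alignment to the mean taken after padding as in the code above.

```matlab
% Synthetic traces (hypothetical data): the second is the first delayed by 'lag' samples
N   = 200;
t   = (0:N-1)';
lag = 5;
y0  = [sin(2*pi*t/50), sin(2*pi*(t-lag)/50)];
shift = [0, lag];                  % shift of each trace relative to trace 1, in samples
shiftMin = min(shift);
shiftMax = max(shift);

naiveMean = mean(y0, 2);           % ignores the shifts, so the peaks are smeared

% Pad with NaNs so that corresponding samples end up in the same row
y = nan(N + shiftMax - shiftMin, size(y0,2));
for k = 1:size(y0,2)
    y(:,k) = [nan(shiftMax-shift(k),1); y0(:,k); nan(shift(k)-shiftMin,1)];
end
alignedMean = mean(y, 2);          % NaN wherever any trace has no sample
overlap = ~isnan(alignedMean);     % rows where every trace contributes

fprintf('peak of naive mean:   %.3f\n', max(abs(naiveMean)));
fprintf('peak of aligned mean: %.3f\n', max(abs(alignedMean(overlap))));
```

In this toy case the naive mean has a visibly lower peak than either trace, while the aligned mean matches the traces over the overlapping rows.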
good to hear from you...
Yes indeed, my bad, I didn't see the elephant in the room!! I really need some holidays....
I was focused on the plot and completely missed the fact that, to get the mean curve correctly, the y data must be truncated (remove the samples corresponding to the required x shift).
I solve that with interpolation (interp1), which has the benefit of working with whatever shift is needed (even a non-integer number of samples).
If xshift is zero, of course there is no need to interpolate y.
Regarding (optional) y data detrending, I don't know if @Luke wanted any special treatment, but you can add some detrend (order 0 or 1). I let the poster experiment with that and see what works best for him.
Here I tested with 1st order detrend, so the curves get more compact at the end and the final plot is quite nice... IMHO
BTW, I also show that the good old readclm function (attached) can be handy here too (I have no problem with readmatrix), as it also gives you access to the header data if you want to store some measurement parameters on the side. It could also be faster for large datasets (up to you to try it).
fileDir = pwd; % current directory (or specify which one is the working directory)
S = dir(fullfile(fileDir,'Free*.txt')); % get list of data files in directory
S = natsortfiles(S); % sort file names into natural order, see:
% (https://fr.mathworks.com/matlabcentral/fileexchange/47434-natural-order-filename-sort)
for k = 1:numel(S)
    filename = S(k).name; % filenames are sorted
    [out,head] = readclm(fullfile(fileDir, filename)); % alternative to readmatrix
    x = out(:,1);
    dx = mean(diff(x)); % sample spacing
    y = out(:,2);
    % take the first n major peaks
    n = 3;
    [locs, tmp] = peakseek(abs(y),5,3);
    alllocs(:,k) = locs(1:n)';
    if k>1
        xshift = mean(alllocs(:,k) - alllocs(:,1)); % in samples
        % remove the unneeded samples from the y data and pad with NaNs;
        % this can be done simply by linear interpolation (missing values will be replaced by NaNs)
        % the shift can be a non-integer number of samples
        xs = x+xshift*dx;
        if abs(xshift)>eps
            y = interp1(x,y,xs);
        end
    end
    legstr{k} = filename(1:length(filename)-4); % file name without extension
    % optional: detrend the data
    y = detrend(y,1);
    % store y data for the mean curve
    ystore(:,k) = y;
end
ymean = mean(ystore,2,'omitnan'); % mean curve
legstr{k+1} = 'mean';
% plot
figure
hold on
plot(x,ystore)
plot(x,ymean,'k','linewidth',2.5);
legend(legstr);
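The interp1 step can be tried in isolation on a tiny synthetic vector. This is only a sketch with made-up numbers; a straight line is used so that linear interpolation is exact and the result is easy to check by hand.

```matlab
% Synthetic data (hypothetical, not the poster's files)
x  = (0:9)';            % uniform sample positions
dx = mean(diff(x));     % sample spacing, here 1
y  = 2*x;               % a straight line, so linear interpolation is exact

xshift = 0.5;           % fractional shift, in samples
xs = x + xshift*dx;     % query points, built the same way as in the loop above
ys = interp1(x, y, xs); % default 'linear'; queries outside [x(1), x(end)] give NaN

disp([xs ys])

% 'omitnan' lets the mean ignore the NaN created at the end of the shifted trace
ymean = mean([y ys], 2, 'omitnan');
```

The last query point lies beyond the end of x, so interp1 returns NaN there, which is why the mean in the code above is taken with 'omitnan'.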
Thanks to both of you for these answers, they are very comprehensive and well explained! I am quite new to using MATLAB, so you have helped me a lot, both with overall knowledge and in this particular scenario.
I knew this was possible and did not know how to accomplish it. Now I do!
Thanks again!
@Luke, you're welcome.
@Mathieu NOE, very nice. Good idea about using detrend too - with that plus your alignment code, the traces really line up well.
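For anyone unsure what the detrend step does, here is a minimal sketch on a made-up trace (offset, linear drift, and an oscillation). The numbers are assumptions chosen for illustration only.

```matlab
% Hypothetical trace: constant offset + linear drift + oscillation
t = (0:99)';
y = 2 + 0.1*t + sin(2*pi*t/20);

d = detrend(y, 1);      % subtract the least-squares straight line (order 1)
p = polyfit(t, d, 1);   % slope and offset that remain after detrending

fprintf('residual slope %.2e, residual offset %.2e\n', p(1), p(2));
```

Order 0 would remove only the mean value, which is the milder choice if the traces have no visible drift.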
