Create 30-minute bins by reading timestamps

I have a file with timestamps. Because of errors in the files (gaps in time caused by faulty equipment), binning 30-minute sections with the following code produces errors in the resulting actograms.
function [Average] = Av_30min(y)
Average = zeros(288, size(y,2));
k = 1;
for l = 1:288
    Average(l,1) = mean(y(k:k+(round(length(y)/288)-1), 1));
    k = k + (round(length(y)/288)-1);
end
end
I want to read from, say, 11:30:00 to 12:00:00, average that bin, and so on. Can anyone help?
Thanks
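One possible approach (a sketch, not from any answer below): if the first column really holds Excel-style day fractions, the bins can be anchored to the clock times themselves rather than to row counts, so gaps in the recording cannot shift them. The file name and column layout here are assumptions.

```matlab
% Sketch: convert day fractions to datetime and average on the clock.
raw = readmatrix('Data.xlsx');                       % assumed layout: time, X, Y, Z, ...
t   = datetime(raw(:,1), 'ConvertFrom', 'excel');    % day fraction -> datetime
TT  = timetable(t, raw(:,2), raw(:,3), raw(:,4), ...
                'VariableNames', {'X','Y','Z'});
TT30 = retime(TT, 'regular', 'mean', 'TimeStep', minutes(30));
```

With this, the 11:30:00 bin always holds the samples whose timestamps fall in [11:30:00, 12:00:00), regardless of how many samples are missing.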

Accepted Answer

What do the ‘Calculated time’ values represent? Are they fractions of a day (from about ¼ to about 3¼ days) or something else?
Until that is resolved, converting them into something useful is not likely to be possible.
Here, I used sortrows and then fillmissing and resample to create some sort of order.
% imshow(imread('Timestamps in output.PNG'))
% function readfaultydata
% clear; close all
Data = xlsread('Data.xlsx',1,'A2:F60099');
T1 = readtable('Data.xlsx', 'VariableNamingRule','preserve');
t = Data(:,1);
dt = t(2)-t(1); % Sample interval
t = (1:length(t))*dt; % Linear timeline
x = Data(:,3);
y = Data(:,4);
z = Data(:,5);
subplot(311);plot(t,x);grid;
subplot(312);plot(t,y);grid;
subplot(313);plot(t,z);grid;
T1 = sortrows(T1,1);
T1 = fillmissing(T1, 'nearest') % Sorted With NaN Values Interpolated
T1 = 60103×9 table
    Calculated time      Device      X     Y     Z     LDR     Var7      Var8       Var9
    _______________    __________    __    __    __    ___    ______    _______    _______
        0.26653        {'BSB_6A'}     0    70    56     0     0.2611    0.85946    0.96134
        0.26664        {'BSB_6A'}     0    61    72     0     0.2611    0.85946    0.96134
        0.26676        {'BSB_6A'}     5    57    67     0     0.2611    0.85946    0.96134
        0.26687        {'BSB_6A'}     9    28    57     0     0.2611    0.85946    0.96134
        0.26699        {'BSB_6A'}    60     0    44     0     0.2611    0.85946    0.96134
        0.26711        {'BSB_6A'}    62    52    57     0     0.2611    0.85946    0.96134
        0.26722        {'BSB_6A'}    48    36    70     0     0.2611    0.85946    0.96134
        0.26734        {'BSB_6A'}    66    33    63     0     0.2611    0.85946    0.96134
        0.26745        {'BSB_6A'}    70    50    60     0     0.2611    0.85946    0.96134
        0.26757        {'BSB_6A'}    34    26    56     0     0.2611    0.85946    0.96134
        0.26769        {'BSB_6A'}    51    47    75     0     0.2611    0.85946    0.96134
        0.2678         {'BSB_6A'}    17    56    64     0     0.2611    0.85946    0.96134
        0.26792        {'BSB_6A'}    50    40    29     0     0.2611    0.85946    0.96134
        0.26803        {'BSB_6A'}    26    65    79     0     0.2611    0.85946    0.96134
        0.26815        {'BSB_6A'}    22    73    85     0     0.2611    0.85946    0.96134
        0.26826        {'BSB_6A'}    51    70    79     0     0.2611    0.85946    0.96134
VN = T1.Properties.VariableNames;
% timestats = [mean(diff(T1{:,1})) std(diff(T1{:,1}))]
Ts = mean(diff(T1{:,1}))
Ts = 4.9837e-05
Fs = 1/Ts
Fs = 2.0065e+04
Fn = Fs/2;
[XYZr, tr] = resample(T1{:,[3 4 5]}, T1{:,1}, Fs);
% format longg
% XYZr(end-4:end,:)
XYZr = XYZr(1:end-3,:);
tr = tr(1:end-3);
% XYZr(end-4:end,:)
x = XYZr(:,1);
y = XYZr(:,2);
z = XYZr(:,3);
figure
plot(tr, XYZr)
grid
xlabel([string(VN{1}) + " (Sorted & Resampled)"])
ylabel("Amplitude")
figure
tiledlayout(3,1)
for k = 1:size(XYZr,2)
nexttile
plot(tr, XYZr(:,k))
grid
ylabel("Amplitude")
title(["Column "+k])
end
xlabel([string(VN{1}) + " (Sorted & Resampled)"])
figure
scatter3(x,y,z, 10, z, '.')
colormap(turbo)
grid on
xlabel('X')
ylabel('Y')
zlabel('Z')

7 Comments

It's 10s in fractions of a day.
Thanks so much for all of this. It's starting to seem like this problem is ever more unsolvable because of the faulty data; hopefully our device gets fixed so the next experiments are simpler to analyse!
The device records movement every 10 seconds, but due to the error there are time gaps, resulting in this complexity. I have revised the data to include the date as well as the time; I'm not sure whether that makes it easier to analyse or not. See attached document.
Basically I want to get to a point where I have a 48×6 double, i.e. 6 days from 7pm to 7am, but because of the time jumps in the data my results have been time-shifted in everything I have tried so far.
I would also like to get an 8640×3 double for 1-minute bins: the same 6 days of data, but as 1-minute averages. Again, I get massive time drift when I do this because of the errors in the data.
Using the following function for this part:
function [Min_Average] = Av_1min(y)
k = 1;
Min_Average = zeros(8640, size(y,2));
for l = 1:8640
    Min_Average(l) = mean(y(k:k+(round(length(y)/8640)-1)));
    k = k + (round(length(y)/8640)-1);
end
end
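For the 1-minute bins, the same index arithmetic suffers from the time gaps. A hedged alternative (assuming the data are already in a timetable `TT` with datetime row times; the name is used here for illustration only) is to let retime pick each sample's bin from its own timestamp:

```matlab
% TT is a hypothetical timetable with datetime row times and numeric columns.
TT1min = retime(TT, 'regular', 'mean', 'TimeStep', minutes(1));
% Minutes with no samples come back as NaN rather than shifting later bins;
% they can be left as NaN or filled, e.g.:
TT1min = fillmissing(TT1min, 'constant', 0);
```

Because the bins come from the timestamps, missing stretches of recording produce empty bins instead of dragging later data forward.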
Hopefully this makes my aim much clearer. I have been trying to use datetime to work with the data, but my level of MATLAB is far too low at this moment in time.
@Bradley — So in this data set there are 3 days of data (from about ¼ to about 3¼ days) at 10s intervals? (I sorted and then resampled them, because otherwise the times are all over the place, and resampling to a uniform sampling frequency is necessary for any signal processing you may want to do. I can leave out the resampling step if necessary, that is if it corrupts the original time steps.)
I would like to be able to get the times into days-hours-minutes-seconds format. If I can create a timetable from these data, I can use retime to calculate the 30-minute averages. (The MATLAB date and time functions that were introduced a few years ago are tremendously powerful. It would be nice to be able to use them.)
In this data there are supposed to be 6 days of data, but because of the time jumps some hours are missing. Hence the difficulty. See attached data with dates included, from 15/11/2023 to 21/11/2023.
I just want an average value for each 30-minute timepoint on each day, i.e. a value for 1900-1930 for each subject (KO1, KO2, WT) on each day, read from the times. The code Matthieu showed works, but the time shifts: LDR should be 1 from 0700-1900 and 0 from 1900-0700. See the attached Excel sheet "30 min bins"; as seen there, time jumps appear after the first 24 hours.
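The light/dark expectation described in the comment above gives a simple sanity check for time shift (a sketch; `TT30` is a hypothetical timetable of 30-minute bins with an `LDR` column):

```matlab
tod = timeofday(TT30.Properties.RowTimes);          % duration since midnight
expectedLDR = tod >= hours(7) & tod < hours(19);    % lights on 0700-1900
nMismatch = nnz(round(TT30.LDR) ~= expectedLDR);    % bins where light state disagrees
```

If `nMismatch` grows over the record, the bins are drifting relative to real clock time.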
That sounds incredibly useful and powerful, but I fear it may not work due to the data errors!
Using timetable and retime to do the 30-minute averages —
files = dir('*.xlsx');
for k = 1:numel(files)
filename{k} = files(k).name;
Data{k,:} = readtable(filename{k}, 'VariableNamingRule','preserve');
end
filename{:}
ans = '30min bins.xlsx'
ans = 'DataRevised.xlsx'
Data{1}
ans = 332×9 table
            Var1            Var2      Var3         Var4        Var5      Var6     Var7       Var8      Var9
    ____________________    ____    ________    _________    ________    ____    _______    _______    ____
    01-Jan-1900 07:30:02    NaN       4.5746     0.022099     0.14365     1      0.32697    0.33529    NaN
    01-Jan-1900 08:00:02    NaN       5.4972       12.884      0.9116     1      0.34578    0.35404    NaN
    01-Jan-1900 08:30:02    NaN      0.22099            0       13.928    1          NaN        NaN    NaN
    01-Jan-1900 09:00:02    NaN      0.32044            0     0.044199    1          NaN        NaN    NaN
    01-Jan-1900 09:30:02    NaN      0.25414       4.5967            0    1          NaN        NaN    NaN
    01-Jan-1900 10:00:02    NaN       0.1547      0.01105           0    1          NaN        NaN    NaN
    01-Jan-1900 10:30:02    NaN       10.105      0.01105     0.033149    1          NaN        NaN    NaN
    01-Jan-1900 11:00:02    NaN       5.8398       0.1989      0.40884    1          NaN        NaN    NaN
    01-Jan-1900 11:30:02    NaN      0.14917       45.188       17.475    1          NaN        NaN    NaN
    01-Jan-1900 12:00:02    NaN     0.033149     0.071823       0.1768    1          NaN        NaN    NaN
    01-Jan-1900 12:30:02    NaN      0.47514     0.016575     0.038674    1          NaN        NaN    NaN
    01-Jan-1900 13:00:02    NaN     0.093923     0.033149       0.1547    1          NaN        NaN    NaN
    01-Jan-1900 13:30:02    NaN       3.1713    0.0055249       5.8674    1          NaN        NaN    NaN
    01-Jan-1900 14:00:02    NaN       3.9006        2.884       7.8674    1          NaN        NaN    NaN
    01-Jan-1900 14:30:02    NaN     0.082873        6.337            0    1          NaN        NaN    NaN
    01-Jan-1900 15:00:02    NaN      0.21547     0.044199            0    1          NaN        NaN    NaN
Data{2} % Analyse 'Data{2}'
ans = 50317×6 table
      Calculated time        Device      KO1    KO2    WT    LDR
    ____________________    __________    ___    ___    __    ___
    15-Nov-2023 19:00:58    {'BSB_6A'}     0     12     0      0
    15-Nov-2023 19:01:08    {'BSB_6A'}     0     53     0      0
    15-Nov-2023 19:01:18    {'BSB_6A'}     0     78     0      0
    15-Nov-2023 19:01:28    {'BSB_6A'}     0     83     0      0
    15-Nov-2023 19:01:38    {'BSB_6A'}     0     45     0      0
    15-Nov-2023 19:01:47    {'BSB_6A'}     0     70     0      0
    15-Nov-2023 19:01:57    {'BSB_6A'}     0     70     0      0
    15-Nov-2023 19:02:07    {'BSB_6A'}     7     59     0      0
    15-Nov-2023 19:02:17    {'BSB_6A'}    10     53     0      0
    15-Nov-2023 19:02:27    {'BSB_6A'}     0     55     0      0
    15-Nov-2023 19:02:37    {'BSB_6A'}     0     81     0      0
    15-Nov-2023 19:02:47    {'BSB_6A'}     0     33     0      0
    15-Nov-2023 19:02:57    {'BSB_6A'}     0     39     6      0
    15-Nov-2023 19:03:07    {'BSB_6A'}     0     60    66      0
    15-Nov-2023 19:03:17    {'BSB_6A'}     0     60    83      0
    15-Nov-2023 19:03:27    {'BSB_6A'}     0     59    82      0
DataRevisedTT = table2timetable(Data{2}(:,[1 3 4 5 6])) % Need To Leave Out 'Device' For This to Work
DataRevisedTT = 50317×4 timetable
      Calculated time       KO1    KO2    WT    LDR
    ____________________    ___    ___    __    ___
    15-Nov-2023 19:00:58     0     12     0      0
    15-Nov-2023 19:01:08     0     53     0      0
    15-Nov-2023 19:01:18     0     78     0      0
    15-Nov-2023 19:01:28     0     83     0      0
    15-Nov-2023 19:01:38     0     45     0      0
    15-Nov-2023 19:01:47     0     70     0      0
    15-Nov-2023 19:01:57     0     70     0      0
    15-Nov-2023 19:02:07     7     59     0      0
    15-Nov-2023 19:02:17    10     53     0      0
    15-Nov-2023 19:02:27     0     55     0      0
    15-Nov-2023 19:02:37     0     81     0      0
    15-Nov-2023 19:02:47     0     33     0      0
    15-Nov-2023 19:02:57     0     39     6      0
    15-Nov-2023 19:03:07     0     60    66      0
    15-Nov-2023 19:03:17     0     60    83      0
    15-Nov-2023 19:03:27     0     59    82      0
% DataRevisedTT.('Calculated time')
DataRevisedTT30 = retime(DataRevisedTT, 'regular', 'mean', 'TimeStep',minutes(30)) % Use 'retime' To Calculate 30-Minute 'mean' Values
DataRevisedTT30 = 289×4 timetable
      Calculated time        KO1        KO2         WT      LDR
    ____________________    ______    ________    ______    ___
    15-Nov-2023 19:00:00    16.046      57.131    61.857     0
    15-Nov-2023 19:30:00    39.322      60.456    60.317     0
    15-Nov-2023 20:00:00    39.067      60.972    60.206     0
    15-Nov-2023 20:30:00     26.85      61.622     65.95     0
    15-Nov-2023 21:00:00      23.8      62.456    73.011     0
    15-Nov-2023 21:30:00    20.111      56.239    52.256     0
    15-Nov-2023 22:00:00    41.272      53.517    65.856     0
    15-Nov-2023 22:30:00      23.8      55.422    66.206     0
    15-Nov-2023 23:00:00    27.711      48.967    66.189     0
    15-Nov-2023 23:30:00    47.567      44.594    57.467     0
    16-Nov-2023 00:00:00    30.083       43.65    60.528     0
    16-Nov-2023 00:30:00    36.522      44.494    49.117     0
    16-Nov-2023 01:00:00    35.667      39.033    61.156     0
    16-Nov-2023 01:30:00    20.161      25.444    46.489     0
    16-Nov-2023 02:00:00    7.9111    0.083333    41.811     0
    16-Nov-2023 02:30:00    27.933    0.083333    4.1222     0
VN = DataRevisedTT30.Properties.VariableNames; % Variable Names
figure
stairs(DataRevisedTT30.('Calculated time'), DataRevisedTT30{:,1:end})
grid
xlabel('Calculated time')
ylabel('Dependent Variable Value')
legend(VN{:}, 'Location','best')
title(["Entire ‘"+string(filename{2})+" Record: 30-Minute Averages"])
[y,m,d] = ymd(DataRevisedTT30.('Calculated time'));
[G,ID] = findgroups(d);
Days = accumarray(G, (1:numel(G)).', [], @(x){DataRevisedTT30(x,:)}) % 'timetable' Arrays For Each Day
Days = 7×1 cell array
    {10×4 timetable}
    {48×4 timetable}
    {48×4 timetable}
    {48×4 timetable}
    {48×4 timetable}
    {48×4 timetable}
    {39×4 timetable}
figure % Plot Daily 'timetable' Arrays
tiledlayout(4,2)
for k = 1:numel(Days)
nexttile
stairs(Days{k}.('Calculated time'), Days{k}{:,1:end})
grid
end
hl = legend(VN{:}, 'NumColumns',2);
nexttile;
Ax = gca;
Ax.Visible = 0;
pos = Ax.Position;
hl.Position = pos;
title(hl, "Common Legend")
sgtitle('Daily 30-Minute Averages')
Having accurate datetime values in the ‘DataRevised.xlsx’ file made all the difference.
Plotting the individual days makes the data a bit easier to visualise.
Thank you very much! This works perfectly!
My pleasure!
If my Answer helped you solve your problem, please Accept it!


More Answers (2)

Alexander on 13 Dec 2023
Where is this file Data.xlsx from? I think someone has corrupted the "Calculated time" column: it contains a mixture of formulas and manual input (see cells A1793:A1795, e.g.). If you copy A3 and paste it down to A60099, your timeline is correct and your program should run (I'll attach a modified xlsx). Hopefully this was your intention.

2 Comments

Hi Alexander,
Thanks for the suggestion, but that is exactly the issue: if the data ran sequentially, my analytical code would have worked fine. Alas, here we are trying to find another solution to at least make some use of this data.
Here some code, w/o editing the xlsx:
function readfaultydata
clear; close all
Data = xlsread('Data.xlsx',1,'A2:F60099');
t = Data(:,1);
dt = t(2)-t(1); % Sample interval
t = (1:length(t))*dt; % Linear timeline
x = Data(:,3);
y = Data(:,4);
z = Data(:,5);
subplot(311); plot(t,x); grid;
subplot(312); plot(t,y); grid;
subplot(313); plot(t,z); grid;
end
Hope it helps.


hello
this is certainly not the best or most modern way to solve your problem, but as I have not really started digging into timetables and the like, here's a (very) low-level approach
someone younger / smarter may come up with a two-line solution, but for the time being this is what I can offer
in very short: I simply look for where minute 30 or 00 appears in your data; when the segment length is exactly 180 samples (= 30 min at 10 s per sample), that segment is averaged and saved
the new time values correspond to the beginning of each 30 min buffer
also, I didn't copy-paste the header line; I think you can manage that yourself...
result should look like :
code
y = xlsread('Data.xlsx');
%% convert first column of y (time HH:MM:SS) to hours (h), minutes (m) and seconds (s)
h = y(:,1)*24;
m = 60*(h - floor(h));
s = 60*(m - floor(m));
h = floor(h);
m = floor(m);
s = round(s);
id1 = (m == 30); % find time indices matching the 30 min timestamp
[begin1,ends1] = find_start_end_group(id1);
id2 = (m == 00); % find time indices matching the 00 min timestamp
[begin2,ends2] = find_start_end_group(id2);
begin = sort([begin1;begin2]); % concatenate all indices corresponding to 00 or 30 min timestamps
%% main loop
k = 0;
for ci = 1:numel(begin)-1
    ind_start = begin(ci);
    ind_stop = begin(ci+1);
    samples = ind_stop - ind_start;
    if samples == 180 % only valid segments are processed
        k = k+1;
        data = y(ind_start:ind_stop,2:end);
        mean_data{k,1} = mean(data,'omitnan');
        date{k,1} = [sprintf('%02d', h(ind_start)) ':' sprintf('%02d', m(ind_start)) ':' sprintf('%02d', s(ind_start))]; % a very crude way to build the time string
    end
end
% export the data
out = [date mean_data];
writecell(out,'toto.xlsx');
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function [begin,ends] = find_start_end_group(ind)
% This locates the beginning / ending points of data groups
% Important: ind must be a LOGICAL array
D = diff([0;ind(:);0]);
begin = find(D == 1);
ends = find(D == -1) - 1;
end
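The diff-based trick inside find_start_end_group can be verified on a tiny logical vector (a standalone example, not from the original post):

```matlab
% Runs of true values: padding with zeros makes diff emit +1 at each run
% start and -1 one past each run end.
ind = logical([0 0 1 1 1 0 1 1 0].');
D = diff([0; ind; 0]);
begin = find(D == 1);        % -> [3; 7]
ends  = find(D == -1) - 1;   % -> [5; 8]
```

So the runs of ones span indices 3:5 and 7:8, which is exactly what the helper returns.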

4 Comments

Hi Matthieu,
I have tried this and it works, but the time column of the output contains times of up to 36:00:00, which obviously isn't a real time; see the attached image.
This issue appears as soon as h is computed?
Thanks a lot for your help!
hello Bradley
I don't have this problem on my side; I suppose it's maybe due to how my (French) Excel reads your file
as I use xlsread, the date comes out as a fraction of 24 hours (the first date is 0.4701 => 11.2824 hours, or 11:16:58 as in your Excel file)
what does the first column of y contain when you read your data?
y = xlsread('Data.xlsx');
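The conversion described above can be reproduced directly (a small check using the 0.4701 value quoted in this thread):

```matlab
frac = 0.4701;             % first timestamp, as a fraction of a day
hoursVal = frac*24;        % 11.2824 hours, as quoted above
d = days(frac);            % same value as a duration
d.Format = 'hh:mm:ss';     % displays as roughly 11:16:57 (Excel shows 11:16:58; rounding)
```

So a value greater than 1.5 in that column would explain a displayed time like 36:00:00: it is a day fraction past the first midnight, not a clock time.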
Thanks Matthieu. If you see above, Star Strider has answered this in a more succinct way.
I knew I would not win this time (again!)


Release

R2023b
