A very fast way to sort datetime (in "ascend" mode)?

12 visualizzazioni (ultimi 30 giorni)
A very fast way to sort datetime (in "ascend" mode)?
Here an example:
a = datetime([
'2022-10-27 00:22:50.000'
'2022-10-27 05:29:45.000'
'2022-10-27 05:32:19.400'
'2022-10-27 05:36:44.100'
'2022-10-27 05:39:26.600'
'2022-10-27 05:43:18.200'
'2022-10-27 05:49:17.400'
'2022-10-27 05:55:27.300'
'2022-10-27 05:58:07.000'
'2022-10-27 06:17:13.800'
'2022-10-27 06:41:28.700'
'2022-10-27 07:03:06.000'
'2022-10-27 07:09:49.800'
'2022-10-27 07:17:39.700'
'2022-10-27 07:35:09.000'
'2022-10-27 07:42:33.600'
'2022-10-27 07:46:50.500'
'2022-10-27 08:07:02.700'
'2022-10-27 08:26:29.600'
'2022-10-27 08:45:03.500'
'2022-10-27 08:48:09.700'
'2022-10-27 08:53:57.000'
'2022-10-27 08:59:43.400'
'2022-10-27 09:13:24.100'
'2022-10-27 09:19:25.000'
'2022-10-27 09:26:35.000'
'2022-10-27 09:29:54.000'
'2022-10-27 09:46:45.700'
'2022-10-27 10:10:59.900'
'2022-10-27 10:29:04.600'
'2022-10-27 10:31:50.100'
'2022-10-27 10:37:45.300'
'2022-10-27 10:42:35.800'
'2022-10-27 10:58:42.300'
'2022-10-27 11:03:48.900'
'2022-10-27 11:10:44.700'
'2022-10-27 11:13:11.100'
'2022-10-27 11:31:25.100'
'2022-10-27 11:56:01.200'
'2022-10-27 12:19:25.300'
'2022-10-27 12:25:12.500'
'2022-10-27 12:30:16.900'
'2022-10-27 12:49:19.100'
'2022-10-27 12:55:42.700'
'2022-10-27 12:58:18.100'
'2022-10-27 13:16:06.300'
'2022-10-27 14:00:44.700'
'2022-10-27 14:04:14.200'
'2022-10-27 14:11:54.600'
'2022-10-27 14:17:10.000'
'2022-10-27 14:42:03.300'
'2022-10-27 14:45:22.100'
'2022-10-27 15:02:07.900'
'2022-10-27 15:25:39.600'
'2022-10-27 15:44:58.300'
'2022-10-27 15:48:35.800'
'2022-10-27 15:54:23.100'
'2022-10-27 16:00:17.300'
'2022-10-27 16:19:23.700'
'2022-10-27 16:27:30.800'
'2022-10-27 16:31:30.600'
'2022-10-27 16:52:09.700'
'2022-10-27 17:16:08.800'
'2022-10-27 18:01:39.400'
'2022-10-27 18:08:24.800'
'2022-10-27 18:17:44.500'
'2022-10-27 18:25:02.500'
'2022-10-27 18:27:45.700'
'2022-10-27 18:48:32.000'
'2022-10-27 19:02:01.700'
'2022-10-27 19:24:08.300'
'2022-10-27 19:30:01.400'
'2022-10-27 19:43:02.200'
'2022-10-27 19:48:50.300'
'2022-10-27 19:55:41.900'
'2022-10-27 19:58:23.700'
'2022-10-27 20:17:15.800'
'2022-10-27 20:28:47.600'
'2022-10-27 20:49:36.900'
'2022-10-27 20:53:02.700'
'2022-10-27 21:07:12.100'
'2022-10-27 21:09:47.800'
'2022-10-27 21:50:42.300'
'2022-10-27 22:07:34.100'
'2022-10-27 22:09:18.800'])
a = 85×1 datetime array
27-Oct-2022 00:22:50 27-Oct-2022 05:29:45 27-Oct-2022 05:32:19 27-Oct-2022 05:36:44 27-Oct-2022 05:39:26 27-Oct-2022 05:43:18 27-Oct-2022 05:49:17 27-Oct-2022 05:55:27 27-Oct-2022 05:58:07 27-Oct-2022 06:17:13 27-Oct-2022 06:41:28 27-Oct-2022 07:03:06 27-Oct-2022 07:09:49 27-Oct-2022 07:17:39 27-Oct-2022 07:35:09 27-Oct-2022 07:42:33 27-Oct-2022 07:46:50 27-Oct-2022 08:07:02 27-Oct-2022 08:26:29 27-Oct-2022 08:45:03 27-Oct-2022 08:48:09 27-Oct-2022 08:53:57 27-Oct-2022 08:59:43 27-Oct-2022 09:13:24 27-Oct-2022 09:19:25 27-Oct-2022 09:26:35 27-Oct-2022 09:29:54 27-Oct-2022 09:46:45 27-Oct-2022 10:10:59 27-Oct-2022 10:29:04
% Any way faster than this one ?
tic
[~, idx2] = sort( datenum(a), 1, 'ascend');
toc
Elapsed time is 0.008225 seconds.

Risposta accettata

Stephen23
Stephen23 il 18 Nov 2022
"Any way faster than this one"
Interestingly, sorting seems to be slightly faster with the superfluous call to DATENUM:
a = datetime(['2022-10-27 00:22:50.000';'2022-10-27 05:29:45.000';'2022-10-27 05:32:19.400';'2022-10-27 05:36:44.100';'2022-10-27 05:39:26.600';'2022-10-27 05:43:18.200';'2022-10-27 05:49:17.400';'2022-10-27 05:55:27.300';'2022-10-27 05:58:07.000';'2022-10-27 06:17:13.800';'2022-10-27 06:41:28.700';'2022-10-27 07:03:06.000';'2022-10-27 07:09:49.800';'2022-10-27 07:17:39.700';'2022-10-27 07:35:09.000';'2022-10-27 07:42:33.600';'2022-10-27 07:46:50.500';'2022-10-27 08:07:02.700';'2022-10-27 08:26:29.600';'2022-10-27 08:45:03.500';'2022-10-27 08:48:09.700';'2022-10-27 08:53:57.000';'2022-10-27 08:59:43.400';'2022-10-27 09:13:24.100';'2022-10-27 09:19:25.000';'2022-10-27 09:26:35.000';'2022-10-27 09:29:54.000';'2022-10-27 09:46:45.700';'2022-10-27 10:10:59.900';'2022-10-27 10:29:04.600';'2022-10-27 10:31:50.100';'2022-10-27 10:37:45.300';'2022-10-27 10:42:35.800';'2022-10-27 10:58:42.300';'2022-10-27 11:03:48.900';'2022-10-27 11:10:44.700';'2022-10-27 11:13:11.100';'2022-10-27 11:31:25.100';'2022-10-27 11:56:01.200';'2022-10-27 12:19:25.300';'2022-10-27 12:25:12.500';'2022-10-27 12:30:16.900';'2022-10-27 12:49:19.100';'2022-10-27 12:55:42.700';'2022-10-27 12:58:18.100';'2022-10-27 13:16:06.300';'2022-10-27 14:00:44.700';'2022-10-27 14:04:14.200';'2022-10-27 14:11:54.600';'2022-10-27 14:17:10.000';'2022-10-27 14:42:03.300';'2022-10-27 14:45:22.100';'2022-10-27 15:02:07.900';'2022-10-27 15:25:39.600';'2022-10-27 15:44:58.300';'2022-10-27 15:48:35.800';'2022-10-27 15:54:23.100';'2022-10-27 16:00:17.300';'2022-10-27 16:19:23.700';'2022-10-27 16:27:30.800';'2022-10-27 16:31:30.600';'2022-10-27 16:52:09.700';'2022-10-27 17:16:08.800';'2022-10-27 18:01:39.400';'2022-10-27 18:08:24.800';'2022-10-27 18:17:44.500';'2022-10-27 18:25:02.500';'2022-10-27 18:27:45.700';'2022-10-27 18:48:32.000';'2022-10-27 19:02:01.700';'2022-10-27 19:24:08.300';'2022-10-27 19:30:01.400';'2022-10-27 19:43:02.200';'2022-10-27 19:48:50.300';'2022-10-27 19:55:41.900';'2022-10-27 19:58:23.700';'2022-10-27 20:17:15.800';'2022-10-27 20:28:47.600';'2022-10-27 20:49:36.900';'2022-10-27 20:53:02.700';'2022-10-27 21:07:12.100';'2022-10-27 21:09:47.800';'2022-10-27 21:50:42.300';'2022-10-27 22:07:34.100';'2022-10-27 22:09:18.800'])
a = 85×1 datetime array
27-Oct-2022 00:22:50 27-Oct-2022 05:29:45 27-Oct-2022 05:32:19 27-Oct-2022 05:36:44 27-Oct-2022 05:39:26 27-Oct-2022 05:43:18 27-Oct-2022 05:49:17 27-Oct-2022 05:55:27 27-Oct-2022 05:58:07 27-Oct-2022 06:17:13 27-Oct-2022 06:41:28 27-Oct-2022 07:03:06 27-Oct-2022 07:09:49 27-Oct-2022 07:17:39 27-Oct-2022 07:35:09 27-Oct-2022 07:42:33 27-Oct-2022 07:46:50 27-Oct-2022 08:07:02 27-Oct-2022 08:26:29 27-Oct-2022 08:45:03 27-Oct-2022 08:48:09 27-Oct-2022 08:53:57 27-Oct-2022 08:59:43 27-Oct-2022 09:13:24 27-Oct-2022 09:19:25 27-Oct-2022 09:26:35 27-Oct-2022 09:29:54 27-Oct-2022 09:46:45 27-Oct-2022 10:10:59 27-Oct-2022 10:29:04
timeit(@()sort(a))
ans = 1.3426e-05
timeit(@()sort(datenum(a)))
ans = 4.8560e-06
Lets try some other numeric formats, which might be more accurate over the dates in question:
timeit(@()sort(juliandate(a)))
ans = 6.9260e-06
timeit(@()sort(exceltime(a)))
ans = 4.4260e-06
timeit(@()sort(posixtime(a)))
ans = 3.2285e-06
  3 Commenti
Bruno Luong
Bruno Luong il 18 Nov 2022
Modificato: Bruno Luong il 18 Nov 2022
idx2 should be the same, but I think timeit() results without second output could not be representative as the algorithm might skip some branching.
The relative result also might change with the size of your data.
Sim
Sim il 18 Nov 2022
Modificato: Sim il 18 Nov 2022
Thanks a lot @Stephen23 and @Bruno Luong !!!
Just for curiosity, I did a test with 1000 iterations on my machine. Except an initial spike in the calculation time (I do not know why), it looks like that
sort(posixtime(a))
Unrecognized function or variable 'a'.
is the fastest solution to sort datetime arrays.
Please see here below the test:
for i = 1 : 1000
a = datetime([
'2022-10-27 00:22:50.000'
'2022-10-27 05:29:45.000'
'2022-10-27 05:32:19.400'
'2022-10-27 05:36:44.100'
'2022-10-27 05:39:26.600'
'2022-10-27 05:43:18.200'
'2022-10-27 05:49:17.400'
'2022-10-27 05:55:27.300'
'2022-10-27 05:58:07.000'
'2022-10-27 06:17:13.800'
'2022-10-27 06:41:28.700'
'2022-10-27 07:03:06.000'
'2022-10-27 07:09:49.800'
'2022-10-27 07:17:39.700'
'2022-10-27 07:35:09.000'
'2022-10-27 07:42:33.600'
'2022-10-27 07:46:50.500'
'2022-10-27 08:07:02.700'
'2022-10-27 08:26:29.600'
'2022-10-27 08:45:03.500'
'2022-10-27 08:48:09.700'
'2022-10-27 08:53:57.000'
'2022-10-27 08:59:43.400'
'2022-10-27 09:13:24.100'
'2022-10-27 09:19:25.000'
'2022-10-27 09:26:35.000'
'2022-10-27 09:29:54.000'
'2022-10-27 09:46:45.700'
'2022-10-27 10:10:59.900'
'2022-10-27 10:29:04.600'
'2022-10-27 10:31:50.100'
'2022-10-27 10:37:45.300'
'2022-10-27 10:42:35.800'
'2022-10-27 10:58:42.300'
'2022-10-27 11:03:48.900'
'2022-10-27 11:10:44.700'
'2022-10-27 11:13:11.100'
'2022-10-27 11:31:25.100'
'2022-10-27 11:56:01.200'
'2022-10-27 12:19:25.300'
'2022-10-27 12:25:12.500'
'2022-10-27 12:30:16.900'
'2022-10-27 12:49:19.100'
'2022-10-27 12:55:42.700'
'2022-10-27 12:58:18.100'
'2022-10-27 13:16:06.300'
'2022-10-27 14:00:44.700'
'2022-10-27 14:04:14.200'
'2022-10-27 14:11:54.600'
'2022-10-27 14:17:10.000'
'2022-10-27 14:42:03.300'
'2022-10-27 14:45:22.100'
'2022-10-27 15:02:07.900'
'2022-10-27 15:25:39.600'
'2022-10-27 15:44:58.300'
'2022-10-27 15:48:35.800'
'2022-10-27 15:54:23.100'
'2022-10-27 16:00:17.300'
'2022-10-27 16:19:23.700'
'2022-10-27 16:27:30.800'
'2022-10-27 16:31:30.600'
'2022-10-27 16:52:09.700'
'2022-10-27 17:16:08.800'
'2022-10-27 18:01:39.400'
'2022-10-27 18:08:24.800'
'2022-10-27 18:17:44.500'
'2022-10-27 18:25:02.500'
'2022-10-27 18:27:45.700'
'2022-10-27 18:48:32.000'
'2022-10-27 19:02:01.700'
'2022-10-27 19:24:08.300'
'2022-10-27 19:30:01.400'
'2022-10-27 19:43:02.200'
'2022-10-27 19:48:50.300'
'2022-10-27 19:55:41.900'
'2022-10-27 19:58:23.700'
'2022-10-27 20:17:15.800'
'2022-10-27 20:28:47.600'
'2022-10-27 20:49:36.900'
'2022-10-27 20:53:02.700'
'2022-10-27 21:07:12.100'
'2022-10-27 21:09:47.800'
'2022-10-27 21:50:42.300'
'2022-10-27 22:07:34.100'
'2022-10-27 22:09:18.800']);
t0 = tic;
[~, idx2] = sort( a, 1, 'ascend');
t0_(i) = toc(t0);
t1 = tic;
[~, idx2] = sort( juliandate(a), 1, 'ascend');
t1_(i) = toc(t1);
t2 = tic;
[~, idx2] = sort( datenum(a), 1, 'ascend');
t2_(i) = toc(t2);
t3 = tic;
[~, idx2] = sort( exceltime(a), 1, 'ascend');
t3_(i) = toc(t3);
t4 = tic;
[~, idx2] = sort( posixtime(a), 1, 'ascend');
t4_(i) = toc(t4);
end
hold on
plot(t0_,'linewidth',2,'DisplayName','simple')
plot(t1_,'linewidth',2,'DisplayName','juliandate')
plot(t2_,'linewidth',2,'DisplayName','datenum')
plot(t3_,'linewidth',2,'DisplayName','exceltime')
plot(t4_,'linewidth',2,'DisplayName','posixtime')
hold off
legend
and a zoom of the same figure:

Accedi per commentare.

Più risposte (2)

Peter Perkins
Peter Perkins il 19 Nov 2022
It seems like the obvious response to
Any way faster than this one ?
[~, idx2] = sort(datenum(a), 1, 'ascend');
is to not pass in parameters that are the defaults. That cuts the time by not quite a factor of two for me. But maybe in your real code you need to pass in non-defaults.
Three things:
1) On my machine, the difference between these two small sorts
[~, idx2] = sort(a, 1, 'ascend');
[~, idx2] = sort(datenum(a), 1, 'ascend');
is literally a few microseconds. The usual response is, "Sure, but I'm doing this in a loop, so a few microseconds adds up". It would help to hear the details of that context.
2) Sure, for small sorts like your example there is a difference. But by the time you have 10000 elements
a = datetime(2022,1,1,0,0,randi(10000,10000,1));
the difference between the two is gone. And the total time is less than .5ms.
3) ALL of those conversions through away precision. In your example, you have resolution of at most .1s, so precision is probably not an issue. But in general, converting datetime to those numeric formats is A BAD IDEA.
  2 Commenti
Bruno Luong
Bruno Luong il 22 Nov 2022
"But in general, converting datetime to those numeric formats is A BAD IDEA."
That's true.
For decades a resolution of 53-bit datenum is the only option, whereas in C 128-bit resolution is used for time enumeration on other languages.
Sim
Sim il 30 Nov 2022
Thank you very much @Peter Perkins and @Bruno Luong for your interesting comments!!
(and very sorry for this long delay in replying..)

Accedi per commentare.


Bruno Luong
Bruno Luong il 18 Nov 2022
Modificato: Bruno Luong il 18 Nov 2022
The first peak is due to JIT that is not kicks in.
Here is the result on my PC (code attached)
I have to chase what create the spikes every 33 iterations.
  1 Commento
Sim
Sim il 18 Nov 2022
oh, cool result, it is way much smoother than mine! Many thanks for the code as well! Very interesting :-) (I have used a MacBook Pro with 2.6 GHz 6-Core Intel Core i7 and 16 GB 2400 MHz DDR4)

Accedi per commentare.

Categorie

Scopri di più su Dates and Time in Help Center e File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by