A very fast way to sort datetime (in "ascend" mode)?
12 visualizzazioni (ultimi 30 giorni)
Mostra commenti meno recenti
A very fast way to sort datetime (in "ascend" mode)?
Here an example:
a = datetime([
'2022-10-27 00:22:50.000'
'2022-10-27 05:29:45.000'
'2022-10-27 05:32:19.400'
'2022-10-27 05:36:44.100'
'2022-10-27 05:39:26.600'
'2022-10-27 05:43:18.200'
'2022-10-27 05:49:17.400'
'2022-10-27 05:55:27.300'
'2022-10-27 05:58:07.000'
'2022-10-27 06:17:13.800'
'2022-10-27 06:41:28.700'
'2022-10-27 07:03:06.000'
'2022-10-27 07:09:49.800'
'2022-10-27 07:17:39.700'
'2022-10-27 07:35:09.000'
'2022-10-27 07:42:33.600'
'2022-10-27 07:46:50.500'
'2022-10-27 08:07:02.700'
'2022-10-27 08:26:29.600'
'2022-10-27 08:45:03.500'
'2022-10-27 08:48:09.700'
'2022-10-27 08:53:57.000'
'2022-10-27 08:59:43.400'
'2022-10-27 09:13:24.100'
'2022-10-27 09:19:25.000'
'2022-10-27 09:26:35.000'
'2022-10-27 09:29:54.000'
'2022-10-27 09:46:45.700'
'2022-10-27 10:10:59.900'
'2022-10-27 10:29:04.600'
'2022-10-27 10:31:50.100'
'2022-10-27 10:37:45.300'
'2022-10-27 10:42:35.800'
'2022-10-27 10:58:42.300'
'2022-10-27 11:03:48.900'
'2022-10-27 11:10:44.700'
'2022-10-27 11:13:11.100'
'2022-10-27 11:31:25.100'
'2022-10-27 11:56:01.200'
'2022-10-27 12:19:25.300'
'2022-10-27 12:25:12.500'
'2022-10-27 12:30:16.900'
'2022-10-27 12:49:19.100'
'2022-10-27 12:55:42.700'
'2022-10-27 12:58:18.100'
'2022-10-27 13:16:06.300'
'2022-10-27 14:00:44.700'
'2022-10-27 14:04:14.200'
'2022-10-27 14:11:54.600'
'2022-10-27 14:17:10.000'
'2022-10-27 14:42:03.300'
'2022-10-27 14:45:22.100'
'2022-10-27 15:02:07.900'
'2022-10-27 15:25:39.600'
'2022-10-27 15:44:58.300'
'2022-10-27 15:48:35.800'
'2022-10-27 15:54:23.100'
'2022-10-27 16:00:17.300'
'2022-10-27 16:19:23.700'
'2022-10-27 16:27:30.800'
'2022-10-27 16:31:30.600'
'2022-10-27 16:52:09.700'
'2022-10-27 17:16:08.800'
'2022-10-27 18:01:39.400'
'2022-10-27 18:08:24.800'
'2022-10-27 18:17:44.500'
'2022-10-27 18:25:02.500'
'2022-10-27 18:27:45.700'
'2022-10-27 18:48:32.000'
'2022-10-27 19:02:01.700'
'2022-10-27 19:24:08.300'
'2022-10-27 19:30:01.400'
'2022-10-27 19:43:02.200'
'2022-10-27 19:48:50.300'
'2022-10-27 19:55:41.900'
'2022-10-27 19:58:23.700'
'2022-10-27 20:17:15.800'
'2022-10-27 20:28:47.600'
'2022-10-27 20:49:36.900'
'2022-10-27 20:53:02.700'
'2022-10-27 21:07:12.100'
'2022-10-27 21:09:47.800'
'2022-10-27 21:50:42.300'
'2022-10-27 22:07:34.100'
'2022-10-27 22:09:18.800'])
% Any way faster than this one ?
tic
[~, idx2] = sort( datenum(a), 1, 'ascend');
toc
0 Commenti
Risposta accettata
Stephen23
il 18 Nov 2022
"Any way faster than this one"
Interestingly, sorting seems to be slightly faster with the superfluous call to DATENUM:
a = datetime(['2022-10-27 00:22:50.000';'2022-10-27 05:29:45.000';'2022-10-27 05:32:19.400';'2022-10-27 05:36:44.100';'2022-10-27 05:39:26.600';'2022-10-27 05:43:18.200';'2022-10-27 05:49:17.400';'2022-10-27 05:55:27.300';'2022-10-27 05:58:07.000';'2022-10-27 06:17:13.800';'2022-10-27 06:41:28.700';'2022-10-27 07:03:06.000';'2022-10-27 07:09:49.800';'2022-10-27 07:17:39.700';'2022-10-27 07:35:09.000';'2022-10-27 07:42:33.600';'2022-10-27 07:46:50.500';'2022-10-27 08:07:02.700';'2022-10-27 08:26:29.600';'2022-10-27 08:45:03.500';'2022-10-27 08:48:09.700';'2022-10-27 08:53:57.000';'2022-10-27 08:59:43.400';'2022-10-27 09:13:24.100';'2022-10-27 09:19:25.000';'2022-10-27 09:26:35.000';'2022-10-27 09:29:54.000';'2022-10-27 09:46:45.700';'2022-10-27 10:10:59.900';'2022-10-27 10:29:04.600';'2022-10-27 10:31:50.100';'2022-10-27 10:37:45.300';'2022-10-27 10:42:35.800';'2022-10-27 10:58:42.300';'2022-10-27 11:03:48.900';'2022-10-27 11:10:44.700';'2022-10-27 11:13:11.100';'2022-10-27 11:31:25.100';'2022-10-27 11:56:01.200';'2022-10-27 12:19:25.300';'2022-10-27 12:25:12.500';'2022-10-27 12:30:16.900';'2022-10-27 12:49:19.100';'2022-10-27 12:55:42.700';'2022-10-27 12:58:18.100';'2022-10-27 13:16:06.300';'2022-10-27 14:00:44.700';'2022-10-27 14:04:14.200';'2022-10-27 14:11:54.600';'2022-10-27 14:17:10.000';'2022-10-27 14:42:03.300';'2022-10-27 14:45:22.100';'2022-10-27 15:02:07.900';'2022-10-27 15:25:39.600';'2022-10-27 15:44:58.300';'2022-10-27 15:48:35.800';'2022-10-27 15:54:23.100';'2022-10-27 16:00:17.300';'2022-10-27 16:19:23.700';'2022-10-27 16:27:30.800';'2022-10-27 16:31:30.600';'2022-10-27 16:52:09.700';'2022-10-27 17:16:08.800';'2022-10-27 18:01:39.400';'2022-10-27 18:08:24.800';'2022-10-27 18:17:44.500';'2022-10-27 18:25:02.500';'2022-10-27 18:27:45.700';'2022-10-27 18:48:32.000';'2022-10-27 19:02:01.700';'2022-10-27 19:24:08.300';'2022-10-27 19:30:01.400';'2022-10-27 19:43:02.200';'2022-10-27 19:48:50.300';'2022-10-27 19:55:41.900';'2022-10-27 19:58:23.700';'2022-10-27 20:17:15.800';'2022-10-27 20:28:47.600';'2022-10-27 20:49:36.900';'2022-10-27 20:53:02.700';'2022-10-27 21:07:12.100';'2022-10-27 21:09:47.800';'2022-10-27 21:50:42.300';'2022-10-27 22:07:34.100';'2022-10-27 22:09:18.800'])
timeit(@()sort(a))
timeit(@()sort(datenum(a)))
Lets try some other numeric formats, which might be more accurate over the dates in question:
timeit(@()sort(juliandate(a)))
timeit(@()sort(exceltime(a)))
timeit(@()sort(posixtime(a)))
3 Commenti
Bruno Luong
il 18 Nov 2022
Modificato: Bruno Luong
il 18 Nov 2022
idx2 should be the same, but I think timeit() results without second output could not be representative as the algorithm might skip some branching.
The relative result also might change with the size of your data.
Più risposte (2)
Peter Perkins
il 19 Nov 2022
It seems like the obvious response to
Any way faster than this one ?
[~, idx2] = sort(datenum(a), 1, 'ascend');
is to not pass in parameters that are the defaults. That cuts the time by not quite a factor of two for me. But maybe in your real code you need to pass in non-defaults.
Three things:
1) On my machine, the difference between these two small sorts
[~, idx2] = sort(a, 1, 'ascend');
[~, idx2] = sort(datenum(a), 1, 'ascend');
is literally a few microseconds. The usual response is, "Sure, but I'm doing this in a loop, so a few microseconds adds up". It would help to hear the details of that context.
2) Sure, for small sorts like your example there is a difference. But by the time you have 10000 elements
a = datetime(2022,1,1,0,0,randi(10000,10000,1));
the difference between the two is gone. And the total time is less than .5ms.
3) ALL of those conversions through away precision. In your example, you have resolution of at most .1s, so precision is probably not an issue. But in general, converting datetime to those numeric formats is A BAD IDEA.
2 Commenti
Bruno Luong
il 22 Nov 2022
"But in general, converting datetime to those numeric formats is A BAD IDEA."
That's true.
For decades a resolution of 53-bit datenum is the only option, whereas in C 128-bit resolution is used for time enumeration on other languages.
Bruno Luong
il 18 Nov 2022
Modificato: Bruno Luong
il 18 Nov 2022
The first peak is due to JIT that is not kicks in.
Here is the result on my PC (code attached)
I have to chase what create the spikes every 33 iterations.
Vedere anche
Categorie
Scopri di più su Dates and Time in Help Center e File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!