MATLAB Answers

Sorting List of Filenames according to Windows

Simone Croci on 25 Sep 2020
Commented: Walter Roberson on 7 Oct 2020 at 10:35
I have a list of files inside a directory and I would like to sort them with Matlab like Windows sorts the files.
For example, this is a list of files sorted according to Windows:
  • a_.txt
  • a1.txt
  • a2.txt
While this is the list sorted according to the Matlab sort function:
list = {'a_.txt'; 'a1.txt'; 'a2.txt'};
sort(list)
  • a1.txt
  • a2.txt
  • a_.txt
How can I sort the files in Matlab like Windows?

  3 Comments

Rik on 28 Sep 2020
You could also define your own sorting rules, which you can then try to implement (e.g. [a-zA-Z] comes after special symbols, and numbers are sorted by value so that 8 comes before 10).
Stephen Cobeldick on 28 Sep 2020
@Rik: note that the "special symbols" are not sorted into character code order by Windows Explorer (which is presumably what the OP is referring to by the term "Windows"). This is the order that I get using single-character folder names in Win10:
>> +'''-!#$%&(),;@[]^_`{}~+=' % Folder names copied by hand using the order shown by File Explorer.
ans =
39 45 33 35 36 37 38 40 41 44 59 64 91 93 94 95 96 123 125 126 43 61
And this is just a subset of the ASCII characters... what about all of the non-alphabetic Unicode "special symbols", where is this order defined?
Rik on 28 Sep 2020
I didn't mean to suggest that was a complete description of how Windows Explorer sorts file names. I didn't even mean to suggest it was an incomplete description. I meant it as an example of how you could describe a sort order.
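As an illustration of describing a sort order this way, here is a minimal sketch that ranks characters by a hand-written order string and sorts on those ranks. The order string is an assumption: it takes the punctuation order observed in the comment above, then digits, then letters (ignoring case). It is not a definition of what Windows Explorer does, the names `symbols`, `order` and `rankOf` are made up for this example, and numeric values (8 before 10) are not handled here.

```matlab
% Assumed character order: observed punctuation, then digits, then letters.
symbols = '''-!#$%&(),;@[]^_`{}~+=';
order   = [symbols, '0':'9', 'a':'z'];

% Lookup table from character code to rank. Characters that are not in
% 'order' (for example '.') sort after all listed ones, by character code.
rankOf = numel(order) + (1:65535);
rankOf(double(order)) = 1:numel(order);

names = {'a1.txt'; 'a2.txt'; 'a_.txt'};

% One row of ranks per name; zero padding makes shorter names sort first.
n    = numel(names);
len  = max(cellfun(@numel, names));
keys = zeros(n, len);
for k = 1:n
    c = double(lower(names{k}));
    keys(k, 1:numel(c)) = rankOf(c);
end

[~, idx] = sortrows(keys);
sorted   = names(idx);
disp(sorted)
```

With this assumed order, a_.txt comes before a1.txt and a2.txt, as in the listing from the question. Any other order string can be substituted without changing the rest of the code.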


Answers (2)

Walter Roberson on 25 Sep 2020
Windows explicitly does not define a sorting order for files. It leaves that up to the file systems, so the sort order for FAT32 might differ from the one for NTFS, for example.
I have not come across any formal definition of the sort order for NTFS.
I found a claim (https://devblogs.microsoft.com/oldnewthing/20050617-10/?p=35293) that NTFS captures some case-mapping information from the user's locale at the time the disk is formatted and uses that, and otherwise sorts by Unicode code point. MATLAB sorts strictly by Unicode code point.
The implication is that for the same list of files on two different NTFS drives, the sort order could be different.
Some people see _ sorted before letters, but others do not: https://stackoverflow.com/questions/29734737/findnextfile-order-ntfs
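The code-point ordering used by MATLAB's sort can be checked directly with the names from the question. This is only a small sketch of what sort does with a cell array of character vectors; it says nothing about the order a file system returns.

```matlab
% Names from the question.
list = {'a_.txt'; 'a1.txt'; 'a2.txt'};

% Character codes of the second characters, which decide the comparison.
codes = double('_12');

% sort compares character codes, so '1' and '2' come before '_'.
sorted = sort(list);

disp(codes)
disp(sorted)
```

The underscore has code 95 while the digits 1 and 2 have codes 49 and 50, which is why a_.txt ends up last.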

  16 Comments

Walter Roberson on 2 Oct 2020 at 9:38
How does the application sort the files according to Windows File Explorer order? That order is not the same as the one the application is given when it asks the operating system for the list of files in the directory.
Stephen Cobeldick on 2 Oct 2020 at 18:02
Knowing the character order is still not sufficient: Windows File Explorer also does some kind of number matching and takes the number values into account when sorting. But what counts as a number? Decimal comma or point or both? What effect does the locale have on this? Positive/negative sign? What precision? Is there any other special matching of multiple characters? So many questions...
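To make the number-matching idea concrete, here is a minimal sketch under one deliberately narrow assumption: a "number" is an unsigned run of ASCII digits. Decimal separators, signs and locale, the open questions listed above, are ignored. Each digit run is left-padded with zeros so that an ordinary code-point sort compares the runs by value. The `width` limit and the variable names are hypothetical, and this is not claimed to match Windows File Explorer.

```matlab
names = {'file10.txt'; 'file8.txt'; 'file9.txt'};
width = 10;   % assumed upper bound on the number of digits in any run

keys = names;
for k = 1:numel(names)
    % Split the name into digit runs and the text between them.
    [runs, rest] = regexp(names{k}, '\d+', 'match', 'split');
    % Left-pad every digit run with zeros to a fixed width.
    padded = cellfun(@(d) [repmat('0', 1, width - numel(d)), d], ...
                     runs, 'UniformOutput', false);
    % Interleave the pieces: rest{1} padded{1} rest{2} ... rest{end}.
    parts   = [rest; [padded, {''}]];
    keys{k} = [parts{:}];
end

% Sort the padded keys, then reorder the original names.
[~, idx] = sort(keys);
sorted   = names(idx);
disp(sorted)
```

A plain sort(names) would put file10.txt first; with the padded keys the order is file8.txt, file9.txt, file10.txt.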



U Arda Demiro on 7 Oct 2020 at 9:43
Edited: U Arda Demiro on 7 Oct 2020 at 9:45
Apparently someone at Matlab fixed this problem, but only in a way that causes other problems. Here is the problem I am having now. Suppose you have four files as follows:
prog1.m
prog1_.m
prog1_5digit.m
prog1_7digit.m
The above order would be how Windows sorts them. Matlab instead currently sorts them as follows:
prog1_.m
prog1_5digit.m
prog1_7digit.m
prog1.m
Fairly annoying.
(Version R2017b, this message: October 7 2020.)
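For comparison, a plain code-point sort of these four names can be checked with the sort function. This sketch only shows what sort returns for a cell array of character vectors; it makes no claim about how the order in the second listing above is produced.

```matlab
% The four names from the example, deliberately shuffled.
names = {'prog1_7digit.m'; 'prog1_.m'; 'prog1.m'; 'prog1_5digit.m'};

% '.' has code 46 and '_' has code 95, so prog1.m sorts before prog1_.m.
sorted = sort(names);
disp(sorted)
```

Here sort returns prog1.m, prog1_.m, prog1_5digit.m, prog1_7digit.m, which is the same as the first listing.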

  1 Comment

Walter Roberson on 7 Oct 2020 at 10:35
I have not seen evidence that MATLAB sorts the names it gets back from the file system calls.
Anyhow, the way to get what you are asking for would appear to be to write a MEX function that uses StrCmpLogicalW (https://docs.microsoft.com/en-ca/windows/win32/api/shlwapi/nf-shlwapi-strcmplogicalw) to sort the names.
However, the links I have presented show that the sort order is not constant and can vary according to the user's settings. The order used by your application can be different.
I would recommend repairing the application that depends upon a fragile sort order.

