Sort has weird behaviour
First let me start by saying this oddity does not just apply to openSUSE - I have a different Mint machine that also has the same behaviour. In both a visual file manager and in a terminal, the list of documents does not appear to sort correctly (I assume the graphical version is using sort?).

Here is my terminal output (a bit mangled by my mail client, sorry):

graham@localhost:~/Documents/Programming> ls
F# - Beginning F 4.0, 2nd Edition.pdf
F# - Expert F 4.0, 4th Edition.pdf
F# - for Machine Learning - Essentials.pdf
F# - Functional Programming Using F#.pdf
F# - High Performance.pdf
Fortran - Introduction to Programming using Fortran 95-2003-2008 3rd edition.pdf
Fortran - Introduction to Programming with Fortran 3rd edition.pdf
Fortran - Self Study F95.pdf
F# - Programming F# 3.0.pdf
F# - Real-World Functional Programming With Examples in F# and C# 1st Edition.pdf

graham@localhost:~/Documents/Programming> ls | sort
F# - Beginning F 4.0, 2nd Edition.pdf
F# - Expert F 4.0, 4th Edition.pdf
F# - for Machine Learning - Essentials.pdf
F# - Functional Programming Using F#.pdf
F# - High Performance.pdf
Fortran - Introduction to Programming using Fortran 95-2003-2008 3rd edition.pdf
Fortran - Introduction to Programming with Fortran 3rd edition.pdf
Fortran - Self Study F95.pdf
F# - Programming F# 3.0.pdf
F# - Real-World Functional Programming With Examples in F# and C# 1st Edition.pdf

graham@localhost:~/Documents/Programming> ls | sort -f
F# - Beginning F 4.0, 2nd Edition.pdf
F# - Expert F 4.0, 4th Edition.pdf
F# - for Machine Learning - Essentials.pdf
F# - Functional Programming Using F#.pdf
F# - High Performance.pdf
F# - Programming F# 3.0.pdf
F# - Real-World Functional Programming With Examples in F# and C# 1st Edition.pdf
Fortran - Introduction to Programming using Fortran 95-2003-2008 3rd edition.pdf
Fortran - Introduction to Programming with Fortran 3rd edition.pdf
Fortran - Self Study F95.pdf

So -f seems to get the order right, but as far as I can see there should be no upper/lower case to sort. These files were downloaded; is it possibly some character set issue where the file names look the same but aren't? How do I tell? Thanks.
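One possible way to answer the "How do I tell?" question is to list every non-ASCII character in each file name, since a lookalike character (for example a fullwidth '#') would show up there. This is only a sketch: the function names suspicious_chars and scan_directory are made up for illustration, and it assumes that "looks the same but isn't" means a character outside plain ASCII.

```python
import os
import unicodedata


def suspicious_chars(name):
    """Return (index, character, Unicode name) for each non-ASCII character."""
    return [
        (i, ch, unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(name)
        if ord(ch) > 127
    ]


def scan_directory(path="."):
    """Print every file name in path that contains a non-ASCII character."""
    for name in sorted(os.listdir(path)):
        hits = suspicious_chars(name)
        if hits:
            # ascii() escapes the odd characters so they are visible on screen
            print(ascii(name), hits)
```

Calling scan_directory() in the Documents/Programming directory prints nothing if all names are plain ASCII, in which case a character set problem in the file names is unlikely to be the cause.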
What you have found, excepting the '-f' difference, is the brain-dead brokenness of 'sort', at least since the garbage definitions of the English locales were incorporated into it. The locale is supposed to make strings sort in a way that is sensible for *people*, but as you can see, the results of the first sorts you show do not make it easy for people to locate items in a "sorted" list.

If you remove the string '# -' from the items, does it look sorted now? I seem to remember that the En locales do case folding anyway, which could explain 'f' occurring after 'F' in the outputs.

Now, if you want to experiment, just randomly put that same string into your items at various places, and check that it doesn't affect the sort order. The most wonderful case ;-) is to have items prefixed with such strings, and then enjoy looking them up once they are sorted.

I don't know what's up with the '-f' difference.

-- Robert Webb

On Tuesday, August 30, 2022, 04:46:26 PM PDT, Graham Stephens <graham@thestephensdomain.com> wrote:
First let me start by saying this oddity does not just apply to openSUSE - I have a different Mint machine that also has the same behaviour. In both a visual file manager and in a terminal, the list of documents does not appear to sort correctly (I assume the graphical version is using sort?).
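The "remove the string '# -'" experiment can be tried without touching any files. The sketch below is only a rough stand-in for what the locale does, not the real collation algorithm: the key function dictionary_like_key is a made-up name, and it assumes that dropping everything except letters and digits and folding case is close enough. With that assumption it reproduces the order shown for plain "ls | sort", while a plain code-point sort gives a different order again.

```python
# File names in the order printed by `ls | sort` in the original message.
names = [
    "F# - Beginning F 4.0, 2nd Edition.pdf",
    "F# - Expert F 4.0, 4th Edition.pdf",
    "F# - for Machine Learning - Essentials.pdf",
    "F# - Functional Programming Using F#.pdf",
    "F# - High Performance.pdf",
    "Fortran - Introduction to Programming using Fortran 95-2003-2008 3rd edition.pdf",
    "Fortran - Introduction to Programming with Fortran 3rd edition.pdf",
    "Fortran - Self Study F95.pdf",
    "F# - Programming F# 3.0.pdf",
    "F# - Real-World Functional Programming With Examples in F# and C# 1st Edition.pdf",
]


def dictionary_like_key(name):
    # Assumption for illustration: ignore spaces and punctuation, fold case.
    return "".join(ch for ch in name if ch.isalnum()).lower()


# Approximation of the locale-aware order: "Fortran..." lands between
# "F# - High..." and "F# - Programming...", as in the reported output.
locale_like = sorted(names, key=dictionary_like_key)

# Plain code-point order: all "F# - ..." names first, with the lowercase
# "for Machine Learning" entry after the uppercase ones.
code_point_order = sorted(names)

for name in locale_like:
    print(name)
```

With punctuation ignored, "F# - Programming" compares like "FProgramming", which sorts after "Fortran" because 'o' comes before 'p'.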
On Tuesday, August 30, 2022, 06:14:03 PM PDT, Robert Webb <webbdg@verizon.net> wrote:
[...], which could explain 'f' occurring after 'F' in the outputs.
Correction: "'f' occurring *before* 'F'", as in:

F# - for Machine Learning - Essentials.pdf
F# - Functional Programming Using F#.pdf

I'm guessing that the sort does not care, because it is folding case, and maybe it is a stable sort (?), so the result depends on the input order.

-- Robert Webb
Show your environment:

printf '=== %s=%s\n' LC_COLLATE "$LC_COLLATE" LC_CTYPE "$LC_CTYPE"

And "fix" it:

LC_COLLATE='C'

-- Robert Webb

On Tuesday, August 30, 2022, 04:46:26 PM PDT, Graham Stephens <graham@thestephensdomain.com> wrote:
In both a visual file manager and in a terminal, the list of documents does not appear to sort correctly
Um, rather:

export LC_COLLATE='C'

or

LC_COLLATE='C' sort [...]

-- Robert Webb

On Tuesday, August 30, 2022, 08:20:49 PM PDT, Robert Webb <webbdg@verizon.net> wrote:
And "fix" it: LC_COLLATE='C'
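For anyone driving sort from a script, the per-command form above might be wrapped roughly as follows. This is a sketch only: the helper name sort_in_c_locale is made up, and it assumes a 'sort' binary on the PATH. It also removes LC_ALL from the child's environment, because a set LC_ALL takes precedence over LC_COLLATE and would otherwise mask the setting.

```python
import os
import subprocess


def sort_in_c_locale(lines):
    """Sort lines with the external sort command under LC_COLLATE=C."""
    env = dict(os.environ)
    env.pop("LC_ALL", None)   # LC_ALL would override LC_COLLATE if set
    env["LC_COLLATE"] = "C"   # plain byte-value ordering
    result = subprocess.run(
        ["sort"],
        input="".join(line + "\n" for line in lines),
        capture_output=True,
        text=True,
        env=env,
        check=True,
    )
    return result.stdout.splitlines()
```

In the C locale all uppercase ASCII letters sort before all lowercase ones, so this order is predictable but is not the same as the "sort -f" order from the original message.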
participants (2)
- Graham Stephens
- Robert Webb