unix - inconsistent sort behavior


I have a sample file containing the characters "aA0_-", one per line. Sorting it with GNU sort returns this order:

  $ cat /tmp/sample | sort

After appending another character to each line, we get a different order (non-alphanumeric characters appear to have lower priority):

  $ cat /tmp/sample | sed 's/$/x/' | sort

When we insert this character at the beginning instead, we get the original sort order again:

  $ cat /tmp/sample | sed 's/^/x/' | sort

What is the explanation for this behavior?

UPDATE

> When the 'z' and 'Z' characters are added to the sample, the result still looks odd:

  $ cat /tmp/sample | sed 's/$/x/' | sort
  0x
  ax
  Ax
  _x
  -x
  zx
  Zx

.. but in light of the correct answer, that's because ' ', '_' and '-' are all whitespace in the current locale (en_US.UTF-8) and are ignored when sorting.

Your locale file should have a definition of LC_COLLATE, which determines the sort order of characters. Also look at the definition of LC_CTYPE and at which characters it classifies as 'space'.

If '-' and '_' are classified as spaces, then you would expect the results you have shown.
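One way to check whether the locale is responsible is to compare the locale-aware sort against a byte-wise sort under `LC_ALL=C`. The following is only a sketch: the inline sample and the variable names `sample` and `c_sorted` are made up for illustration and stand in for `/tmp/sample`. The output of the second sort depends on which locales are installed on your system.

```shell
# Illustrative sample: the characters from the question, one per line, with "x" appended.
sample=$(printf '%s\n' a A 0 _ - | sed 's/$/x/')

# The C locale compares raw byte values, so no character is skipped.
c_sorted=$(printf '%s\n' "$sample" | LC_ALL=C sort)
printf 'C locale:\n%s\n' "$c_sorted"

# Same input under whatever locale is currently active (system dependent).
printf 'current locale:\n'
printf '%s\n' "$sample" | sort

# Show which collation and character-class settings are in effect, if `locale` is available.
if command -v locale >/dev/null 2>&1; then
  locale | grep -E '^LC_(COLLATE|CTYPE)=' || true
fi
```

Under `LC_ALL=C` the order follows the byte values (`-x`, `0x`, `Ax`, `_x`, `ax`). If the second listing differs, the difference comes from the collation rules of the active locale rather than from sort itself.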

