Can't explain the behavior of sort(1)

I have been puzzled ever since I saw ls list the following files in a strange order:

    Star Wars Episode II - Attack of the Clones (2002) BDRip.mkv
    Star Wars Episode III - Revenge of the Sith (2005) BDRip.mkv
    Star Wars Episode I - The Phantom Menace (1999) BDRip.mkv
    Star Wars Episode IV - A New Hope (1977) BDRip.mkv
    Star Wars Episode VI - Return of the Jedi (1983) BDRip.mkv
    Star Wars Episode V - The Empire Strikes Back (1980) BDRip.mkv

From a human point of view, "I" should come first, then "II", and so on.

So I created a file with the following contents:

    $ cat 1
    Star Wars Episode II - Attack
    Star Wars Episode III - Revenge
    Star Wars Episode I - The
    Star Wars Episode IV - A
    Star Wars Episode VI - Return
    Star Wars Episode V - The

If I sort it, I get this:

    $ sort 1
    Star Wars Episode II - Attack
    Star Wars Episode III - Revenge
    Star Wars Episode I - The
    Star Wars Episode IV - A
    Star Wars Episode VI - Return
    Star Wars Episode V - The

However, if I remove the " - " and everything after it, the order is correct:

    $ cat 1
    Star Wars Episode II
    Star Wars Episode III
    Star Wars Episode I
    Star Wars Episode IV
    Star Wars Episode VI
    Star Wars Episode V
    $ sort 1
    Star Wars Episode I
    Star Wars Episode II
    Star Wars Episode III
    Star Wars Episode IV
    Star Wars Episode V
    Star Wars Episode VI

So as soon as I add any character after the space, the sort order becomes unpredictable to me:

    $ cat 1
    Star Wars Episode II y
    Star Wars Episode III x
    Star Wars Episode I z
    Star Wars Episode IV w
    Star Wars Episode VI v
    Star Wars Episode V u
    $ sort 1
    Star Wars Episode III x
    Star Wars Episode II y
    Star Wars Episode IV w
    Star Wars Episode I z
    Star Wars Episode VI v
    Star Wars Episode V u

Any hints about what causes this behavior?

Update: sort reports: using "en_CA.UTF-8" sorting rules

Update #2: as the comments below point out, it is because of the locale.

    $ ls | LANG=C sort
    Star Wars Episode I - The Phantom Menace (1999) BDRip.mkv
    Star Wars Episode II - Attack of the Clones (2002) BDRip.mkv
    Star Wars Episode III - Revenge of the Sith (2005) BDRip.mkv
    Star Wars Episode IV - A New Hope (1977) BDRip.mkv
    Star Wars Episode V - The Empire Strikes Back (1980) BDRip.mkv
    Star Wars Episode VI - Return of the Jedi (1983) BDRip.mkv
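For anyone who wants to reproduce this without the actual files, here is a minimal sketch. The `titles` and `c_sorted` variables are just names I made up for the example, and I use `LC_ALL` rather than `LANG` because `LC_ALL` overrides `LANG` and every `LC_*` variable that may already be set in the environment:

```shell
# Sample data: the six shortened titles from the question.
titles='Star Wars Episode II - Attack
Star Wars Episode III - Revenge
Star Wars Episode I - The
Star Wars Episode IV - A
Star Wars Episode VI - Return
Star Wars Episode V - The'

# Force plain byte-order comparison for this one sort invocation only.
c_sorted=$(printf '%s\n' "$titles" | LC_ALL=C sort)

printf '%s\n' "$c_sorted"
```

Running the same pipeline without `LC_ALL=C` uses whatever collation the current locale defines, so the result depends on the system.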

Why does a UTF-8 locale make a difference? I checked ru_RU.UTF8 (incorrect order) and ru_RU.KOI8-R (correct order).

Update #3: it is about the locale: http://www.gnu.org/software/coreutils/faq/#Sort-does-not-sort-in-normal-order_0021

I think I found a proper explanation:

GNU coreutils FAQ: Sort does not sort in normal order!

Found it: sort not sorting as expected (space and locale)

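When sorting according to the locale, sort ignores all non-alphanumeric characters, so these are the first three characters it actually compares: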

    II - Attack   -> "IIA"
    III - Revenge -> "III"
    I - The       -> "ITh"
    IV - A        -> "IVA"
    VI - Return   -> "VIR"
    V - The       -> "VTh"
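The effect can be imitated by hand. This is only a rough approximation, not what the C library really does: actual locale collation is multi-level and locale-specific, whereas this sketch simply deletes every non-alphanumeric byte to build a key and then compares the keys in byte order. The names `titles`, `tab` and `keyed` are made up for the example:

```shell
# Sample data: the six shortened titles from the question.
titles='Star Wars Episode II - Attack
Star Wars Episode III - Revenge
Star Wars Episode I - The
Star Wars Episode IV - A
Star Wars Episode VI - Return
Star Wars Episode V - The'

tab=$(printf '\t')

# Prefix each line with a key that has spaces and punctuation removed,
# sort on that key alone, then drop the key again.
keyed=$(printf '%s\n' "$titles" |
  while IFS= read -r line; do
    key=$(printf '%s' "$line" | LC_ALL=C tr -cd '[:alnum:]')
    printf '%s\t%s\n' "$key" "$line"
  done |
  LC_ALL=C sort -t "$tab" -k1,1 |
  cut -f2-)

printf '%s\n' "$keyed"
```

With these six lines the result comes out in the same "wrong" order as the locale-aware sort in the question: II, III, I, IV, VI, V.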

With LC_ALL=C, the space character sorts before any alphanumeric character:

    I - The       -> "I -"
    II - Attack   -> "II "
    III - Revenge -> "III"
    IV - A        -> "IV "
    V - The       -> "V -"
    VI - Return   -> "VI "
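The byte values behind that ordering are easy to check. A small sketch, where `byte_of` is a helper name I made up; it relies on the POSIX `printf` rule that a numeric argument starting with a quote evaluates to the code of the character that follows:

```shell
# Print the numeric code of the first character of the argument.
byte_of() {
  LC_ALL=C printf '%d' "'$1"
}

# Space and hyphen both have lower codes than the letters I and V,
# which is why "II " sorts before "III" in the C locale.
for ch in ' ' '-' 'I' 'V'; do
  printf '[%s] = %s\n' "$ch" "$(byte_of "$ch")"
done
```

This prints 32 for the space, 45 for the hyphen, 73 for I and 86 for V.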

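So the correct order here is a coincidence: byte order knows nothing about Roman numerals, and the listing would already break at an Episode IX, which sorts between IV and V.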