I have always been puzzled by the strange order in which ls lists the following files:
    Star Wars Episode II - Attack of the Clones (2002) BDRip.mkv
    Star Wars Episode III - Revenge of the Sith (2005) BDRip.mkv
    Star Wars Episode I - The Phantom Menace (1999) BDRip.mkv
    Star Wars Episode IV - A New Hope (1977) BDRip.mkv
    Star Wars Episode VI - Return of the Jedi (1983) BDRip.mkv
    Star Wars Episode V - The Empire Strikes Back (1980) BDRip.mkv
From a human point of view, "I" should come first, then "II", and so on.
So I created a file with the following contents:
    $ cat 1
    Star Wars Episode II - Attack
    Star Wars Episode III - Revenge
    Star Wars Episode I - The
    Star Wars Episode IV - A
    Star Wars Episode VI - Return
    Star Wars Episode V - The
If I sort it, I get this:
    $ sort 1
    Star Wars Episode II - Attack
    Star Wars Episode III - Revenge
    Star Wars Episode I - The
    Star Wars Episode IV - A
    Star Wars Episode VI - Return
    Star Wars Episode V - The
However, if I remove the " - " and everything to the right of it, the file sorts correctly:
    $ cat 1
    Star Wars Episode II
    Star Wars Episode III
    Star Wars Episode I
    Star Wars Episode IV
    Star Wars Episode VI
    Star Wars Episode V
    $ sort 1
    Star Wars Episode I
    Star Wars Episode II
    Star Wars Episode III
    Star Wars Episode IV
    Star Wars Episode V
    Star Wars Episode VI
So as soon as I add any character after the space, the sort order becomes unpredictable to me:
    $ cat 1
    Star Wars Episode II y
    Star Wars Episode III x
    Star Wars Episode I z
    Star Wars Episode IV w
    Star Wars Episode VI v
    Star Wars Episode V u
    $ sort 1
    Star Wars Episode III x
    Star Wars Episode II y
    Star Wars Episode IV w
    Star Wars Episode I z
    Star Wars Episode VI v
    Star Wars Episode V u
Any hints about this behavior?
Update: sort: using "en_CA.UTF-8" sorting rules
Update #2: per the comments below, it is because of the locale.
    $ ls | LANG=C sort
    Star Wars Episode I - The Phantom Menace (1999) BDRip.mkv
    Star Wars Episode II - Attack of the Clones (2002) BDRip.mkv
    Star Wars Episode III - Revenge of the Sith (2005) BDRip.mkv
    Star Wars Episode IV - A New Hope (1977) BDRip.mkv
    Star Wars Episode V - The Empire Strikes Back (1980) BDRip.mkv
    Star Wars Episode VI - Return of the Jedi (1983) BDRip.mkv
Why does a UTF-8 locale make a difference? I checked ru_RU.UTF8 (incorrect sorting) and ru_RU.KOI8-R (correct sorting).
Update #3: it is about the locale: http://www.gnu.org/software/coreutils/faq/#Sort-does-not-sort-in-normal-order_0021
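I think I found a proper explanation:
GNU coreutils FAQ: Sort does not sort in normal order
Found it via: Sort does not sort as expected (space and locale)
With locale-based sorting, sort ignores all non-alphanumeric characters when comparing lines: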
    II - Attack   -> "IIA"
    III - Revenge -> "III"
    I - The       -> "ITh"
    IV - A        -> "IVA"
    VI - Return   -> "VIR"
    V - The       -> "VTh"
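The mapping above can be imitated without a UTF-8 locale installed. The following is only a rough emulation under stated assumptions, not what the C library really does: it builds a key by upper-casing each line and deleting everything except ASCII letters and digits, then sorts by that key in byte order. The helper names titles and locale_like_sort are made up for this illustration.

```shell
# Emit the six test lines from the question.
titles() {
  printf '%s\n' \
    'Star Wars Episode II - Attack' \
    'Star Wars Episode III - Revenge' \
    'Star Wars Episode I - The' \
    'Star Wars Episode IV - A' \
    'Star Wars Episode VI - Return' \
    'Star Wars Episode V - The'
}

# Rough emulation (an assumption, not glibc's algorithm):
# 1. prefix each line with a key that keeps only letters and digits,
# 2. sort on that key alone, in plain byte order,
# 3. drop the key again.
locale_like_sort() {
  LC_ALL=C awk '{ key = toupper($0); gsub(/[^A-Z0-9]/, "", key); print key "\t" $0 }' |
    LC_ALL=C sort -t "$(printf '\t')" -k1,1 |
    cut -f2-
}

titles | locale_like_sort
```

With the six lines from the question this reproduces the II, III, I, IV, VI, V order shown in the first sort listing.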
With LC_ALL=C, the space character sorts before the alphanumeric characters:
    I - The       -> "I -"
    II - Attack   -> "II "
    III - Revenge -> "III"
    IV - A        -> "IV "
    V - The       -> "V -"
    VI - Return   -> "VI "
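To see the byte-order comparison in action, the test file can be sorted with the locale forced to C for a single command. This is a minimal sketch; the helper name titles is hypothetical and simply prints the six lines from the question.

```shell
# Emit the six test lines from the question.
titles() {
  printf '%s\n' \
    'Star Wars Episode II - Attack' \
    'Star Wars Episode III - Revenge' \
    'Star Wars Episode I - The' \
    'Star Wars Episode IV - A' \
    'Star Wars Episode VI - Return' \
    'Star Wars Episode V - The'
}

# LC_ALL=C applies to this one sort invocation only; lines are compared
# byte by byte, so the space (0x20) ranks below every letter.
titles | LC_ALL=C sort
```

Setting the variable on the command line, rather than exporting it, leaves the locale of the rest of the shell session untouched.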
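So the correct result under LC_ALL=C is a coincidence: byte order is not numeric order, and it already breaks at Episode IX, which sorts before Episode V.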
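A quick way to check how far byte order keeps Roman numerals in sequence is to sort the bare numerals. This is just a sketch; the helper name numerals is hypothetical.

```shell
# Emit the Roman numerals I through IX, one per line.
numerals() {
  printf '%s\n' I II III IV V VI VII VIII IX
}

# In byte order "IX" lands between "IV" and "V", because the first
# byte "I" is smaller than "V".
numerals | LC_ALL=C sort
```

The output places IX in fifth position, directly after IV, so for longer series a sort key based on the numeric value is needed.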