Why should 'Character Classes' be preferred over 'Character Ranges' In Shell (Bash)?
According to the bash manpage, the LC_COLLATE environment variable affects character ranges, exactly as per Hauke Laging's answer:
LC_COLLATE This variable determines the collation order used when sorting the results of pathname expansion, and determines the behavior of range expressions, equivalence classes, and collating sequences within pathname expansion and pattern matching.
On the other hand, LC_CTYPE affects character classes:
LC_CTYPE This variable determines the interpretation of characters and the behavior of character classes within pathname expansion and pattern matching.
What this means is that both cases are potentially problematic if you're thinking in an English, left-to-right, Latin-alphabet, Arabic-digit context.
If you're really proper, and/or are scripting for a multi-locale environment, it's probably best to make sure you know what your locale variables are when you're matching files, or to be sure that you're coding in a completely generic way.
It's very difficult to foresee some situations though, unless you've studied linguistics.
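One way to act on that advice is to look at the effective locale settings and, where only ASCII matters, pin the locale for a single match. The following is a minimal sketch under that assumption, not a definitive recipe; `starts_with_ascii_lower` is a hypothetical helper name, not something bash provides.

```shell
# Print the effective locale settings, if the locale utility is available.
if command -v locale >/dev/null 2>&1; then
    locale
fi

# Hypothetical helper: succeed only if $1 starts with an ASCII lowercase letter.
# The body runs in a subshell, so the LC_ALL override does not leak to the caller.
starts_with_ascii_lower() (
    LC_ALL=C
    case $1 in
        [a-z]*) return 0 ;;
        *)      return 1 ;;
    esac
)

if starts_with_ascii_lower "grüßen"; then
    echo "starts with a-z"
fi
```

Pinning LC_ALL=C trades linguistic correctness for predictability: the range then means plain ASCII a to z, whatever the caller's locale says.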
That said, I don't know of a Latin-alphabet locale that changes the order of the basic letters, so [a-z] would generally work there. Some extensions to the Latin alphabet do collate ligatures and diacriticals differently, though. Here's a little experiment:
mkdir /tmp/test
cd /tmp/test
export LC_CTYPE=de_DE.UTF-8
export LC_COLLATE=de_DE.UTF-8
touch Grüßen
ls G* # This says ‘Grüßen’
ls *[a-z]en # This says nothing!
ls *[a-zß]en # This says ‘Grüßen’
ls Gr[a-z]*en # This says nothing!
This is interesting: at least for German, neither diacriticals like ü nor ligatures like ß are folded into Latin characters. (Either that, or I messed up the locale change!)
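On the doubt about whether the locale change took effect: exporting LC_CTYPE or LC_COLLATE only helps if the named locale is actually installed. Here is a rough sketch of a check, assuming the `locale` utility is present; `locale_installed` is a hypothetical helper, and the name normalisation (lowercasing, dropping hyphens) is an assumption based on `locale -a` commonly printing names such as de_DE.utf8.

```shell
# Hypothetical helper: succeed if the locale named in $1 appears in `locale -a`.
# Both sides are lowercased and stripped of hyphens so that de_DE.UTF-8
# and de_DE.utf8 compare equal.
locale_installed() {
    want=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr -d '-')
    locale -a 2>/dev/null \
        | tr '[:upper:]' '[:lower:]' \
        | tr -d '-' \
        | grep -qxF -- "$want"
}

if locale_installed de_DE.UTF-8; then
    echo "de_DE.UTF-8 is installed"
else
    echo "de_DE.UTF-8 not found; the experiment may have run in another locale"
fi
```

If the locale is missing, the results above say nothing about German collation and are worth repeating on a system where it is installed.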
This may be bad for you, of course: if you're trying to find filenames that start with a letter and use [a-z]*, the pattern will miss a file that starts with ‘Ä’.
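For comparison, here is a small sketch of the same kind of test written with character classes instead of ranges. `first_char_kind` is a hypothetical helper, and how ‘Äpfel’ is classified depends on the shell and on LC_CTYPE: in bash under a UTF-8 locale it should count as upper case, while in the C locale it would fall through to "other".

```shell
# Hypothetical helper: report how the first character of $1 is classified,
# using character classes rather than ranges.
first_char_kind() {
    case $1 in
        [[:upper:]]*) echo upper ;;
        [[:lower:]]*) echo lower ;;
        [[:digit:]]*) echo digit ;;
        *)            echo other ;;
    esac
}

for name in apple Apple Äpfel 1file; do
    printf '%s: %s\n' "$name" "$(first_char_kind "$name")"
done
```

The same idea carries over to pathname expansion, where [[:alpha:]]* would play the role that [a-z]* plays above.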