ZSH wildcard expression limiting repetition support?
Yes, use ## to match one or more occurrences of [0-9], like:
ABC[0-9]##
This requires extendedglob to be set, which it is not by default. If it is unset, set it first:
setopt extendedglob
Example:
% print -l ABC*
ABC
ABC75475
ABC8
ABC90
% print -l ABC[0-9]##
ABC75475
ABC8
ABC90
With extendedglob enabled:
$ setopt extendedglob
$ print -rl -- perl[[:digit:]]##
perl5
or with kshglob enabled and bareglobqual disabled:
$ setopt kshglob
$ unsetopt bareglobqual
$ print -rl -- perl+([[:digit:]])
perl5
Note that [:digit:] matches everything considered a digit in the current locale. If you want to match 0 through 9 only, set LC_ALL=C or use a literal [0123456789].
You can specify the number of matches, like the regular expression {n,m}, using the (#cN,M) globbing flag anywhere that the # and ## operators can be used, except in (*/)# and (*/)##:
$ print -rl -- perl[[:digit:]](#c1)
perl5
$ print -rl -- perl[[:digit:]](#c2)
zsh: no matches found: perl[[:digit:]](#c2)
See the “Filename Generation” section of the zsh manual, and enable either extended_glob (which should be the default, but isn't for backward compatibility) or ksh_glob. Both zsh's extended globs and ksh's have the full power of regular expressions.
ERE syntax    ksh glob      zsh extended glob
(foo)*        *(foo)        (foo)#
(foo)+        +(foo)        (foo)##
(foo)?        ?(foo)        (|foo)
(foo|bar)     @(foo|bar)    (foo|bar)
Keep in mind that most tools that use regular expressions use them as search patterns that must match a substring, but globs are always used as patterns that must match the whole string. For example, foofoobar does not match the zsh glob (foo)##, since after foofoo there's some text left over.
Zsh has additional operators that don't extend the expressive power but make some expressions easier to write. The operators ^ and ~ (extended_glob) and !(…) (ksh_glob) provide negation, e.g. ^foo or !(foo) matches anything except foo, which in regex syntax requires the unwieldy |f|fo|[^f].*|f[^o].*|fo[^o].*|foo.+ instead. The operator <…-…> (zsh-specific, does not require extended_glob) matches any integer (in decimal notation) in a range, e.g. <3-11> matches 3 and 10 but not 30 or 1.
So, excluding issues like unprintable characters, ls | grep -e "ABC[0-9]\+" can be written in ways such as
print -lr -- *ABC[0-9]##*(N) # requires extended_glob
print -lr -- *ABC+([0-9])*(N) # requires ksh_glob
print -lr -- *ABC<->*(N)
But since “one or more digits then anything” is equivalent to “a digit then anything”, it can also be written
print -lr -- *ABC[0-9]*(N)