Regex Problem Group Name Redefinition?
The following answer deals with how to make the above regex work in Python 3.
The re2 module suggested by Max does not work in Python 3 (it fails with NameError: basestring), so an alternative is the regex module.
The regex module is an enhanced version of re with extra features. It also allows the same group name to be used more than once in a regex.
You can install it via:
sudo pip install regex
If you have already been using re or re2 in your program, just import the regex module under that name:
import regex as re
No, you can't have two groups with the same name. That would defeat the purpose, wouldn't it?
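As a quick check of that claim against the standard library re module in Python 3, the sketch below tries to compile a pattern that defines the same group name twice. The pattern and the helper name compiles_with_re are made up for illustration and are not taken from the question.

```python
import re

# Hypothetical pattern: the group name "NAME" is defined in both alternatives.
DUPLICATE_NAME_PATTERN = r"(?P<NAME>\d+)|(?P<NAME>[a-z]+)"


def compiles_with_re(pattern):
    """Return (True, None) if re accepts the pattern, else (False, error text)."""
    try:
        re.compile(pattern)
    except re.error as exc:
        return False, str(exc)
    return True, None


ok, message = compiles_with_re(DUPLICATE_NAME_PATTERN)
print(ok, message)
```

The same helper returns (True, None) as soon as the two groups get distinct names.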
What you probably really want is this:
^\s*(?P<NAME>\w\d{7}|R1_(?:\d{6}_){2})(01f\.foo|\.(?:bar|goo|moo|roo))$
I refactored your regex as far as possible, assuming (correct me if I'm wrong) that you want to:
- ignore white space at the start of the string
- match either of the following into a group named "NAME":
  - a letter followed by 7 digits (matched here with \w, which also accepts a digit or "_" in the first position), or
  - "R1_", and two times (6 digits + "_")
- followed by either:
  - "01f.foo", or
  - "." and ("bar" or "goo" or "moo" or "roo")
- followed by the end of the string
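To see how this first reading behaves, here is a minimal sketch using the standard re module. The sample strings and the helper name extract_name are invented for illustration; only the pattern comes from the answer.

```python
import re

# First interpretation: "01f" belongs to the suffix, not to NAME.
PATTERN_A = re.compile(
    r"^\s*(?P<NAME>\w\d{7}|R1_(?:\d{6}_){2})(01f\.foo|\.(?:bar|goo|moo|roo))$"
)


def extract_name(text):
    """Return the NAME group if the whole string matches, else None."""
    match = PATTERN_A.match(text)
    return match.group("NAME") if match else None


# Hypothetical inputs, chosen to exercise each branch of the pattern.
for sample in ["A123456701f.foo", "  R1_123456_654321_.bar", "A1234567.foo"]:
    print(repr(sample), "->", extract_name(sample))
```

The last sample is rejected because ".foo" is only accepted when preceded by "01f".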
You could also have meant:
^\s*(?P<NAME>\w\d{7}01f|R1_(?:\d{6}_){2})\.(?:foo|bar|goo|moo|roo)$
Which is:
- ignore white space at the start of the string
- match either of the following into a group named "NAME":
  - a letter followed by 7 digits and "01f", or
  - "R1_", and two times (6 digits + "_")
- a dot
- "foo", "bar", "goo", "moo" or "roo"
- the end of the string
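A similar sketch for this second reading, again with the standard re module. The sample strings and the helper name extract_name_b are invented for illustration. Here "01f" is captured as part of NAME and any of the five extensions is accepted.

```python
import re

# Second interpretation: "01f" is part of NAME, followed by a dot and an extension.
PATTERN_B = re.compile(
    r"^\s*(?P<NAME>\w\d{7}01f|R1_(?:\d{6}_){2})\.(?:foo|bar|goo|moo|roo)$"
)


def extract_name_b(text):
    """Return the NAME group if the whole string matches, else None."""
    match = PATTERN_B.match(text)
    return match.group("NAME") if match else None


# Hypothetical inputs, chosen to exercise each branch of the pattern.
for sample in ["A123456701f.bar", "R1_123456_654321_.foo", "A1234567.bar"]:
    print(repr(sample), "->", extract_name_b(sample))
```

The last sample is rejected because the first alternative now requires "01f" before the dot.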
Reusing the same name makes sense in your case, contrary to Tamalak's reply.
Your regex compiles with Python 2.7 and also with re2. Maybe this problem has been resolved.