Replacing named capturing groups with re.sub
import re

def repl(matchobj):
    # 'content' is None when the match ended at $ without a tag.
    if matchobj.group('content'):
        return matchobj.group('text') + matchobj.group('content')
    else:
        return matchobj.group('text')

my_str = "Here's some <first>sample stuff</first> in the " \
         "<second>middle</second> of some other text."
pattern = r'(?P<text>.*?)(?:<(?P<tag>\w+)>(?P<content>.*)</(?P=tag)>|$)'
print(re.sub(pattern, repl, my_str))
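You can pass a function as the replacement argument of re.sub. It is called once for each match, and its return value is used as the replacement text.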
Edit:
cleaned = re.sub(pattern, r'\g<text>\g<content>', my_str)
This will not work. When the last part of the string, " of some other text.", is matched, \g<text> is defined but \g<content> is not, because that match contains no tag and therefore no content. re.sub is still asked to expand \g<content>, so it raises an "unmatched group" error (Python 3.5 and later substitute an empty string for an unmatched group instead). If you use the string "Here's some <first>sample stuff</first> in the <second>middle</second>", then re.sub(pattern, r"\g<text>\g<content>", my_str) will work, because \g<content> is defined for every non-empty match there.
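To see why the template form fails on the original string, it can help to print what each match captures. The following is a minimal sketch using the same pattern and string; the helper name keep_text_and_content is hypothetical and not part of the re module. It treats an unmatched content group as an empty string, which is one way to get the behaviour the template was meant to have.

```python
import re

pattern = r'(?P<text>.*?)(?:<(?P<tag>\w+)>(?P<content>.*)</(?P=tag)>|$)'
my_str = ("Here's some <first>sample stuff</first> in the "
          "<second>middle</second> of some other text.")

# Show the 'text' and 'content' groups of every match.
# For the trailing text, 'content' did not participate and is None.
for m in re.finditer(pattern, my_str):
    print(repr(m.group('text')), repr(m.group('content')))


def keep_text_and_content(m):
    # Hypothetical helper: fall back to '' when 'content' is unmatched (None).
    return m.group('text') + (m.group('content') or '')


cleaned = re.sub(pattern, keep_text_and_content, my_str)
print(cleaned)
```

With the sample string this should print "Here's some sample stuff in the middle of some other text." as the last line, the same result as the repl function above.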