Python: convert camel case to space delimited using RegEx and taking Acronyms into account
This should work with 'divLineColor', 'simpleBigURL', 'OldHTMLFile' and 'SQLServer'.
label = re.sub(r'((?<=[a-z])[A-Z]|(?<!\A)[A-Z](?=[a-z]))', r' \1', label)
Explanation:
label = re.sub(r"""
( # start the group
# alternative 1
(?<=[a-z]) # current position is preceded by a lower char
# (positive lookbehind: does not consume any char)
[A-Z] # an upper char
#
| # or
# alternative 2
(?<!\A) # current position is not at the beginning of the string
# (negative lookbehind: does not consume any char)
[A-Z] # an upper char
(?=[a-z]) # matches if next char is a lower char
# lookahead assertion: does not consume any char
) # end the group""",
r' \1', label, flags=re.VERBOSE)
If a match is found it is replaced with ' \1'
, which is a string consisting of a leading blank and the match itself.
Alternative 1 for a match is an upper character, but only if it is preceded by a lower character. We want to translate abYZ
to ab YZ
and not to ab Y Z
.
Alternative 2 for a match is an upper character, but only if it is followed by a lower character and not at the start of the string. We want to translate ABCyz
to AB Cyz
and not to A B Cyz
.
\g<0>
references the matched string of the whole pattern while \g<1>
refereces the matched string of the first subpattern ((…)
). So you should use \g<1>
and \g<2>
instead:
label = re.sub("([a-z])([A-Z])","\g<1> \g<2>",label)
I know, it's not regex. But, you can also use map
like this
>>> s = 'camelCaseTest'
>>> ''.join(map(lambda x: x if x.islower() else " "+x, s))
'camel Case Test'