Regex - finding capital words in string
The last letter of each match ends up in its own group because of the inner parentheses. Drop them and findall returns just the words.
>>> t = re.findall('([A-Z][a-z]+)', line)
>>> t
['Cow', 'Apple', 'Woof']
>>> t = re.findall('([A-Z]([a-z])+)', line)
>>> t
[('Cow', 'w'), ('Apple', 'e'), ('Woof', 'f')]
The count of capitalised words is, of course, len(t).
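As a minimal sketch of why the tuples appear: findall returns the text of the capturing groups whenever the pattern contains any, so a nested group adds a tuple element per match. The example below reuses the sample string 'Cow Apple think Woof' and shows a non-capturing group, (?:...), as one way to keep the inner grouping without changing the result shape.

```python
import re

line = 'Cow Apple think Woof'  # sample input, assumed for illustration

# One capturing group: findall returns that group's text for each match.
print(re.findall(r'([A-Z][a-z]+)', line))

# Nested capturing groups: findall returns one tuple per match,
# and the inner group holds only the last letter it matched.
print(re.findall(r'([A-Z]([a-z])+)', line))

# A non-capturing inner group keeps the grouping but adds no tuple element.
words = re.findall(r'[A-Z](?:[a-z])+', line)
print(words, len(words))
```

With no capturing groups at all, findall simply returns the full matches, so len(words) is the number of capitalised words.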
I use the findall function to find all instances that match the regex, then use len to see how many matches there are; in this case, there are 3. You can check whether the length is at least 2 to get True or False.
import re
line = 'Cow Apple think Woof'
test = re.findall(r'(\b[A-Z]([a-z])*\b)',line)
print(len(test) >= 2)
If you want to use only regex, you can search for a capitalized word then some characters in between and another capitalized word.
test = re.search(r'(\b[A-Z][a-z]*\b)(.*)(\b[A-Z][a-z]*\b)',line)
print(bool(test))
(\b[A-Z][a-z]*\b) - finds a capitalized word
(.*) - matches 0 or more characters
(\b[A-Z][a-z]*\b) - finds the second capitalized word
This method isn't as flexible, since the pattern has to be rewritten by hand to require 3 or more capitalized words.
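One possible way around that limitation, sketched here rather than taken from the answer above, is to build the pattern with a repetition count. The helper name has_n_capitalized and its parameters are hypothetical, chosen only for this illustration.

```python
import re

def has_n_capitalized(text, n):
    """Return True if text contains at least n capitalized words.

    Hypothetical helper for illustration; not part of the original answer.
    """
    # Repeat "a capitalized word, then as few characters as possible" n times.
    pattern = r'(?:\b[A-Z][a-z]*\b.*?){%d}' % n
    return re.search(pattern, text) is not None

line = 'Cow Apple think Woof'
print(has_n_capitalized(line, 3))
print(has_n_capitalized(line, 4))
```

Note that . does not match newlines by default, so this sketch assumes the words are on one line; counting matches with len(re.findall(...)) avoids that restriction.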
import re
sent = "His email is [email protected], however his wife uses [email protected]"
x = re.findall(r'[A-Za-z]+@[A-Za-z.]+', sent)
print(x)
If an email address is followed by a period (as in abc@some.com.), the period will be included at the end of the match. However, this can be dealt with separately.
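One simple way to handle the trailing period separately, as a sketch only: strip it from each match after findall has run. The sample sentence and addresses below are placeholders invented for this example, not the ones from the post.

```python
import re

# Hypothetical sample text; the addresses are placeholders.
sent = "Write to alice@example.com. Her colleague uses bob@mail.example.org"

found = re.findall(r'[A-Za-z]+@[A-Za-z.]+', sent)

# A sentence-ending period is swept up by the character class,
# so trim any trailing periods from each match afterwards.
cleaned = [addr.rstrip('.') for addr in found]

print(found)
print(cleaned)
```

This keeps the pattern unchanged and only post-processes the results; it assumes a valid address never ends in a period.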