How to count the number of words in a sentence, ignoring numbers, punctuation and whitespace?
str.split() without any arguments splits on runs of whitespace characters:
>>> s = 'I am having a very nice day.'
>>>
>>> len(s.split())
7
From the linked documentation:
If sep is not specified or is None, a different splitting algorithm is applied: runs of consecutive whitespace are regarded as a single separator, and the result will contain no empty strings at the start or end if the string has leading or trailing whitespace.
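To illustrate the quoted behaviour, here is a small sketch comparing split() with no arguments against split(' '). The sample string and the variable names are only illustrative.

```python
# Sample string with leading, trailing and repeated spaces (illustrative only)
s = '  I am   having a very nice day.  '

# No separator: runs of whitespace are treated as one separator,
# and no empty strings are produced at the ends
default_tokens = s.split()

# Explicit single-space separator: every space splits,
# so empty strings appear in the result
space_tokens = s.split(' ')

print(len(default_tokens))
print('' in space_tokens)
```

This is why len(s.split()) gives a usable count even when the spacing in the sentence is irregular.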
You can use re.findall():
import re
line = " I am having a very nice day."
count = len(re.findall(r'\w+', line))
print(count)
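Note that \w+ also matches digits and underscores, so a token such as "23" is counted as a word. If numbers should be ignored, as the question asks, one possible variation is the character class [^\W\d_], which keeps word characters but excludes digits and underscores. This is only a sketch; the sample sentence is borrowed from another answer in this thread.

```python
import re

line = "I am having a very nice 23!@$ day. "

# \w+ matches letters, digits and underscores, so "23" is included
with_digits = re.findall(r"\w+", line)

# [^\W\d_]+ matches runs of word characters that are not digits or underscores
letters_only = re.findall(r"[^\W\d_]+", line)

print(len(with_digits), len(letters_only))
```

One caveat: for a mixed token such as "abc123", this pattern still counts the "abc" part as a word, so it may not suit every input.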
import string
s = "I am having a very nice 23!@$ day. "
sum(i.strip(string.punctuation).isalpha() for i in s.split())
The statement above goes through each whitespace-separated chunk of text, strips leading and trailing punctuation from it, and then checks whether what remains consists only of alphabetic characters.
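The same idea can be wrapped in a small helper for reuse. The function name count_words is hypothetical and not part of the original answer; this is a minimal sketch of the strip-then-isalpha approach.

```python
import string


def count_words(sentence):
    """Count chunks that are purely alphabetic once surrounding punctuation is stripped.

    count_words is a hypothetical helper name used for illustration.
    """
    return sum(
        1
        for chunk in sentence.split()
        if chunk.strip(string.punctuation).isalpha()
    )


print(count_words("I am having a very nice 23!@$ day. "))
```

Because strip() only removes punctuation at the ends of a chunk, a contraction such as "don't" is not counted by this approach, since the apostrophe in the middle makes isalpha() return False.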