How can I detect laughing words in a string?
Try this pattern:
\b(?:a*(?:ha)+h?|(?:l+o+)+l+)\b
Or, better, if your regex flavour supports atomic groups and possessive quantifiers:
\b(?>a*+(?:ha)++h?|(?:l+o+)++l+)\b
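For a quick check in Python, here is a minimal sketch using the first (non-atomic) form, which should work on any Python 3 version; as far as I know, the atomic/possessive form is only accepted by Python's re module from 3.11 onward. The sample sentence and the name LAUGH are made up for illustration.

```python
import re

# Non-atomic form of the pattern above; LAUGH is just an illustrative name.
LAUGH = re.compile(r"\b(?:a*(?:ha)+h?|(?:l+o+)+l+)\b")

# Made-up sample text: two laughing words plus words that merely start with "ha".
text = "ahahah that was great, looool, but the haunted hall was not"

# No capturing groups in the pattern, so findall returns the whole matches.
print(LAUGH.findall(text))  # ['ahahah', 'looool']
```

Note that this pattern accepts a single "ha" as a laughing word, since (?:ha)+ only requires one repetition.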
\b(a*ha+h[ha]*|o?l+o+l+[ol]*)\b
Matches:
hahahah
haha
lol
loll
loool
looooool
lolololol
lolololololo
ahaha
aaaahahahahahaha
Does not match:
looo
oool
oooo
llll
ha
l
o
lo
ol
ah
aah
aha
kill
lala
haunt
hauha
louol
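The two lists above can be checked mechanically. This is a rough sketch in Python, assuming each entry is tested as a standalone word with fullmatch; the helper is_laugh and the list names are hypothetical.

```python
import re

# The pattern given with the match lists above.
PATTERN = re.compile(r"\b(a*ha+h[ha]*|o?l+o+l+[ol]*)\b")

should_match = ["hahahah", "haha", "lol", "loll", "loool", "looooool",
                "lolololol", "lolololololo", "ahaha", "aaaahahahahahaha"]
should_not_match = ["looo", "oool", "oooo", "llll", "ha", "l", "o", "lo",
                    "ol", "ah", "aah", "aha", "kill", "lala", "haunt",
                    "hauha", "louol"]

def is_laugh(word):
    """Return True if the whole word is a laughing word under PATTERN."""
    return PATTERN.fullmatch(word) is not None

# Collect any word that behaves differently from what the lists claim.
false_negatives = [w for w in should_match if not is_laugh(w)]
false_positives = [w for w in should_not_match if is_laugh(w)]

print(false_negatives, false_positives)  # [] []
```

Both lists come back empty, so the pattern agrees with every example listed.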
In Python, I did it this way:
import re
# Replace each laughing word with a <laugh> placeholder.
print(re.sub(r"\b(?:a{0,2}h{1,2}a{0,2}){2,}h?\b", "<laugh>", "hahahahha! I love laughing"))
# Output: <laugh>! I love laughing
To keep it simple, because the solutions posted may be overly complicated for what you want to do: if the only things you count as "laughing words" are ha, haha, etc. and lol, lolol, lololol, etc., then the following regular expression will be sufficient:
\b(?:(?:ha)+|l(?:ol)+)\b
This assumes a regex dialect in which \b represents a word boundary, which you seem to be using.
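One detail worth checking with a pattern of this shape: | has the lowest precedence in a regex, so both \b anchors only apply to both alternatives when the alternation is wrapped in a group. A small Python sketch of the difference, with made-up sample text and illustrative names UNGROUPED and GROUPED:

```python
import re

# Without a group, this reads as: (\b(ha)+)  OR  (l(ol)+\b).
UNGROUPED = re.compile(r"\b(?:ha)+|l(?:ol)+\b")
# With a group, both word boundaries apply to both alternatives.
GROUPED = re.compile(r"\b(?:(?:ha)+|l(?:ol)+)\b")

text = "haha, the haunted hall, lololol"

print(UNGROUPED.findall(text))  # ['haha', 'ha', 'ha', 'lololol']
print(GROUPED.findall(text))    # ['haha', 'lololol']
```

The ungrouped form picks up the "ha" at the start of "haunted" and "hall", because nothing requires a word boundary after the (ha)+ branch.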