Is it safe to use `strstr` to search for multibyte UTF-8 characters in a string?
Edit
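Based on the OP's updated question, "can such a false positive exist in a UTF-8 context?":
No. UTF-8 is designed so that the kind of partial character match described in the original answer below cannot occur. ASCII bytes, lead bytes, and continuation bytes occupy disjoint ranges, so a valid UTF-8 sequence can never match starting in the middle of another character. It is therefore safe to use `strstr` with UTF-8 encoded multibyte characters, provided both strings are valid UTF-8.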
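As a minimal sketch of the UTF-8 case, the helper below simply wraps `strstr` and reports a byte offset. The name `utf8_find` is hypothetical (it is not a standard function), and both arguments are assumed to be valid, NUL-terminated UTF-8; the byte sequences are written as hex escapes so the example does not depend on the source file's encoding.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: byte offset of the first occurrence of `needle`
 * in `haystack`, or -1 if it does not occur.
 * Assumption: both strings are valid, NUL-terminated UTF-8. */
static long utf8_find(const char *haystack, const char *needle)
{
    /* strstr compares raw bytes; with valid UTF-8 a match can only
     * start on a character boundary. */
    const char *hit = strstr(haystack, needle);
    return hit ? (long)(hit - haystack) : -1;
}
```

For instance, `utf8_find("\xE2\x82\xAC", "\xC2\xAC")` searches "€" (E2 82 AC) for "¬" (C2 AC). The two characters share the trailing byte AC, but the lead bytes differ, so no match is reported. Note that the result is a byte offset, not a character index.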
Original Answer
No, `strstr` is not suitable for strings containing multi-byte characters.
If you search for a string that contains no multi-byte characters inside a string that does contain them, it may give a false positive. (With Shift-JIS encoding in a Japanese locale, `strstr("掘something", "@some")` may give a false positive, because the second byte of 掘 is 0x40, the same byte as ASCII `@`.)
    +---------+----+----+----+
    |   c1    | c2 | c3 | c4 |   <--- string
    +---------+----+----+----+
         +----+----+----+
         | c5 | c2 | c3 |        <--- string to search
         +----+----+----+
If the trailing part of c1 (accidentally) matches c5, you may get an incorrect result. I would suggest using Unicode with a Unicode-aware substring function, or a multibyte substring function (`_mbsstr`, for example).
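To illustrate the idea behind a multibyte-aware search, here is a rough sketch for Shift-JIS that only attempts a match at character boundaries. The names `sjis_is_lead` and `sjis_strstr` are hypothetical, the lead-byte ranges (0x81-0x9F and 0xE0-0xFC) are an assumption that covers common Shift-JIS variants, and this is not how `_mbsstr` is necessarily implemented.

```c
#include <stddef.h>
#include <string.h>

/* Assumption: Shift-JIS lead bytes are 0x81-0x9F and 0xE0-0xFC. */
static int sjis_is_lead(unsigned char b)
{
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

/* Hypothetical character-aware search: like strstr, but a match may
 * only begin on a character boundary of `haystack`.
 * Assumption: both strings are valid, NUL-terminated Shift-JIS. */
static const char *sjis_strstr(const char *haystack, const char *needle)
{
    size_t n = strlen(needle);
    const unsigned char *p = (const unsigned char *)haystack;

    while (*p != '\0') {
        if (strncmp((const char *)p, needle, n) == 0)
            return (const char *)p;
        /* Step over one whole character: two bytes after a lead byte
         * (unless the string is truncated), otherwise one byte. */
        if (sjis_is_lead(*p) && p[1] != '\0')
            p += 2;
        else
            p += 1;
    }
    /* An empty needle matches at the start, as with strstr. */
    return n == 0 ? haystack : NULL;
}
```

With the haystack written as raw Shift-JIS bytes, `"\x8C\x40something"` (掘 followed by "something"), plain `strstr` finds `"@some"` at byte offset 1, inside 掘, while `sjis_strstr` returns `NULL` because it skips both bytes of 掘 together.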