Why doesn't Normalizer::normalize (PHP) work?
Found on this page (the linked document now has different wording; the old version no longer exists):
Unicode and internationalization is a large topic, but you should know at least one more important thing. For historical reasons, Unicode allows alternative representations of some characters. For example, á can be written either as one precomposed character á with the Unicode code point U+00E1 or as a decomposed sequence of the letter a (U+0061) combined with the accent ´ (U+0301). For purposes of comparison and sorting, two such representations should be taken as equal. To solve this, the intl library provides the Normalizer class. This class in turn provides the normalize() method, which you can use to convert a string to a normalized composed or decomposed form. Your application should consistently transform all strings to one or the other form before performing comparisons.
echo Normalizer::normalize("a\u{0301}", Normalizer::FORM_C); // á (single code point U+00E1)
echo Normalizer::normalize("á", Normalizer::FORM_D); // a followed by the combining accent U+0301
So eliminating accents (and similar marks) is not the purpose of Normalizer.
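As a minimal sketch of the comparison advice in the quoted passage (assuming the intl extension is loaded and PHP 7.0 or later for the \u{...} escape; the variable names are only illustrative), the two spellings of á differ byte for byte until both are normalized to the same form:

```php
<?php
// Precomposed "á" (U+00E1) versus "a" followed by the combining acute accent (U+0301).
$composed   = "\u{00E1}";
$decomposed = "a\u{0301}";

// The raw byte strings differ, so a plain comparison fails.
var_dump($composed === $decomposed); // bool(false)

// After normalizing both to the same form, they compare equal.
$a = Normalizer::normalize($composed, Normalizer::FORM_C);
$b = Normalizer::normalize($decomposed, Normalizer::FORM_C);
var_dump($a === $b); // bool(true)
```

Either FORM_C or FORM_D works here, as long as both sides use the same one.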
Normalizer with FORM_D can split the diacritics from their base characters; preg_replace can then remove the diacritics:
$string = 'áéíóú';
echo preg_replace('/[\x{0300}-\x{036f}]/u', "", Normalizer::normalize($string , Normalizer::FORM_D));
//aeiou
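The same idea can be wrapped in a small helper. This is only a sketch: stripDiacritics is a hypothetical name, not part of intl, and it assumes the intl extension is available. It uses \p{Mn} (nonspacing marks) instead of the fixed U+0300 to U+036F range, which also catches combining marks from other blocks, and it falls back to the input when normalization fails:

```php
<?php
// Hypothetical helper (not from the original answer): removes combining marks
// after canonical decomposition. Requires the intl extension.
function stripDiacritics(string $text): string
{
    $decomposed = Normalizer::normalize($text, Normalizer::FORM_D);
    if ($decomposed === false) {
        // normalize() returns false on failure; keep the input untouched.
        return $text;
    }
    // \p{Mn} matches nonspacing marks, which includes U+0300 to U+036F.
    $stripped = preg_replace('/\p{Mn}+/u', '', $decomposed);
    // preg_replace() returns null on error; fall back to the input in that case.
    return $stripped ?? $text;
}

echo stripDiacritics('áéíóú'), "\n"; // aeiou
echo stripDiacritics('ø'), "\n";     // ø (no canonical decomposition, so unchanged)
```

Letters such as ø or ł are separate characters rather than a base letter plus a mark, so this approach leaves them as they are.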
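What you are looking for is iconv("UTF-8", "ISO-8859-1//TRANSLIT", $text). Note that ISO-8859-1 itself contains accented letters such as á, so if the goal is to drop the accents entirely, "ASCII//TRANSLIT" is the target charset to try.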
http://php.net/manual/function.iconv.php
Be careful with the LC_* settings! Depending on the locale, the transliteration result might change.