How to decode Unicode escape sequences like "\u00ed" to proper UTF-8 encoded characters?
Try this:
$str = preg_replace_callback('/\\\\u([0-9a-fA-F]{4})/', function ($match) {
return mb_convert_encoding(pack('H*', $match[1]), 'UTF-8', 'UCS-2BE');
}, $str);
In case it's UTF-16 based C/C++/Java/Json-style:
$str = preg_replace_callback('/\\\\u([0-9a-fA-F]{4})/', function ($match) {
return mb_convert_encoding(pack('H*', $match[1]), 'UTF-8', 'UTF-16BE');
}, $str);
print_r(json_decode('{"t":"\u00ed"}')); // -> stdClass Object ( [t] => í )
PHP 7+
As of PHP 7, you can use the Unicode codepoint escape syntax to do this.
echo "\u{00ed}";
outputs í
.