How to best configure PHP to handle a UTF-8 website

2018 update:

Note that these php.ini entries are deprecated (since PHP 5.6, in favor of default_charset):

;mbstring.internal_encoding = utf-8
;mbstring.http_input =
;mbstring.http_output = utf-8

Next ...

PHP - set UTF-8 for the following, via a config.php file for your web app:

 ini_set('default_charset', 'UTF-8');
 mb_internal_encoding('UTF-8');
 // On PHP 5.6+ the two iconv calls below raise a deprecation notice; default_charset covers them.
 iconv_set_encoding('internal_encoding', 'UTF-8');
 iconv_set_encoding('output_encoding', 'UTF-8');
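
To confirm that the settings above actually took effect, a small check like the following can help. It is only a sketch: the helper name utf8_runtime_report() is made up for illustration, and it assumes the mbstring extension is loaded.

```php
<?php
// Apply the settings, as a config.php for the app might do.
ini_set('default_charset', 'UTF-8');
mb_internal_encoding('UTF-8');

// Hypothetical helper: report the encodings PHP is actually using right now.
function utf8_runtime_report(): array
{
    return [
        'default_charset'   => ini_get('default_charset'),
        'internal_encoding' => mb_internal_encoding(),
    ];
}

print_r(utf8_runtime_report());

// Byte-oriented and character-oriented functions disagree on multi-byte text.
$sample = 'เห็น';
echo strlen($sample), "\n";    // 12 (bytes)
echo mb_strlen($sample), "\n"; // 4 (code points)
```

If the report shows anything other than UTF-8, the setting is probably being overridden later in the request.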

MariaDB / MySQL - set the connection charset via:

 // $mysqli is your open mysqli connection object
 $mysqli->set_charset('utf8mb4');
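
In context, the charset call usually sits right after the connection is opened. A minimal sketch, assuming the mysqli extension is available; the function name open_utf8_connection() and its parameters are placeholders, not part of any library:

```php
<?php
// Hypothetical helper: open a mysqli connection and switch it to utf8mb4.
// Host, user, password and database name are supplied by the caller.
function open_utf8_connection(string $host, string $user, string $pass, string $db): mysqli
{
    $mysqli = new mysqli($host, $user, $pass, $db);
    if ($mysqli->connect_errno) {
        throw new RuntimeException('Connect failed: ' . $mysqli->connect_error);
    }
    // set_charset() returns false on failure.
    if (!$mysqli->set_charset('utf8mb4')) {
        throw new RuntimeException('Could not set charset: ' . $mysqli->error);
    }
    return $mysqli;
}
```

Setting the charset before running any query keeps all data on that connection in one encoding.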

HTML Pages - Set via:

 <meta charset="utf-8">

The supposed issues of PHP with Unicode content have been somewhat overstated. I've been building multilingual websites since 1998 and never knew there might be an issue until I read about it somewhere, many years and websites later.

This works just fine for me:

Apache configuration (in httpd.conf or .htaccess)

AddDefaultCharset utf-8

PHP (in php.ini)

default_charset = "utf-8"
mbstring.internal_encoding=utf-8
mbstring.http_output=UTF-8
mbstring.encoding_translation=On
; note: mbstring.func_overload was deprecated in PHP 7.2 and removed in PHP 8.0
mbstring.func_overload=6

MySQL

Create your database with a utf8_* collation, let the tables inherit the database collation, and start every connection with "SET NAMES utf8".

HTML (in HEAD element)

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

I was facing the same issue with UTF-8 characters. Everything worked on the live and staging servers, but it sometimes broke on my dev machine. The behavior was strange: sometimes the characters were encoded properly, but on a random page reload they broke into diamond characters ('���เห็นอเวิลด์!���') or question marks ('??�เห็นอเวิลด์!???'), or about 85% of the data rendered properly ('เห็นอเวิลด์!?��') while the remaining 15% showed mismatched characters. To track the issue down, I started with this checklist:

1 - Check that the character set is declared in the HTML head

2 - Check that the data is saved correctly in the MySQL table

3 - Check that MySQL has the proper encoding settings for UTF-8

4 - Check that Apache is set up to serve the UTF-8 character set

5 - Check that a simple PHP script can echo "เห็นอเวิลด์" and output the same text as the input

6 - Check that PHP sends the proper headers

7 - Check that a MySQL query returns the same data, "เห็นอเวิลด์"

8 - Check whether "เห็นอเวิลด์" contains HTML special characters, and handle them properly

9 - Check whether "เห็นอเวิลด์" passes through any HTML encode/decode function

10 - Check that .htaccess is set up for the UTF-8 character set

Work through the whole list to find out where things break.
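
One way to work through checks like 5, 7 and 9 is to log the same value at each stage and see where it stops being valid UTF-8. The helper below is a rough sketch for that; utf8_probe() is a made-up name, and it assumes the mbstring extension is loaded.

```php
<?php
// Hypothetical helper: describe a string at one checkpoint of the checklist.
function utf8_probe(string $label, string $value): string
{
    $valid = mb_check_encoding($value, 'UTF-8') ? 'valid' : 'INVALID';
    return sprintf(
        '%s: %s UTF-8, %d bytes, hex=%s',
        $label,
        $valid,
        strlen($value),
        bin2hex($value)
    );
}

// A literal from a UTF-8 encoded source file should be reported as valid.
echo utf8_probe('literal', 'เห็น'), "\n";

// Cutting a multi-byte character in half with a byte-oriented function
// leaves an invalid sequence, which browsers render as a diamond.
echo utf8_probe('truncated', substr('เห็น', 0, 4)), "\n";
```

Calling it on the raw database value, on the value after any encode/decode function, and just before output should narrow down the stage that corrupts the text.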

Give this a try (I am using CodeIgniter):

=================================
:: PHP ini Settings::
=================================

default_charset = "utf-8"
mbstring.internal_encoding=utf-8
mbstring.http_output=UTF-8
mbstring.encoding_translation=On
mbstring.func_overload=6 

=================================
:: .htaccess Settings::
=================================

DefaultLanguage en-US
AddDefaultCharset UTF-8

=================================
:: HTML Header Page::
=================================

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

=================================
:: PHP Codeigniter index.php ::
=================================

header('Content-Type: text/html; charset=UTF-8');

=================================
:: Codeigniter config.php ::
=================================

$config['charset'] = 'UTF-8';

=================================
:: Codeigniter database.php ::
=================================

$db['default']['char_set'] = 'utf8';
$db['default']['dbcollat'] = 'utf8_general_ci';

=================================
:: Codeigniter helper function (optional) ::
=================================

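if (!function_exists('safe_utf_string')) {
    // Escape HTML special characters, treating the input as UTF-8.
    function safe_utf_string($utf8string = '') {
        $utf8string = htmlspecialchars($utf8string, ENT_QUOTES, 'UTF-8');
        // With no source encoding given, this converts from mbstring's internal encoding.
        return mb_convert_encoding($utf8string, 'UTF-8');
    }
}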
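
To see what the escaping step in a helper like this does to multi-byte text, here is a short standalone sketch. The sample strings are made up for illustration; only built-in PHP functions are used.

```php
<?php
// With an explicit 'UTF-8' charset, only the HTML special characters change.
$input = '<b>เห็น</b> & "quotes"';
$escaped = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');
echo $escaped, "\n"; // &lt;b&gt;เห็น&lt;/b&gt; &amp; &quot;quotes&quot;

// An invalid byte sequence makes htmlspecialchars() return an empty string
// unless ENT_SUBSTITUTE (or ENT_IGNORE) is set.
$broken = "abc\xE0";
var_dump(htmlspecialchars($broken, ENT_QUOTES, 'UTF-8'));
var_dump(htmlspecialchars($broken, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8'));
```

Output that silently turns empty is often a sign that the input was not valid UTF-8 when it reached the escaping function.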

And finally, don't forget to say thanks :) to @djn's answer.


PHP copes just fine!

You should set the php.ini "default_charset" parameter to 'utf-8'.

Then make sure that:

<head>
  <meta http-equiv="Content-Type"
    content="text/html; charset=utf-8"
    />

is at the top of every page you serve.

There are a few problem areas:

Databases -- make sure they are configured to use utf-8 by default or enter a world of pain.

IDEs/Editors -- a lot of editors don't support utf-8 well. I normally use vim, which doesn't, but it's never been a big problem.

Documents -- I just spent a whole afternoon getting PHP to read Thai characters out of a spreadsheet. I was eventually successful but am still not sure what I did right.
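
For document imports, one common approach (not necessarily what worked in the case above) is to convert the bytes to UTF-8 only when they are not already valid UTF-8. The sketch below assumes the mbstring extension; to_utf8() is a made-up helper, and the source encoding is a guess the caller has to supply, since it is not detected here. Check which encoding names your PHP build accepts before relying on one for Thai data.

```php
<?php
// Hypothetical helper: return $bytes as UTF-8, converting only when needed.
// $assumedSource is the encoding the file is believed to be in.
function to_utf8(string $bytes, string $assumedSource): string
{
    if (mb_check_encoding($bytes, 'UTF-8')) {
        return $bytes; // already valid UTF-8, leave untouched
    }
    return mb_convert_encoding($bytes, 'UTF-8', $assumedSource);
}

// A single Latin-1 byte is not valid UTF-8, so it gets converted.
echo bin2hex(to_utf8("\xE9", 'ISO-8859-1')), "\n"; // c3a9, the UTF-8 bytes for "é"

// Text that is already UTF-8 passes through unchanged.
echo to_utf8('เห็น', 'ISO-8859-1'), "\n";
```

Converting text that is already UTF-8 a second time is a frequent cause of mojibake, which is why the validity check comes first.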
