Preferred method to store PHP arrays (json_encode vs serialize)

I've written a blogpost about this subject: "Cache a large array: JSON, serialize or var_export?". In this post it is shown that serialize is the best choice for small to large sized arrays. For very large arrays (> 70MB) JSON is the better choice.


You might also be interested in https://github.com/phadej/igbinary - which provides a different serialization 'engine' for PHP.

My random/arbitrary 'performance' figures, using PHP 5.3.5 on a 64bit platform show :

JSON :

  • JSON encoded in 2.180496931076 seconds
  • JSON decoded in 9.8368630409241 seconds
  • serialized "String" size : 13993

Native PHP :

  • PHP serialized in 2.9125759601593 seconds
  • PHP unserialized in 6.4348418712616 seconds
  • serialized "String" size : 20769

Igbinary :

  • WIN igbinary serialized in 1.6099879741669 seconds
  • WIN igbinrary unserialized in 4.7737920284271 seconds
  • WIN serialized "String" Size : 4467

So, it's quicker to igbinary_serialize() and igbinary_unserialize() and uses less disk space.

I used the fillArray(0, 3) code as above, but made the array keys longer strings.

igbinary can store the same data types as PHP's native serialize can (So no problem with objects etc) and you can tell PHP5.3 to use it for session handling if you so wish.

See also http://ilia.ws/files/zendcon_2010_hidden_features.pdf - specifically slides 14/15/16


JSON is simpler and faster than PHP's serialization format and should be used unless:

  • You're storing deeply nested arrays: json_decode(): "This function will return false if the JSON encoded data is deeper than 127 elements."
  • You're storing objects that need to be unserialized as the correct class
  • You're interacting with old PHP versions that don't support json_decode

Depends on your priorities.

If performance is your absolute driving characteristic, then by all means use the fastest one. Just make sure you have a full understanding of the differences before you make a choice

  • Unlike serialize() you need to add extra parameter to keep UTF-8 characters untouched: json_encode($array, JSON_UNESCAPED_UNICODE) (otherwise it converts UTF-8 characters to Unicode escape sequences).
  • JSON will have no memory of what the object's original class was (they are always restored as instances of stdClass).
  • You can't leverage __sleep() and __wakeup() with JSON
  • By default, only public properties are serialized with JSON. (in PHP>=5.4 you can implement JsonSerializable to change this behavior).
  • JSON is more portable

And there's probably a few other differences I can't think of at the moment.

A simple speed test to compare the two

<?php

ini_set('display_errors', 1);
error_reporting(E_ALL);

// Make a big, honkin test array
// You may need to adjust this depth to avoid memory limit errors
$testArray = fillArray(0, 5);

// Time json encoding
$start = microtime(true);
json_encode($testArray);
$jsonTime = microtime(true) - $start;
echo "JSON encoded in $jsonTime seconds\n";

// Time serialization
$start = microtime(true);
serialize($testArray);
$serializeTime = microtime(true) - $start;
echo "PHP serialized in $serializeTime seconds\n";

// Compare them
if ($jsonTime < $serializeTime) {
    printf("json_encode() was roughly %01.2f%% faster than serialize()\n", ($serializeTime / $jsonTime - 1) * 100);
}
else if ($serializeTime < $jsonTime ) {
    printf("serialize() was roughly %01.2f%% faster than json_encode()\n", ($jsonTime / $serializeTime - 1) * 100);
} else {
    echo "Impossible!\n";
}

function fillArray( $depth, $max ) {
    static $seed;
    if (is_null($seed)) {
        $seed = array('a', 2, 'c', 4, 'e', 6, 'g', 8, 'i', 10);
    }
    if ($depth < $max) {
        $node = array();
        foreach ($seed as $key) {
            $node[$key] = fillArray($depth + 1, $max);
        }
        return $node;
    }
    return 'empty';
}