Why is in_array strict mode on integers slower than non-strict mode?
It seems to have something to do with the type of element in the needle and/or haystack, observe:
PHP 7.3.5 from http://sandbox.onlinephpfunctions.com/
$iterations = 10000000;
$needle = false;
$haystack = [ true ];
$start = microtime( true );
for( $i = 0; $i < $iterations; ++$i )
{
in_array( $needle, $haystack, true );
}
echo ( microtime( true ) - $start ).' strict'.PHP_EOL;
$start = microtime( true );
for( $i = 0; $i < $iterations; ++$i )
{
in_array( $needle, $haystack, false );
}
echo ( microtime( true ) - $start ).' not strict';
produces:
0.29996585845947 strict
0.40397191047668 not strict
but if we use:
$needle = 1;
$haystack = [ 2 ];
then we get:
0.34480714797974 strict
0.28275084495544 not strict
However, PHP 5.6.29 produces a negligible discrepancy and running the same test multiple times can put strict ahead of non-strict and vice versa.
I can offer some small insight from tracing through the C source for in_array
.
It turns out, when comparing integers, the path to reach the actual equality check for non-strict mode involves fewer operations than strict mode.
Strict mode
In the case where the strict
flag to in_array
is true, the following occurs:
We call
fast_is_identical_function
for each element in the arrayThe
fast_is_identical_function
first tests that the types of each operand are different (Z_TYPE_P(op1) != Z_TYPE_P(op2)
) in hopes of being able to returnfalse
early; this is comparison #1.If the types are the same (they are, in your test case), we then test
(Z_TYPE_P(op1) <= IS_TRUE
; I've no idea what this does, but it's comparison #2.After both comparisons have evaluated to
false
, we jump intozend_is_identical
, our first function invocation.zend_is_identical
starts out by again testingZ_TYPE_P(op1) != Z_TYPE_P(op2)
, another attempt to fail early. This is comparison #3.If the types are the same, we can descend through the
switch (Z_TYPE_P(op1))
statement, comparison #4Finally we reach the comparison
Z_LVAL_P(op1) == Z_LVAL_P(op2)
which actually tests the equality of the two values, comparison #5.
In total, to test whether each element of the array is equal to the value we're searching for, there are 5 comparisons and 1 function invocation.
Non-strict mode
By comparison, the non-strict flow for integers specifically (really LONG
s) is much simpler, as follows:
Instead of
fast_is_identical_function
, we instead usefast_equal_check_function
for each element in the array.The method
fast_equal_check_function
starts a much more complicated process of comparing the two values with all kinds of type-casting logic. However, the very first test it does happens to be optimized for integers, as follows:if (EXPECTED(Z_TYPE_P(op1) == IS_LONG)) { if (EXPECTED(Z_TYPE_P(op2) == IS_LONG)) { return Z_LVAL_P(op1) == Z_LVAL_P(op2);
We can see that it...
- immediately tests whether type of
op1
isLONG
, which it is, then - immediately tests whether the type of
op2
isLONG
, which it is, then - immediately returns the result of
Z_LVAL_P(op1) == Z_LVAL_P(op2)
- immediately tests whether type of
A total of 3 simple equality comparisons and 0 function invocations for the non-strict case, vs at least 5 comparisons and 1 jump for the strict case.
This appears to be a case where attempted early optimization makes the strict check slower (by repeatedly testing the types of the operands in hopes that we can find inequality sooner) than the specific non-strict case of comparing two integers.