Best way to split a first and last name in PHP
This library has parsed names flawlessly for me so far: https://github.com/joshfraser/PHP-Name-Parser
The accepted answer doesn't work for languages other than English, or for names such as "Oscar de la Hoya".
Here's something I wrote that I think is UTF-8 safe and works for all of those cases. It builds on the accepted answer's assumption that a prefix or suffix will contain a period:
/**
 * Splits a single name string into salutation, first, last, suffix.
 *
 * @param string $name
 * @return array
 */
public static function doSplitName($name)
{
    $results = array();
    $r = explode(' ', $name);
    $size = count($r);

    // Check the first token for a period; assume a salutation if found
    if (mb_strpos($r[0], '.') === false)
    {
        $results['salutation'] = '';
        $results['first'] = $r[0];
    }
    else
    {
        $results['salutation'] = $r[0];
        // Guard against input that is only a salutation, e.g. "Mr."
        $results['first'] = isset($r[1]) ? $r[1] : '';
    }

    // Check the last token for a period; assume a suffix if found
    if (mb_strpos($r[$size - 1], '.') === false)
    {
        $results['suffix'] = '';
    }
    else
    {
        $results['suffix'] = $r[$size - 1];
    }

    // Combine whatever remains into the last name
    $start = ($results['salutation']) ? 2 : 1;
    $end = ($results['suffix']) ? $size - 2 : $size - 1;
    $last = '';
    for ($i = $start; $i <= $end; $i++)
    {
        $last .= ' '.$r[$i];
    }
    $results['last'] = trim($last);

    return $results;
}
Here's the PHPUnit test:
public function testDoSplitName()
{
    $array = array(
        'FirstName LastName',
        'Mr. First Last',
        'First Last Jr.',
        'Shaqueal O\'neal',
        'D’angelo Hall',
        'Václav Havel',
        'Oscar De La Hoya',
        'АБВГҐД ЂЃЕЀЁЄЖЗ', //cyrillic
        'דִּיש מַחֲזֹור', //yiddish
    );
    $assertions = array(
        array(
            'salutation' => '',
            'first' => 'FirstName',
            'last' => 'LastName',
            'suffix' => ''
        ),
        array(
            'salutation' => 'Mr.',
            'first' => 'First',
            'last' => 'Last',
            'suffix' => ''
        ),
        array(
            'salutation' => '',
            'first' => 'First',
            'last' => 'Last',
            'suffix' => 'Jr.'
        ),
        array(
            'salutation' => '',
            'first' => 'Shaqueal',
            'last' => 'O\'neal',
            'suffix' => ''
        ),
        array(
            'salutation' => '',
            'first' => 'D’angelo',
            'last' => 'Hall',
            'suffix' => ''
        ),
        array(
            'salutation' => '',
            'first' => 'Václav',
            'last' => 'Havel',
            'suffix' => ''
        ),
        array(
            'salutation' => '',
            'first' => 'Oscar',
            'last' => 'De La Hoya',
            'suffix' => ''
        ),
        array(
            'salutation' => '',
            'first' => 'АБВГҐД',
            'last' => 'ЂЃЕЀЁЄЖЗ',
            'suffix' => ''
        ),
        array(
            'salutation' => '',
            'first' => 'דִּיש',
            'last' => 'מַחֲזֹור',
            'suffix' => ''
        ),
    );
    foreach ($array as $key => $name)
    {
        $result = Customer::doSplitName($name);
        $this->assertEquals($assertions[$key], $result);
    }
}
A regex is the best way to handle something like this. Try this snippet; it pulls out the prefix, first name, last name, and suffix:
$array = array(
    'FirstName LastName',
    'Mr. First Last',
    'First Last Jr.',
    'Shaqueal O’neal',
    'D’angelo Hall',
);
foreach ($array as $name)
{
    $results = array();
    // Print the input on its own line, followed by the captured groups
    echo $name, PHP_EOL;
    preg_match('#^(\w+\.)?\s*([\'\’\w]+)\s+([\'\’\w]+)\s*(\w+\.?)?$#', $name, $results);
    print_r($results);
}
The result comes out like this:
FirstName LastName
Array
(
[0] => FirstName LastName
[1] =>
[2] => FirstName
[3] => LastName
)
Mr. First Last
Array
(
[0] => Mr. First Last
[1] => Mr.
[2] => First
[3] => Last
)
First Last Jr.
Array
(
[0] => First Last Jr.
[1] =>
[2] => First
[3] => Last
[4] => Jr.
)
Shaqueal O’neal
Array
(
[0] => Shaqueal O’neal
[1] =>
[2] => Shaqueal
[3] => O’neal
)
D’angelo Hall
Array
(
[0] => D’angelo Hall
[1] =>
[2] => D’angelo
[3] => Hall
)
etc…
So in the $results array, $results[0] contains the entire string. $results[2] is always the first name and $results[3] is always the last name. $results[1] is the prefix and $results[4] (not always set) is the suffix.
I also added code to handle both ' and ’ for names like Shaqueal O’neal and D’angelo Hall.
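As a rough sketch of how the numeric indexes above could be made easier to work with, the same pattern can be written with named groups. The function name parseNameRegex is hypothetical, the /u modifier is my addition so that ’ is treated as a single character, and the code assumes PHP 7.1 or later:

```php
<?php
// Hypothetical helper (not from the original answer): same pattern, named groups.
function parseNameRegex(string $name): ?array
{
    // The /u modifier is an addition so the multibyte ’ counts as one character.
    $pattern = '#^(?<prefix>\w+\.)?\s*(?<first>[\'’\w]+)\s+(?<last>[\'’\w]+)\s*(?<suffix>\w+\.?)?$#u';

    if (preg_match($pattern, trim($name), $m) !== 1) {
        return null; // the name does not fit the prefix/first/last/suffix shape
    }

    return array(
        'prefix' => $m['prefix'] ?? '',
        'first'  => $m['first'],
        'last'   => $m['last'],
        // Trailing groups that did not match are dropped by preg_match
        'suffix' => $m['suffix'] ?? '',
    );
}
```

Note that this pattern expects exactly one word each for the first and last name, so a name such as 'Oscar De La Hoya' does not match and the sketch returns null for it.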
You won't find a safe way to solve this problem. Not even a human can always tell which parts belong to the first name and which belong to the last name, especially when one of them contains several words, as in Andrea Frank Gutenberg. The middle part, Frank, could be a second first name, or it could be the last name, with Gutenberg as a maiden name.
The best you can do is provide separate input fields for first name and last name and save them separately in the database. You can avoid a lot of problems this way.
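A minimal sketch of that idea, assuming a form that submits two hypothetical fields named first_name and last_name (the function name collectNameFields is also made up for illustration). Each value is only tidied up, never split, so a multi-word name stays exactly where the user typed it:

```php
<?php
// Hypothetical helper: reads separately submitted name fields without guessing.
function collectNameFields(array $input): array
{
    $clean = array();
    foreach (array('first_name', 'last_name') as $field) {
        $value = isset($input[$field]) ? (string) $input[$field] : '';
        // Collapse runs of whitespace; /u keeps multibyte names intact.
        // preg_replace returns null on invalid UTF-8, so fall back to ''.
        $collapsed = preg_replace('/\s+/u', ' ', $value);
        $clean[$field] = trim($collapsed ?? '');
    }
    return $clean;
}

// Example: the user decides that "Frank" belongs to the first name.
$fields = collectNameFields(array(
    'first_name' => 'Andrea Frank',
    'last_name'  => 'Gutenberg',
));
```

The two values can then go into separate database columns as they are.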