Regex for names

While I agree with the answers saying you basically can't do this with regex, I will point out that some of the objections (internationalized characters) can be resolved by using Unicode-aware (UTF-8) strings and the \p{L} character class, which matches any Unicode "letter".
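
As a rough illustration of the Unicode-letter idea, here is a Python sketch. Python's built-in re module does not understand \p{L} (the third-party regex package does), so the sketch swaps in [^\W\d_], which is only an approximation of "any letter". The names LETTER, UNICODE_NAME and looks_like_name are made up for the example.

```python
import re

# Assumption: [^\W\d_] stands in for \p{L}. It means "a word character that
# is neither a digit nor an underscore", which is close to "a letter".
LETTER = r"[^\W\d_]"

# One letter, then one or more letters, hyphens or apostrophes.
UNICODE_NAME = re.compile(rf"^{LETTER}(?:{LETTER}|[-'])+$")

def looks_like_name(text: str) -> bool:
    """True if the whole string is made of letters, hyphens and apostrophes."""
    return UNICODE_NAME.match(text) is not None

for sample in ["Jérémie", "Łukasz", "D'Angelo", "R2D2"]:
    print(sample, looks_like_name(sample))
```

In an engine that supports it (PCRE with the u modifier, for instance), \p{L} would be the more direct choice.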


  • Hyphenated Names (Worthington-Smythe)

Add a - into the second character class. The easiest way to do that is to add it at the start so that it can't possibly be interpreted as a range modifier (as in a-z).

^[A-Z][-a-zA-Z]+$
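
A quick way to try the hyphenated pattern, sketched in Python (the helper name is_valid is an assumption for the example). The last two lines show why the position of the - inside the class matters.

```python
import re

# Pattern from the text: a capital letter, then letters or hyphens.
HYPHENATED = re.compile(r"^[A-Z][-a-zA-Z]+$")

def is_valid(name: str) -> bool:
    return HYPHENATED.match(name) is not None

print(is_valid("Worthington-Smythe"))
print(is_valid("Smith"))
print(is_valid("worthington-smythe"))  # no leading capital

# First in the class, "-" is a literal hyphen; between two characters it
# defines a range instead.
print(re.fullmatch(r"[-az]", "-") is not None)  # class holds "-", "a", "z"
print(re.fullmatch(r"[a-z]", "-") is not None)  # class is the range a..z
```

Note that the pattern does not limit how often the hyphen appears, so a string such as A-- also passes.
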

  • Names with Apostrophes (D'Angelo)

A naive way of doing this would be as above, giving:

^[A-Z][-'a-zA-Z]+$

Don't forget that you may need to escape the apostrophe inside a string literal! A 'better' way, given your example, might be:

^[A-Z]'?[-a-zA-Z]+$

Which will allow a possible single apostrophe in the second position.
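
To see how the two apostrophe patterns differ, here is a small Python comparison (NAIVE, STRICTER and check are names invented for this sketch). Using a double-quoted raw string sidesteps the escaping issue mentioned above.

```python
import re

# Double-quoted raw strings avoid having to escape the apostrophe.
NAIVE = re.compile(r"^[A-Z][-'a-zA-Z]+$")      # apostrophe allowed anywhere
STRICTER = re.compile(r"^[A-Z]'?[-a-zA-Z]+$")  # at most one, in position two

def check(name: str) -> tuple:
    """Return (naive_result, stricter_result) for one candidate."""
    return (NAIVE.match(name) is not None, STRICTER.match(name) is not None)

for candidate in ["D'Angelo", "O''Neil", "Dangel'o", "D'"]:
    print(candidate, check(candidate))
```

Both accept D'Angelo. The naive pattern also lets through O''Neil and a bare D', while the stricter one rejects any apostrophe that is not the second character.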

  • Names with Spaces (Van der Humpton) - handling capitals in the middle, which may or may not be required, is way beyond my interest at this stage.

Here I'd be tempted to just do our naive way again:

^[A-Z]'?[- a-zA-Z]+$

A potentially better way might be:

^[A-Z]'?[-a-zA-Z]+( [a-zA-Z]+)*$

Which looks for extra words at the end. This probably isn't a good idea if you're trying to match names in a body of extra text, but then again, the original wouldn't have done that well either.
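
A sketch of the difference between treating a space as just another character and treating it as a word separator. WORD_BASED is my reading of the "extra words" idea, assuming each extra word is one space followed by one or more letters; the names in the code are invented for the example.

```python
import re

# Naive pattern from the text: a space is just another allowed character.
NAIVE_SPACES = re.compile(r"^[A-Z]'?[- a-zA-Z]+$")

# Assumption: an "extra word" is one space followed by one or more letters.
WORD_BASED = re.compile(r"^[A-Z]'?[-a-zA-Z]+(?: [a-zA-Z]+)*$")

def compare(name: str) -> tuple:
    """Return (naive_result, word_based_result) for one candidate."""
    return (NAIVE_SPACES.match(name) is not None,
            WORD_BASED.match(name) is not None)

for candidate in ["Van der Humpton", "Van  der Humpton", "Van "]:
    print(repr(candidate), compare(candidate))
```

Both accept Van der Humpton, but only the naive pattern accepts a doubled or trailing space.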

  • Joint Names (Ben & Jerry)

At this point you're not looking at single names anymore?

Anyway, as you can see, regexes have a habit of growing very quickly...


THE BEST REGULAR EXPRESSIONS FOR NAMES:

  • I will use the term special character to refer to the following three characters:
    1. Dash -
    2. Apostrophe '
    3. Dot .
  • Spaces and special characters cannot appear twice in a row (e.g. --, '. and .. are all rejected)
  • The name must be trimmed (no spaces before or after)
  • You're welcome ;)

Mandatory single name, WITHOUT spaces, WITHOUT special characters:

^([A-Za-z])+$
  • Sierra is valid, Jack Alexander is invalid (has a space), O'Neil is invalid (has a special character)

Mandatory single name, WITHOUT spaces, WITH special characters:

^[A-Za-z]+(((\'|\-|\.)?([A-Za-z])+))?$
  • Sierra is valid, O'Neil is valid, Jack Alexander is invalid (has a space)

Mandatory single name, optional additional names, WITH spaces, WITH special characters:

^[A-Za-z]+((\s)?((\'|\-|\.)?([A-Za-z])+))*$
  • Jack Alexander is valid, Sierra O'Neil is valid

Mandatory single name, optional additional names, WITH spaces, WITHOUT special characters:

^[A-Za-z]+((\s)?([A-Za-z])+)*$
  • Jack Alexander is valid, Sierra O'Neil is invalid (has a special character)
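
The four variants can be checked side by side with a small Python table. The patterns are copied from above; the labels and the accepts helper are made up for this sketch.

```python
import re

# The four patterns from the text, keyed by a short (made-up) label.
PATTERNS = {
    "single, no specials": r"^([A-Za-z])+$",
    "single, specials": r"^[A-Za-z]+(((\'|\-|\.)?([A-Za-z])+))?$",
    "multi, specials": r"^[A-Za-z]+((\s)?((\'|\-|\.)?([A-Za-z])+))*$",
    "multi, no specials": r"^[A-Za-z]+((\s)?([A-Za-z])+)*$",
}

def accepts(label: str, name: str) -> bool:
    return re.match(PATTERNS[label], name) is not None

for label in PATTERNS:
    row = {name: accepts(label, name)
           for name in ["Sierra", "O'Neil", "Jack Alexander", "Sierra O'Neil"]}
    print(label, row)
```

One thing this makes visible: the "single, specials" pattern allows only one special character per name, so Anne-Marie-Louise is rejected by it but accepted by the "multi, specials" pattern.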

SPECIAL CASE

Many modern smart devices add a space at the end of each word, so in my applications I allow an unlimited number of spaces before and after the string, and then trim it in the code-behind. So I use the following:

Mandatory single name + optional additional names + spaces + special characters:

^(\s)*[A-Za-z]+((\s)?((\'|\-|\.)?([A-Za-z])+))*(\s)*$
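
The validate-then-trim flow could look roughly like this in Python. The function name clean_name and the choice to return None for invalid input are assumptions for the sketch, not part of the original answer.

```python
import re
from typing import Optional

# Pattern from the text: leading and trailing whitespace is tolerated.
LENIENT = re.compile(r"^(\s)*[A-Za-z]+((\s)?((\'|\-|\.)?([A-Za-z])+))*(\s)*$")

def clean_name(raw: str) -> Optional[str]:
    """Return the trimmed name if it validates, otherwise None."""
    if LENIENT.match(raw) is None:
        return None
    return raw.strip()

print(repr(clean_name("  Jack Alexander ")))
print(repr(clean_name("Sierra O'Neil\t")))
print(repr(clean_name("Jack  Alexander")))  # double space inside the name
```

Keep in mind that \s also matches tabs and newlines, not only the space character.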

Add your own special characters

If you wish to add your own special characters, let's say an underscore (_), this is the group you need to update:

(\'|\-|\.)

To

(\'|\-|\.|\_)
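
Rather than editing the group by hand, the alternation can be generated from a list of characters. This is a sketch only: build_name_pattern is a hypothetical helper, and it relies on re.escape to escape characters such as - and . for you.

```python
import re

def build_name_pattern(specials: str = "'-.") -> re.Pattern:
    """Build the multi-name pattern with a custom set of special characters."""
    # Each character becomes one escaped alternative, e.g. '|\-|\.
    alternatives = "|".join(re.escape(ch) for ch in specials)
    return re.compile(rf"^[A-Za-z]+((\s)?(({alternatives})?([A-Za-z])+))*$")

default_pattern = build_name_pattern()
with_underscore = build_name_pattern("'-._")  # underscore added

for name in ["O'Neil", "Mary_Ann"]:
    print(name,
          default_pattern.match(name) is not None,
          with_underscore.match(name) is not None)
```

With the default set Mary_Ann is rejected, and once the underscore is added it passes, while O'Neil passes either way.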

PS: If you have questions comment here and I will receive an email and respond ;)


This regex is perfect for me.

^([ \u00c0-\u01ffa-zA-Z'\-])+$

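It works fine in PHP using preg_match(), but it doesn't work everywhere.

It matches Jérémie O'Co-nor, so I think it covers names written with accented Latin letters (the range runs from U+00C0 to U+01FF), but not names in other scripts such as Cyrillic or Chinese.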
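
To get a feel for what the U+00C0 to U+01FF range does and does not cover, here is a Python sketch (Python's re accepts \uXXXX escapes in patterns). The helper name and the NFC normalization step are assumptions added for the example.

```python
import re
import unicodedata

# Pattern from the text: space, U+00C0..U+01FF, ASCII letters, ' and -.
LATIN_NAME = re.compile(r"^([ \u00c0-\u01ffa-zA-Z'\-])+$")

def matches_latin_name(name: str) -> bool:
    # Assumption: compose accents first (NFC) so "é" is a single code point.
    composed = unicodedata.normalize("NFC", name)
    return LATIN_NAME.match(composed) is not None

for sample in ["Jérémie O'Co-nor", "Łukasz", "Дмитрий", "×"]:
    print(sample, matches_latin_name(sample))
```

The accented Latin names pass and the Cyrillic one does not. The range also contains a couple of non-letters, such as the multiplication sign (U+00D7), so those pass as well.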
