What is the regex to validate Linux users?
Sorry for necrobumping this almost 4-year-old question, but it comes up pretty high on Internet search results and it warrants a little more attention.
A more accurate regex is (yes, I know, despite the man page):
^[a-z_]([a-z0-9_-]{0,31}|[a-z0-9_-]{0,30}\$)$
Hopefully that helps some of those searching.
To break it down:
1. It should start (^) with only a lowercase letter or an underscore ([a-z_]). This occupies exactly 1 character.
2. Then it should be one of either (( ... )):
   1. From 0 to 31 characters ({0,31}) of lowercase letters, numbers, underscores, and/or hyphens ([a-z0-9_-]), OR (|)
   2. From 0 to 30 characters of the above plus a USD symbol (\$) at the end, and then
3. No more characters past this pattern ($).
For those unfamiliar with regex patterns, you may ask why the dollar sign has a backslash in 2.2 but not in 3. This is because in most (all?) regex variants, an unescaped dollar sign marks the end of a string (or line, etc.). To match a literal dollar sign that is part of the actual string, it has to be escaped, and how depends on the engine being used (off the top of my head, I can't think of a regex engine that doesn't use a backslash as the escape in a pure expression).
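For anyone who wants to try the pattern from code, here is a minimal sketch using the POSIX regcomp/regexec API from <regex.h>. The helper name username_matches is my own, not part of any library, and the only change to the pattern is that the backslash is doubled because it sits inside a C string literal.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* The pattern from this answer as a POSIX extended regex.
 * "\\$" in C source is the two characters \$ in the regex: a literal dollar. */
static const char *USERNAME_PATTERN =
    "^[a-z_]([a-z0-9_-]{0,31}|[a-z0-9_-]{0,30}\\$)$";

/* Hypothetical helper (name is an assumption): true if `name` matches. */
bool username_matches(const char *name)
{
    regex_t re;
    bool ok;

    if (name == NULL) {
        return false;
    }
    /* REG_EXTENDED enables ( ), | and {m,n}; REG_NOSUB skips capture tracking.
     * REG_NEWLINE is deliberately not set, so the final $ only matches at the
     * very end of the string. */
    if (regcomp(&re, USERNAME_PATTERN, REG_EXTENDED | REG_NOSUB) != 0) {
        return false; /* treat a compile failure as "not valid" */
    }
    ok = (regexec(&re, name, 0, NULL, 0) == 0);
    regfree(&re);
    return ok;
}
```

With this, names such as root, _apt, or svc-backup$ should pass, while 1user, Root, or a$b should be rejected. Recompiling the regex on every call keeps the sketch short; real code would likely compile it once and reuse it.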
Note that Debian and Ubuntu remove some restrictions for a fully POSIX/shadow upstream compliant username (for instance, and I don't know if this has been fixed, but they allow the username to start with a number, which actually is what caused this bug). If you want to guarantee cross-platform compatibility, I'd recommend the above regex pattern rather than whatever passes or fails the check in Debian, Ubuntu, and others.
From the man page of useradd (8):
It is usually recommended to only use usernames that begin with a lower case letter or an underscore, followed by lower case letters, digits, underscores, or dashes. They can end with a dollar sign. In regular expression terms: [a-z_][a-z0-9_-]*[$]?
On Debian, the only constraints are that usernames must neither start with a dash ('-') nor contain a colon (':') or a whitespace (space: ' ', end of line: '\n', tabulation: '\t', etc.). Note that using a slash ('/') may break the default algorithm for the definition of the user's home directory.
Usernames may only be up to 32 characters long.
So, there's a general recommendation. The actual constraints depend on the specifics of your implementation / distribution. On Debian-based systems, there apparently are hardly any hard constraints. In fact, I just tried useradd '€' on my Ubuntu box, and it worked. Of course, this may break some applications that do not expect such unusual usernames. To avoid such problems, it is best to follow the general recommendation.
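To make the relaxed Debian rules quoted above concrete, here is a rough sketch that transcribes only what the man page excerpt says: no leading dash, no colon, no whitespace, and at most 32 characters. It is not Debian's actual code. The function name debian_relaxed_name_ok and the SKETCH_MAX_LEN constant are made up for illustration, rejecting the empty string is my own assumption, and the length is counted in bytes, which the excerpt does not specify.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumption: the "up to 32 characters" from the man page, counted in bytes. */
#define SKETCH_MAX_LEN 32

/* Hypothetical helper: checks only the constraints quoted from the man page. */
bool debian_relaxed_name_ok(const char *name)
{
    /* Rejecting NULL and "" is an assumption; the excerpt is silent on it. */
    if (name == NULL || *name == '\0') {
        return false;
    }
    /* Must not start with a dash. */
    if (*name == '-') {
        return false;
    }
    if (strlen(name) > SKETCH_MAX_LEN) {
        return false;
    }
    /* Must not contain a colon or any whitespace (space, '\n', '\t', ...). */
    for (const char *p = name; *p != '\0'; ++p) {
        if (*p == ':' || isspace((unsigned char)*p)) {
            return false;
        }
    }
    return true;
}
```

Under these rules 1user and even '€' are accepted, while -user, a:b, and names containing a space or tab are not, which shows how much looser this is than the recommended pattern.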
The general rule for a username is that its length must not exceed 32 characters. What else counts as a valid username depends on your distribution.
In Debian, shadow-utils 4.1, there is an is_valid_name function in chkname.c:
static bool is_valid_name (const char *name)
{
	/*
	 * User/group names must match [a-z_][a-z0-9_-]*[$]
	 */
	if (('\0' == *name) ||
	    !((('a' <= *name) && ('z' >= *name)) || ('_' == *name))) {
		return false;
	}

	while ('\0' != *++name) {
		if (!(( ('a' <= *name) && ('z' >= *name) ) ||
		      ( ('0' <= *name) && ('9' >= *name) ) ||
		      ('_' == *name) ||
		      ('-' == *name) ||
		      ( ('$' == *name) && ('\0' == *(name + 1)) )
		     )) {
			return false;
		}
	}

	return true;
}
The length of the username is checked before that, in is_valid_user_name:
bool is_valid_user_name (const char *name)
{
	/*
	 * User names are limited by whatever utmp can
	 * handle.
	 */
	if (strlen (name) > USER_NAME_MAX_LENGTH) {
		return false;
	}

	return is_valid_name (name);
}