URL regex validation

I'll post this even though an answer has already been accepted.

That regex is still incomplete.

http://www.-1-.de does not contain a valid domain name, but it would pass your test.

Here's what I use:

~^
(?:ht|f)tps?://

(?:[a-z0-9] (?:[a-z0-9-]*[a-z0-9])?      \.)*

(?:[a-z0-9][a-z0-9-]{0,61}[a-z0-9])
(?:\.[a-z]{2,5}){1,2}

$~ix

This covers http(s) and ftp(s) URLs as well as two-part TLDs such as .co.uk. It also allows subdomains that are only 1 character long (m.example.com for mobile versions of webpages) but rejects m-.example.com.

Some might still object to the regex's completeness, since .pro domains require a name of at least 4 characters. ;-)

Also, IDN domain names will only pass my regex after conversion to Punycode (i.e. in the "xn--" format).
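
For anyone who needs the same check in JavaScript, here is a rough port as a sketch. JavaScript has no x (extended) flag, so the pattern is assembled from commented string parts instead. The names scheme, subdomains, domain, tld, urlPattern and isValidHostUrl are my own, not from the answer, and the domain label is capped at 63 characters ({0,61} between its first and last character), which is the DNS label limit.

```javascript
// Sketch only: a JavaScript port of the extended-mode PCRE pattern.
// All identifiers below are illustrative names, not part of the original answer.
const scheme = "(?:ht|f)tps?://";
// Zero or more subdomain labels; each starts and ends with a letter or digit.
const subdomains = "(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\\.)*";
// The domain label: 2 to 63 characters, no leading or trailing hyphen.
const domain = "(?:[a-z0-9][a-z0-9-]{0,61}[a-z0-9])";
// One or two alphabetic TLD parts, e.g. ".de" or ".co.uk".
const tld = "(?:\\.[a-z]{2,5}){1,2}";

const urlPattern = new RegExp("^" + scheme + subdomains + domain + tld + "$", "i");

function isValidHostUrl(url) {
  return urlPattern.test(url);
}

console.log(isValidHostUrl("http://www.-1-.de"));       // false
console.log(isValidHostUrl("http://m.example.com"));    // true
console.log(isValidHostUrl("http://m-.example.com"));   // false
console.log(isValidHostUrl("https://example.co.uk"));   // true
console.log(isValidHostUrl("http://example.com/path")); // false
```

The last call shows a consequence of the $ anchor: the pattern checks scheme and host name only, so anything with a path, port or query string is rejected.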


You have to escape your special characters (the / characters and the . after www in this case) and add the missing trailing /, like this:

var re = /^(https?:\/\/)?(www\.)?[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,5}\.?/;
if (!re.test(url)) {
    alert("url error");
    return false;
}
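
To see why the escaping matters, here is a small sketch with made-up patterns (loose, strict and schemeLiteral are illustrative names, and www.example.com is just a placeholder host). An unescaped . matches any single character, and an unescaped / would end the regex literal early.

```javascript
// Unescaped: each "." matches any single character.
const loose = /^www.example.com$/;
// Escaped: each "\." matches only a literal dot.
const strict = /^www\.example\.com$/;

console.log(loose.test("wwwXexample-com"));  // true (probably not what you want)
console.log(strict.test("wwwXexample-com")); // false
console.log(strict.test("www.example.com")); // true

// Inside a regex literal the "/" delimiter must be escaped as "\/".
const schemeLiteral = /^https?:\/\//;
console.log(schemeLiteral.test("https://example.com")); // true
```

If the pattern is built with new RegExp("...") from a string instead, the slashes need no escaping, but every backslash has to be doubled.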

Just in case you want to know whether the URL really exists:

function url_exist($url) { // returns true if the URL answers with a status below 400
    $c = curl_init();
    curl_setopt($c, CURLOPT_URL, $url);
    curl_setopt($c, CURLOPT_HEADER, 1); // get the header
    curl_setopt($c, CURLOPT_NOBODY, 1); // and *only* get the header
    curl_setopt($c, CURLOPT_RETURNTRANSFER, 1); // get the response as a string from curl_exec(), rather than echoing it
    curl_setopt($c, CURLOPT_FRESH_CONNECT, 1); // force a new connection instead of reusing a cached one
    if (curl_exec($c) === false) {
        // the request itself failed (DNS error, timeout, refused connection, ...)
        return false;
    }
    // a response arrived; treat 4xx and 5xx status codes as "does not exist"
    $httpcode = curl_getinfo($c, CURLINFO_HTTP_CODE);
    return ($httpcode < 400);
}