Regex to extract subdomain from URL?


Then $3 (or \3) will contain "subdomain" if one was supplied.

If you want to have the subdomain in the first group, and your regex engine supports non-capturing groups (shy groups), use this as suggested by palindrom:


Purely the subdomain string (result is $1):


Making http:// optional (result is $2):


Making the http:// and the subdomain optional (result is $3):


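Since the group number that holds the subdomain depends on how many capturing groups come before it, a small sketch may help. The patterns, the variable names, and the `example.com` domain below are illustrative assumptions, not the expressions from the answer above.

```javascript
// Illustrative patterns only: example.com and both regexes are assumptions.
var url = "http://sub.example.com/page";

// Every parenthesis captures, so the subdomain ends up in group 3.
var capturing = /^(http:\/\/)?(([^.\/]+)\.)?example\.com/;

// Wrapping the optional parts in (?:...) leaves only one capturing group,
// so the subdomain moves to group 1.
var nonCapturing = /^(?:http:\/\/)?(?:([^.\/]+)\.)?example\.com/;

var subFromCapturing = url.match(capturing)[3];
var subFromNonCapturing = url.match(nonCapturing)[1];

// With no subdomain present, the optional group does not participate,
// so its slot in the match array is undefined.
var bareSub = "http://example.com".match(nonCapturing)[1];

console.log(subFromCapturing, subFromNonCapturing, bareSub);
```

Because the subdomain group is optional here, check for `undefined` before using the captured value.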
It should just be


The subdomain will be in the first group.

The problem with the above regex is that if you do not know what the protocol or the domain suffix is, you will get some unexpected results. Here is a little regex that accounts for those situations. :D

/(?:http[s]*\:\/\/)*(.*?)\.(?=[^\/]*\..{2,5})/i  //javascript

This should always return your subdomain (if present) in group 1. Here it is in a JavaScript example, but it should also work in any other engine that supports positive look-ahead assertions:

// EXAMPLE of use
var regex = /(?:http[s]*\:\/\/)*(.*?)\.(?=[^\/]*\..{2,5})/i
  , whoKnowsWhatItCouldBe = [
        "" // matches: www
      , "" // does not match
      , "" // does not match
      , "" // does not match
      , "" // does not match
      , "" // does not match
      , "" // matches: what-ever
      , "" // matches: dev-www
      , "" // matches: hot-MamaSitas
      , "" // matches: hot-MamaSitas
      , "пуст.пустыня.ru" // even non-English chars! Woohoo! matches: пуст
      , "пустыня.ru" // does not match
    ];

// Run a loop and test it out.
for ( var i = 0, length = whoKnowsWhatItCouldBe.length; i < length; i++ ){
    var result = whoKnowsWhatItCouldBe[i].match(regex);
    if(result != null){
      // YAY! We have a match! The subdomain is in group 1.
      console.log(result[1]);
    } else {
      // Boo... No subdomain was found
      console.log("no subdomain");
    }
}

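The look-ahead approach can also be wrapped in a small helper so callers do not have to deal with the match array. This is a minimal sketch: the regex is the one from this answer, while the names `SUBDOMAIN_REGEX` and `extractSubdomain` and the `example.com` inputs are hypothetical.

```javascript
// The regex is taken from the answer above; the helper name is hypothetical.
var SUBDOMAIN_REGEX = /(?:http[s]*\:\/\/)*(.*?)\.(?=[^\/]*\..{2,5})/i;

// Returns the text captured in group 1, or null when the regex finds no subdomain.
function extractSubdomain(url) {
  var result = url.match(SUBDOMAIN_REGEX);
  return result === null ? null : result[1];
}

console.log(extractSubdomain("http://www.example.com"));           // www
console.log(extractSubdomain("https://dev-www.example.com/path")); // dev-www
console.log(extractSubdomain("http://example.com"));               // null
console.log(extractSubdomain("пуст.пустыня.ru"));                  // пуст
```

One caveat: the look-ahead only requires a second dot followed by 2 to 5 characters, so a two-part suffix such as `example.co.uk` is reported as having the subdomain `example`. If such suffixes matter, the result needs an extra check against a list of known suffixes.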