REGEX: Capture Filename from URL without file extension

Tested and works, even for URLs whose last segment has no file extension:

var re = /([\w\d_-]*)\.?[^\\\/]*$/i;

var url = "http://stackoverflow.com/questions/3671522/regex-capture-filename-from-url-without-file-extention";
alert(url.match(re)[1]); // 'regex-capture-filename-from-url-without-file-extention'

url = 'http://gunblad3.blogspot.com/2008/05/uri-url-parsing.html';
alert(url.match(re)[1]); // 'uri-url-parsing'

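([\w\d_-]*) captures a run of letters, digits, underscores or hyphens (\w already covers digits and the underscore, so [\w-]* is equivalent).
\.? optionally matches a period right after the captured name.
[^\\\/]*$ matches the rest of the string, which must contain no slash or backslash up to the very end.
/i ignores case. It is redundant here, since \w already matches both upper- and lower-case letters.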
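As a minimal sketch, the pattern above can be wrapped in a small helper. The name baseNameFromUrl is hypothetical and not part of the original answer, and the character class is written in its shorter equivalent form. The sample URLs also show a limitation worth knowing: the capture stops at the first period, so a multi-dot file name is cut short.

```javascript
// Hypothetical helper wrapping the pattern above; not from the original answer.
// [\w-] is equivalent to [\w\d_-], and /i is dropped because \w already
// matches both upper- and lower-case letters.
var re = /([\w-]*)\.?[^\\\/]*$/;

function baseNameFromUrl(url) {
  var m = url.match(re);
  // The pattern can match the empty string at the very end, so the captured
  // group may be "" (for example, when the URL ends with a slash).
  return m ? m[1] : "";
}

console.log(baseNameFromUrl("http://example.com/docs/uri-url-parsing.html")); // "uri-url-parsing"
console.log(baseNameFromUrl("http://example.com/docs/readme"));               // "readme"
console.log(baseNameFromUrl("http://example.com/js/jquery.min.js"));          // "jquery"
console.log(baseNameFromUrl("http://example.com/docs/"));                     // ""
```

If names such as jquery.min.js should come back as jquery.min, this pattern is not the right fit, because everything after the first period is treated as the extension.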

Another approach uses a positive lookahead and requires the URL to end in an extension:

var url = "http://example.com/index.htm";
var filename = url.match(/([^\/]+)(?=\.\w+$)/)[0];

Let's go through the regular expression:

[^\/]+    # one or more characters that aren't a slash
(?=       # open a positive lookahead assertion
  \.      # a literal dot character
  \w+     # one or more word characters
  $       # end of string boundary
)         # end of the lookahead

This expression matches a run of non-slash characters that is immediately followed (thanks to the lookahead) by an extension and the end of the string. In other words, it matches everything after the last slash up to the final extension. If the URL does not end in an extension, match returns null, so indexing the result with [0] throws a TypeError.
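A minimal sketch of a null-safe variant follows. The helper name nameWithoutExtension and the fallback to the whole last segment are assumptions for illustration, not part of the original answer.

```javascript
// Hypothetical helper around the lookahead pattern; not from the original answer.
function nameWithoutExtension(url) {
  var m = url.match(/[^\/]+(?=\.\w+$)/);
  if (m) {
    return m[0];
  }
  // No ".ext" at the end of the string: fall back to everything after the
  // last slash (or the whole string if there is no slash at all).
  return url.substring(url.lastIndexOf("/") + 1);
}

console.log(nameWithoutExtension("http://example.com/index.htm"));        // "index"
console.log(nameWithoutExtension("http://example.com/js/jquery.min.js")); // "jquery.min"
console.log(nameWithoutExtension("http://example.com/docs/readme"));      // "readme"
```

Because [^\/]+ is greedy and only backs off until the lookahead succeeds, only the final extension is stripped, which is why jquery.min.js yields jquery.min here.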

Alternatively, you can do this without regular expressions altogether by finding the positions of the last / and the last . with lastIndexOf and taking the substring between them. This assumes the last . comes after the last /, that is, that the file name actually has an extension:

var url = "http://example.com/index.htm";
var filename = url.substring(url.lastIndexOf("/") + 1, url.lastIndexOf("."));
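When the last segment has no extension, the last . is the one in the host name, and substring silently swaps its arguments: for "http://example.com/index" the two-line version above returns ".com/". A minimal guarded sketch follows; the helper name nameBetweenSlashAndDot is hypothetical, and treating a leading dot (as in .htaccess) as part of the name is an assumption.

```javascript
// Hypothetical guarded version of the lastIndexOf approach; not from the original answer.
function nameBetweenSlashAndDot(url) {
  var start = url.lastIndexOf("/") + 1;
  var dot = url.lastIndexOf(".");
  // Only treat the dot as an extension separator when it comes after the
  // first character of the last segment; otherwise keep the whole segment.
  var end = dot > start ? dot : url.length;
  return url.substring(start, end);
}

console.log(nameBetweenSlashAndDot("http://example.com/index.htm")); // "index"
console.log(nameBetweenSlashAndDot("http://example.com/index"));     // "index"
```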
