REGEX: Capture Filename from URL without file extension
Tested and works, even for pages without a file extension.
var re = /([\w\d_-]*)\.?[^\\\/]*$/i;
var url = "http://stackoverflow.com/questions/3671522/regex-capture-filename-from-url-without-file-extention";
alert(url.match(re)[1]); // 'regex-capture-filename-from-url-without-file-extention'
url = 'http://gunblad3.blogspot.com/2008/05/uri-url-parsing.html';
alert(url.match(re)[1]); // 'uri-url-parsing'
([\w\d_-]*)
get a string containing letters, digits, underscores or hyphens.
\.?
perhaps the string is followed by a period.
[^\\\/]*$
but certainly not followed by a slash or backslash till the very end.
/i
oh yeh, ignore case.
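For reuse, the pattern can be wrapped in a small function. This is only a sketch: the name baseNameFromUrl is hypothetical and not part of the answer above, and returning null for an empty capture is an assumption about what a caller would want. Note that the capture group stops at the first period, so a name with several dots is cut short.

```javascript
// Hypothetical helper (name is an assumption) wrapping the pattern from the answer above.
function baseNameFromUrl(url) {
  var re = /([\w\d_-]*)\.?[^\\\/]*$/i;
  var match = url.match(re);
  // The pattern can match an empty string (e.g. a URL ending in a slash),
  // so treat an empty capture as "no filename".
  return match && match[1] ? match[1] : null;
}

console.log(baseNameFromUrl("http://gunblad3.blogspot.com/2008/05/uri-url-parsing.html")); // 'uri-url-parsing'
console.log(baseNameFromUrl("http://example.com/archive.tar.gz")); // 'archive' (stops at the first period)
console.log(baseNameFromUrl("http://example.com/dir/")); // null
```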
var url = "http://example.com/index.htm";
var filename = url.match(/([^\/]+)(?=\.\w+$)/)[0];
Let's go through the regular expression:
[^\/]+ # one or more characters that aren't a slash
(?= # open a positive lookahead assertion
\. # a literal dot character
\w+ # one or more word characters
$ # end of string boundary
) # end of the lookahead
This expression will collect all characters that aren't a slash that are immediately followed (thanks to the lookahead) by an extension and the end of the string -- or, in other words, everything after the last slash and until the extension.
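One thing to watch: because the lookahead requires an extension, match returns null for a URL whose last segment has no dot, and indexing [0] on null throws. A guarded version might look like the sketch below; the function name nameBeforeExtension is hypothetical, and returning null on no match is an assumption.

```javascript
// Hypothetical wrapper (name is an assumption) around the lookahead pattern.
function nameBeforeExtension(url) {
  var match = url.match(/([^\/]+)(?=\.\w+$)/);
  // match is null when the URL does not end in ".something"
  return match ? match[0] : null;
}

console.log(nameBeforeExtension("http://example.com/index.htm")); // 'index'
console.log(nameBeforeExtension("http://example.com/archive.tar.gz")); // 'archive.tar' (only the last extension is dropped)
console.log(nameBeforeExtension("http://example.com/questions/3671522/some-title")); // null
```

A bare host such as "http://example.com" still matches, giving 'example', since ".com" looks like an extension to this pattern.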
Alternately, you can do this without regular expressions altogether, by finding the position of the last / and the last . using lastIndexOf and getting a substring between those points:
var url = "http://example.com/index.htm";
var filename = url.substring(url.lastIndexOf("/") + 1, url.lastIndexOf("."));
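The two-line version assumes the last dot comes after the last slash. When the final segment has no extension, the last dot is usually the one in the host name, and since substring swaps its arguments when the first is larger, the result is a chunk of the middle of the URL. A sketch that checks for this follows; the function name nameViaLastIndexOf is hypothetical, and falling back to the whole last segment is an assumption.

```javascript
// Hypothetical helper (name is an assumption) using only lastIndexOf and substring.
function nameViaLastIndexOf(url) {
  var start = url.lastIndexOf("/") + 1;
  var dot = url.lastIndexOf(".");
  // Only treat the dot as an extension separator if it lies in the last segment;
  // otherwise keep everything up to the end of the string.
  var end = dot >= start ? dot : url.length;
  return url.substring(start, end);
}

console.log(nameViaLastIndexOf("http://example.com/index.htm")); // 'index'
console.log(nameViaLastIndexOf("http://example.com/questions/3671522/some-title")); // 'some-title'
```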