Regular expression for git repository
Roughly, for scp-style addresses of the form user@host:owner/repo.git:
^[^@]+@[^:]+:[^/]+/[^.]+\.git$
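As a quick sanity check, here is a minimal sketch of that rough pattern in Python. The helper name `looks_like_scp_repo` and the sample addresses are made up for illustration. Note that the `[^.]+` part means a repository name containing a dot is rejected.

```python
import re

# The rough pattern quoted above, anchored at both ends.
SCP_STYLE = re.compile(r'^[^@]+@[^:]+:[^/]+/[^.]+\.git$')

def looks_like_scp_repo(url):
    """Return True if url has the shape user@host:owner/repo.git (hypothetical helper)."""
    return SCP_STYLE.match(url) is not None

print(looks_like_scp_repo('git@github.com:owner/repo.git'))      # True
print(looks_like_scp_repo('https://github.com/owner/repo.git'))  # False: no user@ part
print(looks_like_scp_repo('git@github.com:owner/my.repo.git'))   # False: dot in the repo name
```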
Git accepts a large range of repository URL expressions:
* ssh://user@host.xz:port/path/to/repo.git/
* ssh://user@host.xz/path/to/repo.git/
* ssh://host.xz:port/path/to/repo.git/
* ssh://host.xz/path/to/repo.git/
* ssh://user@host.xz/~user/path/to/repo.git/
* ssh://host.xz/~user/path/to/repo.git/
* ssh://user@host.xz/~/path/to/repo.git
* ssh://host.xz/~/path/to/repo.git
* user@host.xz:/path/to/repo.git/
* host.xz:/path/to/repo.git/
* user@host.xz:~user/path/to/repo.git/
* host.xz:~user/path/to/repo.git/
* user@host.xz:path/to/repo.git
* host.xz:path/to/repo.git
* rsync://host.xz/path/to/repo.git/
* git://host.xz/path/to/repo.git/
* git://host.xz/~user/path/to/repo.git/
* http://host.xz/path/to/repo.git/
* https://host.xz/path/to/repo.git/
* /path/to/repo.git/
* path/to/repo.git/
* ~/path/to/repo.git
* file:///path/to/repo.git/
* file://~/path/to/repo.git/
For an application I wrote that needs to parse these expressions (YonderGit), I came up with the following Python regular expressions:
(1) '(\w+://)(.+@)*([\w\d\.]+)(:[\d]+){0,1}/*(.*)'
(2) 'file://(.*)'
(3) '(.+@)*([\w\d\.]+):(.*)'
For most repository URLs encountered "in the wild", I suspect (1) suffices.
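To show how the three patterns might fit together, here is a minimal sketch. This is not YonderGit's actual code: the function name `parse_repo_url`, the returned dict layout, and the order in which the patterns are tried are all assumptions. The order matters, because (3) also matches scheme-style URLs ("ssh" followed by ":"), so it is tried last. A numeric port is used because `(:[\d]+)` only accepts digits.

```python
import re

# The three patterns quoted above, unchanged.
URL_STYLE = re.compile(r'(\w+://)(.+@)*([\w\d\.]+)(:[\d]+){0,1}/*(.*)')   # (1)
FILE_STYLE = re.compile(r'file://(.*)')                                   # (2)
SCP_STYLE = re.compile(r'(.+@)*([\w\d\.]+):(.*)')                         # (3)

def parse_repo_url(url):
    """Split a repository address into parts (hypothetical helper)."""
    m = FILE_STYLE.match(url)
    if m:
        return {'scheme': 'file', 'user': None, 'host': None,
                'port': None, 'path': m.group(1)}
    m = URL_STYLE.match(url)
    if m:
        scheme, user, host, port, path = m.groups()
        return {'scheme': scheme[:-3],                    # drop the trailing '://'
                'user': user[:-1] if user else None,      # drop the trailing '@'
                'host': host,
                'port': int(port[1:]) if port else None,  # drop the leading ':'
                'path': path}
    m = SCP_STYLE.match(url)
    if m:
        user, host, path = m.groups()
        # scp-style addresses are carried over ssh
        return {'scheme': 'ssh', 'user': user[:-1] if user else None,
                'host': host, 'port': None, 'path': path}
    # Anything else is treated as a local path.
    return {'scheme': None, 'user': None, 'host': None, 'port': None, 'path': url}

print(parse_repo_url('ssh://user@host.xz:22/path/to/repo.git/'))
print(parse_repo_url('user@host.xz:path/to/repo.git'))
```

Because pattern (1) swallows the slashes after the host with `/*`, the path it returns has no leading slash. Host names containing a hyphen are also cut short, since `[\w\d\.]` does not include `-`.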
I'm using the following regular expression for online remote repositories:
((git|ssh|http(s)?)|(git@[\w\.]+))(:(//)?)([\w\.@\:/\-~]+)(\.git)(/)?
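A minimal sketch of using this pattern as a validator in Python. The helper name `is_remote_repo` and the choice of `fullmatch` (so that nothing may precede or follow the address) are assumptions, as are the sample addresses.

```python
import re

# The pattern quoted above, unchanged.
REMOTE = re.compile(
    r'((git|ssh|http(s)?)|(git@[\w\.]+))(:(//)?)([\w\.@\:/\-~]+)(\.git)(/)?'
)

def is_remote_repo(url):
    """Return True if the whole string matches the remote pattern (hypothetical helper)."""
    return REMOTE.fullmatch(url) is not None

print(is_remote_repo('git@github.com:owner/repo.git'))     # True
print(is_remote_repo('https://host.xz/path/to/repo.git/')) # True
print(is_remote_repo('https://host.xz/path/to/repo'))      # False: the .git suffix is required
print(is_remote_repo('/path/to/repo.git/'))                # False: local paths are not covered
```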
FYI, I wrote a regex to extract the owner and repo from a GitHub or Bitbucket URL:
(?P<host>(git@|https://)([\w\.@]+)(/|:))(?P<owner>[\w,\-,\_]+)/(?P<repo>[\w,\-,\_]+)(\.git){0,1}((/){0,1})
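Here is a minimal sketch of pulling the named groups out in Python. The helper name `owner_and_repo` and the sample URLs are illustrative assumptions. The pattern is the one above, written with the dot before `git` escaped so that it only matches a literal dot.

```python
import re

OWNER_REPO = re.compile(
    r'(?P<host>(git@|https://)([\w\.@]+)(/|:))'
    r'(?P<owner>[\w,\-,\_]+)/(?P<repo>[\w,\-,\_]+)(\.git){0,1}((/){0,1})'
)

def owner_and_repo(url):
    """Return (owner, repo) for a matching URL, or None (hypothetical helper)."""
    m = OWNER_REPO.match(url)
    if m is None:
        return None
    return m.group('owner'), m.group('repo')

print(owner_and_repo('git@github.com:owner/repo.git'))     # ('owner', 'repo')
print(owner_and_repo('https://bitbucket.org/owner/repo'))  # ('owner', 'repo')
```

Since the `repo` character class has no dot in it, a repository name that itself contains a dot is captured only up to the first dot.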