How to get a list of all Subversion commit author usernames?
I had to do this on Windows, so I used the Windows port of Super Sed (http://www.pement.org/sed/) in place of the awk and grep commands:
svn log --quiet --xml | sed -n -e "s/<\/\?author>//g" -e "/[<>]/!p" | sort | sed "$!N; /^\(.*\)\n\1$/!P; D" > USERS.txt
This uses the Windows "sort" command, which might not be present on all machines.
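If sed is not available, the same XML output can be handed to a real XML parser instead. The following is only a sketch, assuming the usual `svn log --xml` layout of a `<log>` root containing `<logentry>` elements that each have an optional `<author>` child; the function name `authors_from_xml` is made up for this example.

```python
import xml.etree.ElementTree as ET


def authors_from_xml(xml_data):
    """Return sorted, de-duplicated author names from `svn log --xml` output.

    `xml_data` may be str or bytes. The element names used here are an
    assumption about the XML layout described above.
    """
    root = ET.fromstring(xml_data)
    names = set()
    for entry in root.iter("logentry"):
        author = entry.findtext("author")
        if author:  # revisions without an author element are skipped
            names.add(author.strip())
    return sorted(names)
```

To use it on a pipe, you could read the raw bytes with `sys.stdin.buffer.read()`, pass them to `authors_from_xml`, and print each name, then run `svn log --quiet --xml` piped into that script.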
To filter out duplicates, take your output and pipe it through sort | uniq. Thus:
svn log --quiet | grep "^r" | awk '{print $3}' | sort | uniq
I would not be surprised if this is the way to do what you ask. Unix tools often expect the user to do fancy processing and analysis with other tools.
P.S. Come to think of it, you can merge the grep and awk...
svn log --quiet | awk '/^r/ {print $3}' | sort | uniq
P.P.S. Per Kevin Reid...
svn log --quiet | awk '/^r/ {print $3}' | sort -u
P3.S. Per kan, using the vertical bars instead of spaces as field separators, to properly handle names with spaces (also updated the Python examples)...
svn log --quiet | awk -F ' [|] ' '/^r/ {print $2}' | sort -u
For more efficiency, you could do a Perl one-liner. I don't know Perl that well, so I'd wind up doing it in Python:
#!/usr/bin/env python
import sys
authors = set()
for line in sys.stdin:
    if line[0] == 'r':
        authors.add(line.split('|')[1].strip())
for author in sorted(authors):
    print(author)
Or, if you wanted counts:
#!/usr/bin/env python
from __future__ import print_function # Python 2.6/2.7
import sys
authors = {}
for line in sys.stdin:
    if line[0] != 'r':
        continue
    author = line.split('|')[1].strip()
    authors.setdefault(author, 0)
    authors[author] += 1
for author in sorted(authors):
    print(author, authors[author])
Then you'd run:
svn log --quiet | ./authorfilter.py
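If you would rather see the busiest committers first, the counting can also be done with `collections.Counter`. This is a rough sketch on Python 3, not a replacement for the scripts above; the function name `count_authors` and the inline sample lines are just for illustration.

```python
from collections import Counter


def count_authors(lines):
    """Count commits per author from `svn log --quiet` lines."""
    counts = Counter()
    for line in lines:
        # Revision lines look like "r123 | name | date"; skip the dashed rules.
        if not line.startswith("r"):
            continue
        fields = line.split("|")
        if len(fields) < 3:
            continue  # not a revision line after all
        counts[fields[1].strip()] += 1
    return counts


sample = [
    "r20209 | tinkywinky | 2013-12-05 08:56:29 +0000 (Thu, 05 Dec 2013)",
    "------------------------------------------------------------------------",
    "r20208 | dispy | 2013-12-04 16:33:53 +0000 (Wed, 04 Dec 2013)",
    "------------------------------------------------------------------------",
    "r20205 | tinkywinky | 2013-12-04 14:07:54 +0000 (Wed, 04 Dec 2013)",
]

# most_common() yields (author, count) pairs, highest count first.
for author, n in count_authors(sample).most_common():
    print(author, n)
```

In a real script you would pass `sys.stdin` instead of `sample`.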
In PowerShell, set your location to the working copy and use this command.
svn.exe log --quiet |
? { $_ -notlike '-*' } |
% { ($_ -split ' \| ')[1] } |
Sort -Unique
The output format of svn.exe log --quiet looks like this:
r20209 | tinkywinky | 2013-12-05 08:56:29 +0000 (Thu, 05 Dec 2013)
------------------------------------------------------------------------
r20208 | dispy | 2013-12-04 16:33:53 +0000 (Wed, 04 Dec 2013)
------------------------------------------------------------------------
r20207 | lala | 2013-12-04 16:28:15 +0000 (Wed, 04 Dec 2013)
------------------------------------------------------------------------
r20206 | po | 2013-12-04 14:34:32 +0000 (Wed, 04 Dec 2013)
------------------------------------------------------------------------
r20205 | tinkywinky | 2013-12-04 14:07:54 +0000 (Wed, 04 Dec 2013)
Filter out the horizontal rules with ? { $_ -notlike '-*' }.
r20209 | tinkywinky | 2013-12-05 08:56:29 +0000 (Thu, 05 Dec 2013)
r20208 | dispy | 2013-12-04 16:33:53 +0000 (Wed, 04 Dec 2013)
r20207 | lala | 2013-12-04 16:28:15 +0000 (Wed, 04 Dec 2013)
r20206 | po | 2013-12-04 14:34:32 +0000 (Wed, 04 Dec 2013)
r20205 | tinkywinky | 2013-12-04 14:07:54 +0000 (Wed, 04 Dec 2013)
Split by ' \| ' to turn a record into an array.
$ 'r20209 | tinkywinky | 2013-12-05 08:56:29 +0000 (Thu, 05 Dec 2013)' -split ' \| '
r20209
tinkywinky
2013-12-05 08:56:29 +0000 (Thu, 05 Dec 2013)
The second element is the name.
Make an array of each line and select the second element with % { ($_ -split ' \| ')[1] }.
tinkywinky
dispy
lala
po
tinkywinky
Return unique occurrences with Sort -Unique. This sorts the output as a side effect.
dispy
lala
po
tinkywinky