How to list all GitHub users?
As described here, you can rely on the following two API endpoints, which return JSON. As requested, both of them provide the gravatar URL.
Collaborators (members of the organization on the project)
- syntax: /repos/:user/:repo/collaborators [GET]
- example: https://api.github.com/repos/git/git/collaborators
Contributors (authors of, at least, one commit)
- syntax: /repos/:user/:repo/contributors [GET]
- example: https://api.github.com/repos/git/git/contributors
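As a minimal sketch of calling the two endpoints above from Python, using only the standard library: the helper names (`people_url`, `extract_users`, `fetch_users`) are made up for illustration, and the code assumes each returned user object carries `login` and `avatar_url` fields. Unauthenticated requests are rate limited, and the collaborators list may require authentication, so treat this as a starting point rather than a complete client.

```python
import json
from urllib.request import Request, urlopen

API_ROOT = "https://api.github.com"


def people_url(owner, repo, kind="contributors"):
    """Build the URL for a repository's contributors or collaborators."""
    if kind not in ("contributors", "collaborators"):
        raise ValueError("kind must be 'contributors' or 'collaborators'")
    return f"{API_ROOT}/repos/{owner}/{repo}/{kind}"


def extract_users(payload):
    """Return (login, avatar_url) pairs from a decoded JSON array of users.

    Entries without a 'login' key are skipped (assumption: such entries
    are anonymous contributors).
    """
    return [(u["login"], u.get("avatar_url")) for u in payload if "login" in u]


def fetch_users(owner, repo, kind="contributors"):
    """Fetch the first page of users for a repository (network call)."""
    request = Request(
        people_url(owner, repo, kind),
        headers={
            "Accept": "application/vnd.github+json",
            "User-Agent": "list-users-sketch",  # hypothetical client name
        },
    )
    with urlopen(request) as response:
        return extract_users(json.load(response))
```

For example, `fetch_users("git", "git")` would request the contributors URL shown above and return the first page of results only; following pagination is left out of this sketch.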
UPDATE:
The previous API methods require that you start from a known repository. The following two proposals try to work around this constraint. They rely on the previous version of the API (v2).
Query by email (in your question, you state "I only have the users emails." Provided the users agreed to publish them, you should be able to retrieve some information about a user by using the email as a query parameter)
- syntax: /api/v2/xml/user/email/:email [GET]
- example: https://github.com/api/v2/xml/user/email/[email protected]
Search for repositories (given some keywords such as a language or a stack, retrieve a list of repositories; then, for each one, use the first two proposals to list its collaborators and/or contributors)
- syntax: /api/v2/json/repos/search/:here+go+your+keywords [GET]
- example: https://github.com/api/v2/json/repos/search/stackoverflow
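The "search, then enumerate" idea can be sketched independently of any particular API version. In the hedged example below, `search_path` only builds the path in the syntax given above, and `collect_users` merges the users of several repositories without duplicates. Both names are hypothetical, and `list_users` is a caller-supplied function (an assumption of this sketch) that returns the logins for one repository.

```python
from urllib.parse import quote


def search_path(keywords):
    """Build the search path, joining the keywords with '+'."""
    return "/api/v2/json/repos/search/" + "+".join(quote(k) for k in keywords)


def collect_users(repos, list_users):
    """Return the unique logins across repos, in first-seen order.

    repos      -- iterable of (owner, name) pairs, e.g. from a search result
    list_users -- callable (owner, name) -> iterable of logins
    """
    seen = set()
    ordered = []
    for owner, name in repos:
        for login in list_users(owner, name):
            if login not in seen:
                seen.add(login)
                ordered.append(login)
    return ordered
```

Passing the lister as a parameter keeps the merging logic the same whether you list collaborators, contributors, or both.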
Note: Make sure that your intended usage of the API complies with the GitHub Terms of Service.
GitHub Archive
https://www.githubarchive.org/
This project can be used to quickly get a dump of all usernames who have ever done anything public.
It exports the GitHub events API to a Google BigQuery dataset frequently.
For the data format used from 2015 onward, a query that lists the usernames looks like this:
SELECT
  actor.login
FROM (
  TABLE_DATE_RANGE([githubarchive:day.events_],
    TIMESTAMP('2015-01-01'),
    TIMESTAMP('2015-01-02')
  ))
GROUP BY actor.login
ORDER BY actor.login
There is more data going back to 2011-02-12 in a different format, which should be easy to figure out.
Downloading the data takes some fighting with Google BigQuery but is doable: How to download all data in a Google BigQuery dataset?
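For a small sample, the same usernames can also be pulled from one of the project's hourly dump files without going through BigQuery. The sketch below assumes a gzipped file with one JSON event per line, in the 2015 and later format where `actor` is an object with a `login` field; `logins_from_dump` is a hypothetical helper name.

```python
import gzip
import json


def logins_from_dump(source):
    """Collect the unique actor logins from a gzipped, line-delimited dump.

    source -- a file path or a binary file object containing gzip data.
    Events whose 'actor' is not an object with a 'login' key are skipped
    (assumption: these come from the older data format).
    """
    logins = set()
    with gzip.open(source, mode="rt", encoding="utf-8") as lines:
        for line in lines:
            line = line.strip()
            if not line:
                continue
            actor = json.loads(line).get("actor")
            if isinstance(actor, dict) and "login" in actor:
                logins.add(actor["login"])
    return sorted(logins)
```

The result is sorted and de-duplicated, which mirrors the GROUP BY and ORDER BY of the query above for a single file.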
I have used a similar method to extract all GitHub commit emails at: https://github.com/cirosantilli/all-github-commit-emails