How to download and write a file from GitHub using Requests
Just as an update, https://raw.github.com was migrated to https://raw.githubusercontent.com. So the general format is:
url = "https://raw.githubusercontent.com/user/repo/branch/[subfolders]/file"
E.g. https://raw.githubusercontent.com/earnestt1234/seedir/master/setup.py. Still use requests.get(url) as in Martijn's answer.
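If you build these URLs in more than one place, a small helper keeps the format in one spot. This is only a sketch of the pattern above; the function name raw_url and its parameters are made up for illustration and are not part of requests or any GitHub library.

```python
from urllib.parse import quote


def raw_url(user, repo, branch, path):
    """Build a raw.githubusercontent.com URL (hypothetical helper).

    `path` is the file path inside the repo, e.g. "docs/index.rst".
    """
    # Drop a leading slash so the URL does not end up with "//".
    path = path.lstrip("/")
    # quote() percent-encodes spaces and similar characters but keeps "/".
    return f"https://raw.githubusercontent.com/{user}/{repo}/{branch}/{quote(path)}"


url = raw_url("earnestt1234", "seedir", "master", "setup.py")
```

The resulting url can be passed straight to requests.get(url).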
The content of the file in question is included in the returned data. You are getting the full GitHub view of that file, not just the contents.
If you want to download just the file, you need to use the Raw link at the top of the page, which will be (for your example):
https://raw.github.com/someguy/brilliant/master/somefile.txt
Note the change in domain name, and that the blob/ part of the path is gone.
To demonstrate this with the requests GitHub repository itself:
>>> import requests
>>> r = requests.get('https://github.com/kennethreitz/requests/blob/master/README.rst')
>>> 'Requests:' in r.text
True
>>> r.headers['Content-Type']
'text/html; charset=utf-8'
>>> r = requests.get('https://raw.github.com/kennethreitz/requests/master/README.rst')
>>> 'Requests:' in r.text
True
>>> r.headers['Content-Type']
'text/plain; charset=utf-8'
>>> print(r.text)
Requests: HTTP for Humans
=========================
.. image:: https://travis-ci.org/kennethreitz/requests.png?branch=master
[... etc. ...]
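The domain change and the dropped blob/ segment can also be applied programmatically if all you have is the regular page URL. A minimal sketch, assuming the URL has the usual github.com/user/repo/blob/branch/path shape; blob_to_raw is a hypothetical name, and it targets the newer raw.githubusercontent.com host rather than raw.github.com.

```python
from urllib.parse import urlsplit, urlunsplit


def blob_to_raw(url):
    """Turn a github.com 'blob' page URL into its raw-file URL (sketch)."""
    parts = urlsplit(url)
    if parts.netloc != "github.com":
        raise ValueError(f"not a github.com URL: {url}")
    # Expected path layout: /user/repo/blob/<branch and file path>
    segments = parts.path.lstrip("/").split("/", 3)
    if len(segments) < 4 or segments[2] != "blob":
        raise ValueError(f"not a blob URL: {url}")
    user, repo, _, rest = segments
    # Same path minus the "blob" segment, on the raw host.
    return urlunsplit(("https", "raw.githubusercontent.com", f"/{user}/{repo}/{rest}", "", ""))


raw = blob_to_raw("https://github.com/django/django/blob/master/setup.py")
```

Anything that is not a blob URL raises ValueError instead of being silently passed through, so a wrong link shows up before the download rather than as an HTML page saved to disk.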
Adding a working example ready for copy+paste:
import requests
from requests.structures import CaseInsensitiveDict
url = "https://raw.githubusercontent.com/organization/repo/branch/folder/file"
# If repo is private - we need to add a token in header:
headers = CaseInsensitiveDict()
headers["Authorization"] = "token TOKEN"
resp = requests.get(url, headers=headers)
print(resp.status_code)
If the repo is not private, remove the headers part.
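The public and private cases can be folded into one function so the headers are only sent when a token is supplied. This is a sketch built on the snippet above, not an official pattern; fetch_raw_text, its session parameter, and the 10 second timeout are assumptions added for illustration.

```python
import requests


def fetch_raw_text(url, token=None, session=None, timeout=10):
    """Fetch a raw file and return its text (hypothetical helper).

    token:   optional access token, only needed for private repos.
    session: optional object with a requests-style get(); defaults to
             the requests module itself.
    """
    headers = {}
    if token:
        # Same header format as the example above.
        headers["Authorization"] = f"token {token}"
    http = session or requests
    resp = http.get(url, headers=headers, timeout=timeout)
    # Raise on 4xx/5xx instead of returning an error page as file content.
    resp.raise_for_status()
    return resp.text
```

Calling fetch_raw_text(url) covers a public repo, and fetch_raw_text(url, token="TOKEN") a private one.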
Bonus:
Check out this Curl <--> Python-requests online converter.
You need to request the raw version of the file, from https://raw.github.com.
See the difference:
https://raw.github.com/django/django/master/setup.py vs. https://github.com/django/django/blob/master/setup.py
Also, you should probably add a / between your directory and the filename:
>>> import os
>>> os.getcwd() + 'foo.txt'
'/Users/burhanfoo.txt'
>>> os.path.join(os.getcwd(), 'foo.txt')
'/Users/burhan/foo.txt'
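For the "write" half of the question, the same os.path.join call can build the destination before the bytes are written out. A minimal sketch; save_bytes is a made-up name, and content is assumed to be the bytes you already downloaded (for example r.content from a requests response).

```python
import os


def save_bytes(content, directory, filename):
    """Write downloaded bytes to directory/filename and return the path."""
    # Create the target directory if it does not exist yet.
    os.makedirs(directory, exist_ok=True)
    # os.path.join inserts the separator that plain '+' would miss.
    dest = os.path.join(directory, filename)
    # Binary mode writes the bytes unchanged, whatever the file's encoding.
    with open(dest, "wb") as fh:
        fh.write(content)
    return dest
```

With a response r from requests.get, that would be save_bytes(r.content, os.getcwd(), 'foo.txt').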