Getting Google Spreadsheet CSV into A Pandas Dataframe
You can use read_csv()
on a StringIO
object:
from io import BytesIO
import requests
import pandas as pd
r = requests.get('https://docs.google.com/spreadsheet/ccc?key=0Ak1ecr7i0wotdGJmTURJRnZLYlV3M2daNTRubTdwTXc&output=csv')
data = r.content
In [10]: df = pd.read_csv(BytesIO(data), index_col=0,parse_dates=['Quradate'])
In [11]: df.head()
Out[11]:
City region Res_Comm \
0 Dothan South_Central-Montgomery-Auburn-Wiregrass-Dothan Residential
10 Foley South_Mobile-Baldwin Residential
12 Birmingham North_Central-Birmingham-Tuscaloosa-Anniston Commercial
38 Brent North_Central-Birmingham-Tuscaloosa-Anniston Residential
44 Athens North_Huntsville-Decatur-Florence Residential
mkt_type Quradate National_exp Alabama_exp Sales_exp \
0 Rural 2010-01-15 00:00:00 2 2 3
10 Suburban_Urban 2010-01-15 00:00:00 4 4 4
12 Suburban_Urban 2010-01-15 00:00:00 2 2 3
38 Rural 2010-01-15 00:00:00 3 3 3
44 Suburban_Urban 2010-01-15 00:00:00 4 5 4
Inventory_exp Price_exp Credit_exp
0 2 3 3
10 4 4 3
12 2 2 3
38 3 3 2
44 4 4 4
My approach is a bit different. I just used pandas.Dataframe() but obviously needed to install and import gspread. And it worked fine!
gsheet = gs.open("Name")
Sheet_name ="today"
wsheet = gsheet.worksheet(Sheet_name)
dataframe = pd.DataFrame(wsheet.get_all_records())
Seems to work for me without the StringIO
:
test = pd.read_csv('https://docs.google.com/spreadsheets/d/' +
'0Ak1ecr7i0wotdGJmTURJRnZLYlV3M2daNTRubTdwTXc' +
'/export?gid=0&format=csv',
# Set first column as rownames in data frame
index_col=0,
# Parse column values to datetime
parse_dates=['Quradate']
)
test.head(5) # Same result as @TomAugspurger
BTW, including the ?gid=
enables importing different sheets, find the gid in the URL.
Open the specific sheet you want in your browser. Make sure it's at least viewable by anyone with the link. Copy and paste the URL. You'll get something like https://docs.google.com/spreadsheets/d/BLAHBLAHBLAH/edit#gid=NUMBER
.
sheet_url = 'https://docs.google.com/spreadsheets/d/BLAHBLAHBLAH/edit#gid=NUMBER'
First we turn that into a CSV export URL, like https://docs.google.com/spreadsheets/d/BLAHBLAHBLAH/export?format=csv&gid=NUMBER
:
csv_export_url = sheet_url.replace('/edit#gid=', '/export?format=csv&gid=')
Then we pass it to pd.read_csv, which can take a URL.
df = pd.read_csv(csv_export_url)
This will break if Google changes its API (it seems undocumented), and may give unhelpful errors if a network failure occurs.