Pandas Dataframe to Nested JSON
Try that :
file.csv:
student,date,grade,course
0,Student_1,2017-06-25,93,ENGLISH
1,Student_2,2017-06-25,83,ENGLISH
2,Student_1,2017-06-25,93,MATH
3,Student_2,2017-06-25,83,MATH
4,Student_1,2017-06-26,90,MATH
5,Student_2,2017-06-26,85,MATH
6,Student_1,2017-06-26,96,ENGLISH
7,Student_2,2017-06-26,99,ENGLISH
Execute:
from collections import defaultdict
import json
import pandas as pd
df = pd.read_csv('file.csv')
json_doc = defaultdict(list)
for _id in df.T:
data = df.T[_id]
key = data.course
for elt in json_doc[key]:
if elt["date"] == data.date:
elt[data.student] = data.grade
break
else:
values = {'date': data.date, data.student: data.grade}
json_doc[key].append(values)
print(json.dumps(json_doc, indent=4))
Output:
{
"ENGLISH": [
{
"date": "2017-06-25",
"Student_1": 93,
"Student_2": 83
},
{
"date": "2017-06-26",
"Student_1": 96,
"Student_2": 99
}
],
"MATH": [
{
"date": "2017-06-25",
"Student_1": 93,
"Student_2": 83
},
{
"date": "2017-06-26",
"Student_1": 90,
"Student_2": 85
}
]
}
You can first define a function to convert sub-groups to json, then apply this function to each group, and then merge sub-group jsons to a single json object.
def f(x):
return (dict({'date':x.date.iloc[0]},**{k:v for k,v in zip(x.student,x.grade)}))
(
df.groupby(['course','date'])
.apply(f)
.groupby(level=0)
.apply(lambda x: x.tolist())
.to_dict()
)
Out[1006]:
{'ENGLISH': [{'Student_1': 93, 'Student_2': 83, 'date': '2017-06-25'},
{'Student_1': 96, 'Student_2': 99, 'date': '2017-06-26'}],
'MATH': [{'Student_1': 93, 'Student_2': 83, 'date': '2017-06-25'},
{'Student_1': 90, 'Student_2': 85, 'date': '2017-06-26'}]}
If you first have multiples indexes in your DataFrame
and you do myDataframe.to_dict(orient='index')
then it will create a dictionary where
key=tuple
and value="the remaining non-indexed columns"
.
You can simply create a recursive function that will create a dict
as nested as the number of elements in the tuple
key as follows:
def recurse(test):
lentpl=len(list(test.keys())[0])
if lentpl==2:
return {k[0]:{k[1]:v} for k,v in test.items()}
else:
test2={k[0:-1]:{k[-1]:v} for k,v in test.items()}
return recurse(test2)