How to split a dataframe by unique groups and save to a csv
You can obtain the unique values by calling unique(), iterate over them, build the filename, and write each subset out to CSV:
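genes = df['Gene'].unique()
for gene in genes:
    # one output file per gene, named after the gene
    outfilename = gene + '.csv'
    print(outfilename)
    # keep only the rows for this gene and write them out
    df[df['Gene'] == gene].to_csv(outfilename)
HAPPY.csv
SAD.csv
LEG.csv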
A more idiomatic pandas method is to groupby on 'Gene' and then iterate over the groups:
gp = df.groupby('Gene')
# gp.groups is a dict mapping each 'Gene' value to the index labels of its rows
for gene, idx in gp.groups.items():
    print(df.loc[idx])
    chr  start  end   Gene  Value  MoreData
0  chr1    123  123  HAPPY   41.1       3.4
1  chr1    125  129  HAPPY   45.9       4.5
2  chr1    140  145  HAPPY   39.3       4.1
    chr  start  end  Gene  Value  MoreData
3  chr1    342  355   SAD   34.2       9.0
4  chr1    360  361   SAD   44.3       8.1
5  chr1    390  399   SAD   29.0       7.2
6  chr1    400  411   SAD   35.6       6.5
    chr  start  end  Gene  Value  MoreData
7  chr1    462  470   LEG     20       2.7
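
The groupby loop above only prints each group. To actually write the files, you can also iterate over the groupby object directly, which yields (key, sub-frame) pairs. This is a rough sketch rather than the one true way: the helper name write_groups, the temporary output directory, and index=False are my own choices, and the sample frame is rebuilt from the rows printed above.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Sample frame rebuilt from the rows printed above.
df = pd.DataFrame({
    'chr': ['chr1'] * 8,
    'start': [123, 125, 140, 342, 360, 390, 400, 462],
    'end': [123, 129, 145, 355, 361, 399, 411, 470],
    'Gene': ['HAPPY'] * 3 + ['SAD'] * 4 + ['LEG'],
    'Value': [41.1, 45.9, 39.3, 34.2, 44.3, 29.0, 35.6, 20.0],
    'MoreData': [3.4, 4.5, 4.1, 9.0, 8.1, 7.2, 6.5, 2.7],
})


def write_groups(frame, column, out_dir):
    """Hypothetical helper: write one CSV per unique value of `column`."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    # sort=False yields the groups in order of first appearance
    for key, group in frame.groupby(column, sort=False):
        path = out_dir / f'{key}.csv'
        # index=False leaves the row index out of the file (an assumption;
        # drop it if you want the index written as in the loop above)
        group.to_csv(path, index=False)
        written.append(path)
    return written


# A temporary directory is used here only to keep the example self-contained.
out_dir = tempfile.mkdtemp()
paths = write_groups(df, 'Gene', out_dir)
print([p.name for p in paths])
```

This should give you HAPPY.csv, SAD.csv and LEG.csv in out_dir, one file per gene. Note that the gene name is used as the filename unchanged, so this assumes the values are safe to use as filenames.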