Pandas how to apply multiple functions to dataframe
For Pandas 0.20.0 or newer, use df.agg
(thanks to ayhan for pointing this out):
In [11]: df.agg(['mean', 'std'])
Out[11]:
one two
mean 5.147471 4.964100
std 2.971106 2.753578
For older versions, you could use
In [61]: df.groupby(lambda idx: 0).agg(['mean','std'])
Out[61]:
one two
mean std mean std
0 5.147471 2.971106 4.9641 2.753578
Another way would be:
In [68]: pd.DataFrame({col: [getattr(df[col], func)() for func in ('mean', 'std')] for col in df}, index=('mean', 'std'))
Out[68]:
one two
mean 5.147471 4.964100
std 2.971106 2.753578
In the general case where you have arbitrary functions and column names, you could do this:
df.apply(lambda r: pd.Series({'mean': r.mean(), 'std': r.std()})).transpose()
mean std
one 5.366303 2.612738
two 4.858691 2.986567
I am using pandas to analyze Chilean legislation drafts. In my dataframe, the list of authors are stored as a string. The answer above did not work for me (using pandas 0.20.3). So I used my own logic and came up with this:
df.authors.apply(eval).apply(len).sum()
Concatenated applies! A pipeline!! The first apply transforms
"['Barros Montero: Ramón', 'Bellolio Avaria: Jaime', 'Gahona Salazar: Sergio']"
into the obvious list, the second apply counts the number of lawmakers involved in the project. I want the size of every pair (lawmaker, project number) (so I can presize an array where I will study which parties work on what).
Interestingly, this works! Even more interestingly, that last call fails if one gets too ambitious and does this instead:
df.autores.apply(eval).apply(len).apply(sum)
with an error:
TypeError: 'int' object is not iterable
coming from deep within /site-packages/pandas/core/series.py in apply
I tried to apply three functions into a column and it works
#removing new line character
rem_newline = lambda x : re.sub('\n',' ',x).strip()
#character lower and removing spaces
lower_strip = lambda x : x.lower().strip()
df = df['users_name'].apply(lower_strip).apply(rem_newline).str.split('(',n=1,expand=True)