Pandas DataFrames: Create new rows with calculations across existing rows

There are several ways to do this. Here's one using groupby and unstack:

(df.groupby(['Country', 'Industry', 'Field'], sort=False)['Value']
   .sum()
   .unstack('Field')
   .eval('Import - Export')
   .reset_index(name='Value'))

  Country Industry  Value
0     USA  Finance     50
1     USA   Retail     70
2     USA   Energy     15
3  Canada   Retail     20
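For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. The sample frame is an assumption, reconstructed from the Import/Export values that appear in the outputs on this page, not the asker's original data. It uses plain column subtraction instead of eval, which should be equivalent here.

```python
import pandas as pd

# Hypothetical sample data, reconstructed from the outputs shown on this page.
df = pd.DataFrame({
    "Country":  ["USA", "USA", "USA", "USA", "USA", "USA", "Canada", "Canada"],
    "Industry": ["Finance", "Finance", "Retail", "Retail",
                 "Energy", "Energy", "Retail", "Retail"],
    "Field":    ["Import", "Export", "Import", "Export",
                 "Import", "Export", "Import", "Export"],
    "Value":    [100, 50, 80, 10, 20, 5, 30, 10],
})

# Sum per (Country, Industry, Field), then pivot Field into columns
# so Import and Export sit side by side on each row.
wide = (df.groupby(["Country", "Industry", "Field"], sort=False)["Value"]
          .sum()
          .unstack("Field"))

# Row-wise difference, then back to a flat frame with a 'Value' column.
net = (wide["Import"] - wide["Export"]).reset_index(name="Value")
print(net)
```

The groupby/sum step also means duplicate (Country, Industry, Field) rows are added together before the subtraction.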

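If I understand correctly, you can subtract after aligning on Country and Industry (this computes Export - Import, hence the negative values):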


df=df.set_index(['Country','Industry'])

Newdf=(df.loc[df.Field=='Export','Value']-df.loc[df.Field=='Import','Value']).reset_index().assign(Field='Net')
Newdf
  Country Industry  Value Field
0     USA  Finance    -50   Net
1     USA   Retail    -70   Net
2     USA   Energy    -15   Net
3  Canada   Retail    -20   Net
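If you want Import - Export instead, a sketch of the same index-alignment idea with the operands swapped might look like this. The sample frame and the names indexed, imports, and exports are assumptions for illustration, and this version leaves the original df untouched rather than reassigning it.

```python
import pandas as pd

# Hypothetical sample data, reconstructed from the outputs shown on this page.
df = pd.DataFrame({
    "Country":  ["USA", "USA", "USA", "USA", "USA", "USA", "Canada", "Canada"],
    "Industry": ["Finance", "Finance", "Retail", "Retail",
                 "Energy", "Energy", "Retail", "Retail"],
    "Field":    ["Import", "Export", "Import", "Export",
                 "Import", "Export", "Import", "Export"],
    "Value":    [100, 50, 80, 10, 20, 5, 30, 10],
})

# Index by the keys the two sides should be matched on.
indexed = df.set_index(["Country", "Industry"])

# Two Series sharing the same (Country, Industry) index.
imports = indexed.loc[indexed["Field"] == "Import", "Value"]
exports = indexed.loc[indexed["Field"] == "Export", "Value"]

# Subtraction aligns on the index labels, not on row position.
net = (imports - exports).reset_index().assign(Field="Net")
print(net)
```

This relies on each (Country, Industry) pair appearing once per Field; with duplicates the alignment would not be one-to-one.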

Or with pivot_table:


df.pivot_table(index=['Country','Industry'],columns='Field',values='Value',aggfunc='sum').\
  diff(axis=1).\
     dropna(axis=1).\
        rename(columns={'Import':'Value'}).\
          reset_index()
Out[112]: 
Field Country Industry  Value
0      Canada   Retail   20.0
1         USA   Energy   15.0
2         USA  Finance   50.0
3         USA   Retail   70.0
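One thing to be aware of: diff(axis=1) gives Import - Export only because pivot_table sorts the column labels, so Export comes before Import. A sketch that names the columns explicitly, and so does not depend on that ordering, could look like this (the sample frame is an assumption reconstructed from the outputs on this page):

```python
import pandas as pd

# Hypothetical sample data, reconstructed from the outputs shown on this page.
df = pd.DataFrame({
    "Country":  ["USA", "USA", "USA", "USA", "USA", "USA", "Canada", "Canada"],
    "Industry": ["Finance", "Finance", "Retail", "Retail",
                 "Energy", "Energy", "Retail", "Retail"],
    "Field":    ["Import", "Export", "Import", "Export",
                 "Import", "Export", "Import", "Export"],
    "Value":    [100, 50, 80, 10, 20, 5, 30, 10],
})

# One row per (Country, Industry), one column per Field.
wide = df.pivot_table(index=["Country", "Industry"], columns="Field",
                      values="Value", aggfunc="sum")

# Name both operands so the sign does not depend on column order.
net = (wide["Import"] - wide["Export"]).rename("Value").reset_index()
print(net)
```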

You can do it this way to add those rows to your original dataframe:


df.set_index(['Country','Industry','Field'])\
  .unstack()['Value']\
  .eval('Net = Import - Export')\
  .stack().rename('Value').reset_index()

Output:


   Country Industry   Field  Value
0   Canada   Retail  Export     10
1   Canada   Retail  Import     30
2   Canada   Retail     Net     20
3      USA   Energy  Export      5
4      USA   Energy  Import     20
5      USA   Energy     Net     15
6      USA  Finance  Export     50
7      USA  Finance  Import    100
8      USA  Finance     Net     50
9      USA   Retail  Export     10
10     USA   Retail  Import     80
11     USA   Retail     Net     70
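Note that set_index followed by unstack needs every (Country, Industry, Field) combination to be unique. If that might not hold, a possible alternative is to build the Net rows separately and append them with pd.concat. This is only a sketch under the assumption that duplicates should be summed, using sample data reconstructed from the output above:

```python
import pandas as pd

# Hypothetical sample data, reconstructed from the output shown above.
df = pd.DataFrame({
    "Country":  ["USA", "USA", "USA", "USA", "USA", "USA", "Canada", "Canada"],
    "Industry": ["Finance", "Finance", "Retail", "Retail",
                 "Energy", "Energy", "Retail", "Retail"],
    "Field":    ["Import", "Export", "Import", "Export",
                 "Import", "Export", "Import", "Export"],
    "Value":    [100, 50, 80, 10, 20, 5, 30, 10],
})

# pivot_table aggregates duplicates (here with sum) instead of raising.
wide = df.pivot_table(index=["Country", "Industry"], columns="Field",
                      values="Value", aggfunc="sum")

# Build the Net rows in the same long layout as the original frame.
net = ((wide["Import"] - wide["Export"])
       .rename("Value")
       .reset_index()
       .assign(Field="Net"))

# Append the Net rows, then sort so each Net row follows its Export/Import pair.
combined = (pd.concat([df, net], ignore_index=True)
              .sort_values(["Country", "Industry", "Field"], ignore_index=True))
print(combined)
```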

