Merge Only When Value is Empty/Null in Pandas
IMO the shortest solution that is still readable is something like this:
dfA.loc[dfA['col'].isna(), 'col'] = dfA.merge(dfB, how='left', on='date')['col_y']
This assigns values from the merged table's col_y column to the primary dfA table, but only for the rows where col is empty (the .isna() condition). The assignment aligns on the index, so it relies on dfA having a default 0..n-1 index and on each date appearing at most once in dfB.
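To make the one-liner concrete, here is a minimal sketch with made-up data. The frame contents are hypothetical, and it assumes dfA has a default RangeIndex and that each date is unique in dfB, so that row i of the left merge lines up with row i of dfA.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data (not from the question).
dfA = pd.DataFrame({
    "date": ["2015-01-01", "2015-01-02", "2015-01-03"],
    "col": ["ABC123456", "ABCD00000", np.nan],
})
dfB = pd.DataFrame({
    "date": ["2015-01-01", "2015-01-03"],
    "col": ["ZZZ999999", "DEF123456"],
})

# A left merge keeps dfA's row order; both frames have 'col',
# so the result carries col_x (from dfA) and col_y (from dfB).
merged = dfA.merge(dfB, how="left", on="date")

# Overwrite only the rows where dfA['col'] is missing.
# The right-hand Series is aligned to dfA by index label.
mask = dfA["col"].isna()
dfA.loc[mask, "col"] = merged.loc[mask, "col_y"]

print(dfA)
```

Note that the existing value for 2015-01-01 is kept even though dfB has a different one, because only the masked rows are assigned.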
OK, assuming that your (null) values are actual NaN values and not the literal string '(null)', the following works:
In [10]:
# create the merged df
merged = dfA.merge(dfB, on='date')
merged
Out[10]:
date impressions spend col_x col_y
0 2015-01-01 100000 3 ABC123456 NaN
1 2015-01-02 145000 5 ABCD00000 NaN
2 2015-01-03 300000 15 NaN DEF123456
You can use where to conditionally assign a value from the _x and _y columns:
In [11]:
# now create col_z using where
merged['col_z'] = merged['col_x'].where(merged['col_x'].notnull(), merged['col_y'])
merged
Out[11]:
date impressions spend col_x col_y col_z
0 2015-01-01 100000 3 ABC123456 NaN ABC123456
1 2015-01-02 145000 5 ABCD00000 NaN ABCD00000
2 2015-01-03 300000 15 NaN DEF123456 DEF123456
You can then drop the extraneous columns:
In [13]:
merged = merged.drop(['col_x','col_y'],axis=1)
merged
Out[13]:
date impressions spend col_z
0 2015-01-01 100000 3 ABC123456
1 2015-01-02 145000 5 ABCD00000
2 2015-01-03 300000 15 DEF123456
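For anyone who wants to run the steps above end to end, here is a self-contained sketch. The two input frames are an assumption: they are reconstructed from the printed output, since the original construction code is not shown.

```python
import numpy as np
import pandas as pd

# Input frames reconstructed from the output above (an assumption).
dfA = pd.DataFrame({
    "date": pd.to_datetime(["2015-01-01", "2015-01-02", "2015-01-03"]),
    "impressions": [100000, 145000, 300000],
    "spend": [3, 5, 15],
    "col": ["ABC123456", "ABCD00000", np.nan],
})
dfB = pd.DataFrame({
    "date": pd.to_datetime(["2015-01-01", "2015-01-02", "2015-01-03"]),
    "col": [np.nan, np.nan, "DEF123456"],
})

# Merge on date; the clashing 'col' columns become col_x and col_y.
merged = dfA.merge(dfB, on="date")

# Keep col_x where it is present, otherwise fall back to col_y.
merged["col_z"] = merged["col_x"].where(merged["col_x"].notnull(), merged["col_y"])

# Drop the two helper columns.
merged = merged.drop(["col_x", "col_y"], axis=1)

print(merged)
```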
If your data contains NaNs and you want to fill them from another DataFrame (one with matching index and column names), you can do the following:
df_A : the target DataFrame that contains the NaN elements
df_B : the source DataFrame that supplies the missing elements
df_A = df_A.where(df_A.notnull(), df_B)
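A small sketch of that last line in action, using invented numeric data. The row labels and column names here are hypothetical; the only requirement taken from the answer is that both frames share the same index and columns.

```python
import numpy as np
import pandas as pd

# Hypothetical frames with identical index and column labels.
idx = ["r1", "r2", "r3"]
df_A = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [np.nan, 5.0, 6.0]}, index=idx)
df_B = pd.DataFrame({"a": [10.0, 20.0, 30.0], "b": [40.0, 50.0, 60.0]}, index=idx)

# Keep df_A's value where it is not null; otherwise take the
# value at the same row/column label from df_B.
filled = df_A.where(df_A.notnull(), df_B)

print(filled)
```

With matching labels, df_A.combine_first(df_B) should give the same result and may read more directly.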