Deleting rows which sum to zero in 1 column but are otherwise duplicates in pandas
I've put comments in the code, so hopefully my line of thought is clear:
import numpy as np

# absolute value of Val2: values that differ only by their sign share the same temp
cond = df.assign(temp=df.Val2.abs())
# sorting on temp makes such values follow each other
cond = cond.sort_values(["ID", "Val1", "temp"])
# within each (ID, temp) group, the running sum hits zero
# when a value is cancelled by its opposite
cond["check"] = cond.groupby(["ID", "temp"]).Val2.cumsum()
# keep only the zeros; every other running sum becomes NaN
cond["check"] = np.where(cond.check != 0, np.nan, cond.check)
# backfilling by one row also marks the row just before each zero,
# so both values that summed to zero carry check == 0
cond["check"] = cond["check"].bfill(limit=1)
# this is where we implement your other condition:
# drop rows that are duplicated on (ID, Val1) AND belong to
# a pair summing to zero; keep everything else
cond.loc[
~(cond.duplicated(["ID", "Val1"], keep=False) & (cond.check == 0)),
["ID", "Val1", "Val2"],
]
     ID  Val1  Val2
2  A001     2    50
3  A002     5   -40
4  A002     6    40
6  A003     8   -50
5  A003     8    60
9  A004     3    10
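
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same steps. The input frame is an assumption: the question's data is not reproduced in this answer, so the rows below are made up to be consistent with the output shown above. The function name `drop_zero_sum_duplicates` is also just a hypothetical wrapper, not something from pandas.

```python
import numpy as np
import pandas as pd


def drop_zero_sum_duplicates(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical wrapper around the steps in this answer."""
    # Helper column so that x and -x share the same key.
    cond = df.assign(temp=df["Val2"].abs())
    cond = cond.sort_values(["ID", "Val1", "temp"])
    # Running sum per (ID, temp); it is zero where a value is cancelled.
    cond["check"] = cond.groupby(["ID", "temp"])["Val2"].cumsum()
    cond["check"] = np.where(cond["check"] != 0, np.nan, cond["check"])
    # Also mark the row immediately before each zero.
    cond["check"] = cond["check"].bfill(limit=1)
    cancelled = cond.duplicated(["ID", "Val1"], keep=False) & (cond["check"] == 0)
    return cond.loc[~cancelled, ["ID", "Val1", "Val2"]]


# Assumed sample data (not the asker's original frame).
df = pd.DataFrame(
    {
        "ID": ["A001", "A001", "A001", "A002", "A002",
               "A003", "A003", "A004", "A004", "A004"],
        "Val1": [1, 1, 2, 5, 6, 8, 8, 3, 3, 3],
        "Val2": [10, -10, 50, -40, 40, 60, -50, 20, -20, 10],
    }
)

result = drop_zero_sum_duplicates(df)
print(result)
```

With this assumed input, rows 0/1 and 7/8 are dropped because each pair shares `ID` and `Val1` and sums to zero. Rows 3 and 4 also sum to zero, but they survive because their `Val1` values differ.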