Highlight a column value based off another column value in pandas
Key points
- You need to access values in the multiple columns for your
lambda
function, so apply to the whole dataframe instead of the price column only. - The above also solves the issue that apply for a series has no
axis
argument. - Add
else x
to fix the syntax error in the conditional logic for yourlambda
- When you index
x
in thelambda
it is a value, no longer a series, so kill thestr
attribute calls and just calllen
on it.
So try:
data.apply(lambda x: highlight_otls(x) if len(x['outlier'].split(',')) >= 2 else x, axis=1)
Output
0 [background-color: yellow]
1 [background-color: yellow]
2 [None, None]
3 [None, None]
dtype: object
One way to deal with null outlier values as per your comment is to refactor the highlighting conditional logic into the highlight_otls
function:
def highlight_otls(x):
if len(x['outlier'].split(',')) >= 2:
return ['background-color: yellow']
else:
return x
data.apply(lambda x: highlight_otls(x) if pd.notnull(x['outlier']) else x, axis=1)
By the way, you may want to return something like ['background-color: white']
instead of x
when you don't want to apply highlighting.
I suggest use custom function for return styled DataFrame
by condition, last export Excel file:
def highlight_otls(x):
c1 = 'background-color: yellow'
c2 = ''
mask = x['outlier'].str.split(',').str.len() >= 2
df1 = pd.DataFrame(c2, index=df.index, columns=df.columns)
#modify values of df1 column by boolean mask
df1.loc[mask, 'price'] = c1
#check styled DataFrame
print (df1)
price outlier
0 background-color: yellow
1 background-color: yellow
2
3
return df1
df.style.apply(highlight_otls, axis=None).to_excel('styled.xlsx', engine='openpyxl')
Use Styler.apply
. (To output to xlsx
format, use to_excel
function.)
Suppose one's dataset is
other price outlier
0 X 1.99 F,C
1 X 1.49 L,C
2 X 1.99 F
3 X 1.39 N
def hightlight_price(row):
ret = ["" for _ in row.index]
if len(row.outlier.split(",")) >= 2:
ret[row.index.get_loc("price")] = "background-color: yellow"
return ret
df.style.\
apply(hightlight_price, axis=1).\
to_excel('styled.xlsx', engine='openpyxl')
From the documentation, "DataFrame.style
attribute is a property that returns a Styler object."
We pass our styling function, hightlight_price
, into Styler.apply
and demand a row-wise nature of the function with axis=1
. (Recall that we want to color the price
cell in each row based on the outlier
information in the same row.)
Our function hightlight_price
will generate the visual styling for each row. For each row row
, we first generate styling for other
, price
, and outlier
column to be ["", "", ""]
. We can obtain the right index to modify only the price
part in the list with row.index.get_loc("price")
as in
ret[row.index.get_loc("price")] = "background-color: yellow"
# ret becomes ["", "background-color: yellow", ""]
Results