Highlight a column value based off another column value in pandas

Key points

You need to access values in the multiple columns for your lambda function, so apply to the whole dataframe instead of the price column only.
The above also solves the issue that apply for a series has no axis argument.
Add else x to fix the syntax error in the conditional logic for your lambda
When you index x in the lambda it is a value, no longer a series, so kill the str attribute calls and just call len on it.

So try:

data.apply(lambda x: highlight_otls(x) if len(x['outlier'].split(',')) >= 2 else x, axis=1)

Output

0    [background-color: yellow]
1    [background-color: yellow]
2                  [None, None]
3                  [None, None]
dtype: object

One way to deal with null outlier values as per your comment is to refactor the highlighting conditional logic into the highlight_otls function:

def highlight_otls(x):                                                            
     if len(x['outlier'].split(',')) >= 2:
         return ['background-color: yellow']
     else:
         return x

data.apply(lambda x: highlight_otls(x) if pd.notnull(x['outlier']) else x, axis=1)

By the way, you may want to return something like ['background-color: white'] instead of x when you don't want to apply highlighting.

I suggest use custom function for return styled DataFrame by condition, last export Excel file:

def highlight_otls(x):
    c1 = 'background-color: yellow'
    c2 = '' 

    mask = x['outlier'].str.split(',').str.len() >= 2
    df1 =  pd.DataFrame(c2, index=df.index, columns=df.columns)
    #modify values of df1 column by boolean mask
    df1.loc[mask, 'price'] = c1

    #check styled DataFrame
    print (df1)

                          price outlier
    0  background-color: yellow        
    1  background-color: yellow        
    2                                  
    3                               
    return df1

df.style.apply(highlight_otls, axis=None).to_excel('styled.xlsx', engine='openpyxl')

Use Styler.apply. (To output to xlsx format, use to_excel function.)

Suppose one's dataset is

other   price   outlier
0   X   1.99    F,C
1   X   1.49    L,C
2   X   1.99    F
3   X   1.39    N

def hightlight_price(row):
    ret = ["" for _ in row.index]
    if len(row.outlier.split(",")) >= 2:
        ret[row.index.get_loc("price")] = "background-color: yellow"
    return ret
       
df.style.\
    apply(hightlight_price, axis=1).\
    to_excel('styled.xlsx', engine='openpyxl')

From the documentation, "DataFrame.style attribute is a property that returns a Styler object."

We pass our styling function, hightlight_price, into Styler.apply and demand a row-wise nature of the function with axis=1. (Recall that we want to color the price cell in each row based on the outlier information in the same row.)

Our function hightlight_price will generate the visual styling for each row. For each row row, we first generate styling for other, price, and outlier column to be ["", "", ""]. We can obtain the right index to modify only the price part in the list with row.index.get_loc("price") as in

ret[row.index.get_loc("price")] = "background-color: yellow"
# ret becomes ["", "background-color: yellow", ""]

Results

enter image description here

Highlight a column value based off another column value in pandas

Tags:

Python

Pandas

Related

Recent Posts