Find first and last non-zero column in each row of a pandas dataframe

Using cumsum on the underlying array

m = df.drop(['Name', 'count'], axis=1)
u = m.to_numpy().cumsum(1)

start = (u!=0).argmax(1)
end = u.argmax(1)

df.assign(start=m.columns[start], end=m.columns[end])

    Name  Jan17  Jun18  Dec18  Apr19  count  start    end
0   Nick    0.0    1.7    3.7    0.0      2  Jun18  Dec18
1   Jack    0.0    0.0    2.8    3.5      2  Dec18  Apr19
2    Fox    0.0    1.7    0.0    0.0      1  Jun18  Jun18
3    Rex    1.0    0.0    3.0    4.2      3  Jan17  Apr19
4  Snack    0.0    0.0    2.8    4.4      2  Dec18  Apr19
5  Yosee    0.0    0.0    0.0    4.3      1  Apr19  Apr19
6  Petty    0.5    1.3    2.8    3.5      4  Jan17  Apr19

first_valid_index and last_valid_index

d = df.mask(df == 0).drop(['Name', 'count'], 1)
df.assign(
    Start=d.apply(pd.Series.first_valid_index, 1),
    Finish=d.apply(pd.Series.last_valid_index, 1)
)

    Name  Jan17  Jun18  Dec18  Apr19  count  Start Finish
0   Nick    0.0    1.7    3.7    0.0      2  Jun18  Dec18
1   Jack    0.0    0.0    2.8    3.5      2  Dec18  Apr19
2    Fox    0.0    1.7    0.0    0.0      1  Jun18  Jun18
3    Rex    1.0    0.0    3.0    4.2      3  Jan17  Apr19
4  Snack    0.0    0.0    2.8    4.4      2  Dec18  Apr19
5  Yosee    0.0    0.0    0.0    4.3      1  Apr19  Apr19
6  Petty    0.5    1.3    2.8    3.5      4  Jan17  Apr19

stack then groupby

d = df.mask(df == 0).drop(['Name', 'count'], 1)
def fl(s): return s.xs(s.name).index[[0, -1]]
s, f = d.stack().groupby(level=0).apply(fl).str
df.assign(Start=s, Finish=f)

    Name  Jan17  Jun18  Dec18  Apr19  count  Start Finish
0   Nick    0.0    1.7    3.7    0.0      2  Jun18  Dec18
1   Jack    0.0    0.0    2.8    3.5      2  Dec18  Apr19
2    Fox    0.0    1.7    0.0    0.0      1  Jun18  Jun18
3    Rex    1.0    0.0    3.0    4.2      3  Jan17  Apr19
4  Snack    0.0    0.0    2.8    4.4      2  Dec18  Apr19
5  Yosee    0.0    0.0    0.0    4.3      1  Apr19  Apr19
6  Petty    0.5    1.3    2.8    3.5      4  Jan17  Apr19

idxmax

mask = df.drop(['Name', 'count'], axis=1) > 0
df.assign(start=mask.idxmax(axis=1), end=mask.iloc[:,::-1].idxmax(axis=1))

    Name  Jan17  Jun18  Dec18  Apr19  count  start    end
0   Nick    0.0    1.7    3.7    0.0      2  Jun18  Dec18
1   Jack    0.0    0.0    2.8    3.5      2  Dec18  Apr19
2    Fox    0.0    1.7    0.0    0.0      1  Jun18  Jun18
3    Rex    1.0    0.0    3.0    4.2      3  Jan17  Apr19
4  Snack    0.0    0.0    2.8    4.4      2  Dec18  Apr19
5  Yosee    0.0    0.0    0.0    4.3      1  Apr19  Apr19
6  Petty    0.5    1.3    2.8    3.5      4  Jan17  Apr19

Drop irrelevant columns, then use idxmax first on the columns, then on the reversed columns to find the first and last valid indices respectively.


In your case try something different with dot

s=df.loc[:,'Jan17':'Apr19'].ne(0)
s=s.dot(s.columns+',').str[:-1].str.split(',')
s.str[0],s.str[-1]
Out[899]: 
(0    Jun18
 1    Dec18
 2    Jun18
 3    Jan17
 4    Dec18
 5    Apr19
 6    Jan17
 dtype: object, 0    Dec18
 1    Apr19
 2    Jun18
 3    Apr19
 4    Apr19
 5    Apr19
 6    Apr19
 dtype: object)
 #df['Start'],df['End']=s.str[0],s.str[-1]