Effective-Date-Range One-Hot-Encode groupby
This is a hard problem. The best I can think of is NumPy
broadcasting to speed up the work inside the for loop:
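import pandas as pd

# df is the frame from the question, with columns person_id, nid, beg, end
# Stack beg/end into one long Series of timestamps per person
s = df.set_index('person_id')[['beg', 'end']].stack()
l = []
for x, y in df.groupby('person_id'):
    # Treat an open-ended range as ending at the person's latest end date
    y = y.fillna({'end': y.end.max()})
    s1 = y.beg.values
    s2 = y.end.values
    t = s.loc[x].values
    # Broadcast: row i, column j is True when beg[j] <= t[i] < end[j]
    l.append(pd.DataFrame(((s1 - t[:, None]).astype(float) <= 0) & ((s2 - t[:, None]).astype(float) > 0), columns=y.nid, index=s.loc[[x]].index))
s = pd.concat([s, pd.concat(l).fillna(0).astype(int)], axis=1).reset_index(level=0).sort_values(['person_id', 0])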
s
Out[401]:
     person_id          0  1  2  3  4
beg          1 2018-01-01  1  0  0  0
beg          1 2018-01-05  1  1  0  0
beg          1 2018-01-10  1  1  1  0
end          1 2018-02-01  0  1  1  0
beg          1 2018-02-05  0  1  1  1
end          1 2018-03-04  0  0  1  1
end          1 2018-10-18  0  0  0  0
beg          2 2018-01-25  1  0  0  0
end          2 2018-11-10  0  0  0  0
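
To see the broadcasting step on its own, here is a minimal sketch with plain NumPy arrays. The dates are made up for illustration and are not the question's data. It compares the dates directly instead of subtracting and casting to float as the loop above does, but the test is the same: beg <= t < end.

```python
import numpy as np

# Hypothetical effective-date ranges for one person (illustrative values only).
beg = np.array(["2018-01-01", "2018-01-05"], dtype="datetime64[D]")
end = np.array(["2018-02-01", "2018-03-04"], dtype="datetime64[D]")

# Timestamps to test: every boundary date of that person, in order.
t = np.sort(np.concatenate([beg, end]))

# t[:, None] has shape (4, 1); beg and end have shape (2,).
# Broadcasting gives a (4, 2) matrix: row i, column j is 1 when
# beg[j] <= t[i] < end[j], i.e. range j is in effect at time t[i].
active = ((beg <= t[:, None]) & (t[:, None] < end)).astype(int)
print(active)
```

Each row is one timestamp and each column one range, so the whole membership matrix for a person comes out of a single vectorized comparison, with no inner Python loop over timestamps.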