Python Pandas: Expand a Column of Lists of Lists into Two New Columns
You can call .apply(pd.Series) twice to get what you need as an intermediate step, then merge the result back into the original dataframe.
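import pandas as pd

df = pd.DataFrame({
    'name': ['john', 'smith'],
    'id': [1, 2],
    'apps': [[['app1', 'v1'], ['app2', 'v2'], ['app3', 'v3']],
             [['app1', 'v1'], ['app4', 'v4']]]
})

# First pass: one [app, version] pair per row; 'variable' keeps the original row index
dftmp = df.apps.apply(pd.Series).T.melt().dropna()

# Second pass: split each pair into two columns, indexed by the original row
dfapp = (dftmp.value
              .apply(pd.Series)
              .set_index(dftmp.variable)
              .rename(columns={0: 'app_name', 1: 'app_version'})
        )

# Join on the index to bring name and id back
df[['name', 'id']].merge(dfapp, left_index=True, right_index=True)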
# returns:
    name  id app_name app_version
0   john   1     app1          v1
0   john   1     app2          v2
0   john   1     app3          v3
1  smith   2     app1          v1
1  smith   2     app4          v4
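To see what the first .apply(pd.Series) pass produces before the second one runs, the intermediate frame can be inspected on its own. This is a minimal sketch reusing the sample data above; the name `wide` is introduced here only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['john', 'smith'],
    'id': [1, 2],
    'apps': [[['app1', 'v1'], ['app2', 'v2'], ['app3', 'v3']],
             [['app1', 'v1'], ['app4', 'v4']]]
})

# One column per position in the outer list; shorter lists are padded with NaN
wide = df.apps.apply(pd.Series)

# Transpose so each original row becomes a column, then melt to long form:
# 'variable' holds the original row label, 'value' holds one [app, version] pair
dftmp = wide.T.melt().dropna()

print(wide.shape)
print(dftmp)
```

The `variable` column is what lets the second step re-index the pairs by their original row, so the final merge on the index lines up with `name` and `id`.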
Another approach, which should also be quite fast:
import numpy as np

# Repeat the non-list columns once per inner list in 'apps'
m = df.drop(columns='apps').loc[df.index.repeat(df.apps.str.len())].reset_index(drop=True)
# Flatten the nested lists into a two-column dataframe
n = pd.DataFrame(np.concatenate(df.apps.values), columns=['app_name', 'app_version'])
# Concatenate them side by side
df_new = pd.concat([m, n], axis=1)
    name  id app_name app_version
0   john   1     app1          v1
1   john   1     app2          v2
2   john   1     app3          v3
3  smith   2     app1          v1
4  smith   2     app4          v4
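For comparison, on pandas 0.25 or later the same reshaping can probably be done with DataFrame.explode, which is a different technique from the repeat-and-concatenate approach above. The sketch below assumes every inner list has exactly two elements; the names `long`, `pairs`, and `df_exploded` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['john', 'smith'],
    'id': [1, 2],
    'apps': [[['app1', 'v1'], ['app2', 'v2'], ['app3', 'v3']],
             [['app1', 'v1'], ['app4', 'v4']]]
})

# explode() unpacks one level only, so each [app, version] pair gets its own row
long = df.explode('apps').reset_index(drop=True)

# Split each pair into two named columns, aligned on the new index
pairs = pd.DataFrame(long['apps'].tolist(),
                     columns=['app_name', 'app_version'],
                     index=long.index)

df_exploded = pd.concat([long.drop(columns='apps'), pairs], axis=1)
print(df_exploded)
```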
You can always fall back on a brute-force solution, something like:
# 'ids' rather than 'id' to avoid shadowing the built-in
name, ids, app_name, app_version = [], [], [], []
for i in range(len(df)):  # assumes the default 0..n-1 index
    for v in df.loc[i, 'apps']:
        app_name.append(v[0])
        app_version.append(v[1])
        # repeat name and id once per app
        name.append(df.loc[i, 'name'])
        ids.append(df.loc[i, 'id'])
df = pd.DataFrame({'name': name, 'id': ids, 'app_name': app_name, 'app_version': app_version})
This will do the job.
Note that I assumed df['apps'] holds actual lists (of lists of strings). If it holds their string representations instead, you need eval(df.loc[i,'apps']) (or the safer ast.literal_eval(df.loc[i,'apps']))
instead of df.loc[i,'apps']
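If the column does hold strings, one option is to convert it once up front so that any of the approaches above can then be used unchanged. This is a minimal sketch using ast.literal_eval from the standard library in place of eval, since it only parses Python literals; the frame `df_str` and its contents are made up for illustration.

```python
import ast
import pandas as pd

# Hypothetical input where 'apps' holds the string form of each nested list
df_str = pd.DataFrame({
    'name': ['john', 'smith'],
    'id': [1, 2],
    'apps': ["[['app1', 'v1'], ['app2', 'v2'], ['app3', 'v3']]",
             "[['app1', 'v1'], ['app4', 'v4']]"]
})

# Parse each string into a real list of lists
df_str['apps'] = df_str['apps'].apply(ast.literal_eval)

print(type(df_str.loc[0, 'apps']))
print(df_str.loc[1, 'apps'])
```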