How to duplicate rows in pandas, based on items in a list
You could write a simple cleaning function to turn the string into a list (assuming the items are not themselves commas, and that you can't simply use ast.literal_eval):
def clean_string_to_list(s):
    return [c for c in s if c not in '[,]']  # you might need to catch errors

df['data'] = df['data'].apply(clean_string_to_list)
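If the strings do happen to be valid Python literals (for example "['A', 'B', 'C']"), ast.literal_eval may be enough on its own. Here is a minimal sketch of that case. The sample frame, the helper name parse_list, and the choice to fall back to an empty list on malformed input are all assumptions for illustration, not part of the original question.

```python
import ast

import pandas as pd

# Hypothetical sample frame: the 'data' column holds list-like strings.
df = pd.DataFrame({"COL": ["line1"], "data": ["['A', 'B', 'C']"]})


def parse_list(s):
    """Parse a string holding a Python list literal.

    Returns an empty list when the string is not a valid literal
    (an assumed fallback; raising may suit your data better).
    """
    try:
        value = ast.literal_eval(s)
    except (ValueError, SyntaxError):
        return []
    # Wrap scalars so the column always holds lists.
    return list(value) if isinstance(value, (list, tuple)) else [value]


df["data"] = df["data"].apply(parse_list)
```

Strings such as "[A,B,C]" (unquoted items) are not valid literals, so they would hit the fallback here; that is the case where a hand-written cleaning function is still needed.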
Iterating through the rows seems like a reasonable choice:
In [11]: pd.DataFrame([(row['COL'], d)
                       for _, row in df.iterrows()
                       for d in row['data']],
                      columns=df.columns)
Out[11]:
     COL data
0  line1    A
1  line1    B
2  line1    C
I'm afraid I don't think pandas caters specifically for this kind of manipulation.
You can use df.explode() (available since pandas 0.25). Refer to the documentation; I believe this is exactly the functionality you need.
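As a rough sketch of how that might look, assuming the 'data' column already holds real lists rather than strings (the second row of the sample frame is made up for illustration):

```python
import pandas as pd

# 'line1' mirrors the example above; 'line2' is a hypothetical extra row.
df = pd.DataFrame({
    "COL": ["line1", "line2"],
    "data": [["A", "B", "C"], ["D"]],
})

# explode() emits one row per list element and repeats the other columns.
# The original index labels are repeated too, so reset them afterwards.
exploded = df.explode("data").reset_index(drop=True)
print(exploded)
```

Each element of a list gets its own row, with the matching COL value duplicated alongside it, so no explicit loop over the rows is needed.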