How to filter a set of rows according to an indexed position?

After sorting the dataframe, you can use str.split on the user column to build a grouping key (the name that precedes the number in parentheses). Group the dataframe on this key, then build a user -> dataframe mapping with a dict comprehension:

key = df1['user'].str.split().str[0]
dct = {user:grp.reset_index(drop=True) for user, grp in df1.groupby(key)}

Now, to access the dataframe for a given user, simply look it up in the dictionary:

>>> dct['John']

       user  value
0  John (2)      6
1  John (3)      3
2  John (1)      1

>>> dct['Peter']

        user  value
0  Peter (2)      3
1  Peter (3)      3
2  Peter (1)      1

>>> dct['Johnny']

         user  value
0  Johnny (1)      4
1  Johnny (2)      1
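
For reference, here is a minimal self-contained sketch of the steps above. It assumes the sample data shown further down this page; the stable `mergesort` is my own choice so that tied values keep their original order, it is not something the answer specifies.

```python
import pandas as pd

# Sample data copied from the example on this page.
df1 = pd.DataFrame({
    "user": ["Peter (1)", "Peter (2)", "Peter (3)", "John (1)",
             "John (2)", "John (3)", "Johnny (1)", "Johnny (2)"],
    "value": [1, 3, 3, 1, 6, 3, 4, 1],
})

# Sort by value, descending. A stable sort (assumption) keeps ties in input order.
df1 = df1.sort_values(by="value", ascending=False, kind="mergesort")

# "John (2)" -> ["John", "(2)"] -> "John"
key = df1["user"].str.split().str[0]

# One freshly re-indexed dataframe per user name.
dct = {user: grp.reset_index(drop=True) for user, grp in df1.groupby(key)}

print(sorted(dct))
print(dct["John"])
```

Note that splitting on whitespace keeps only the first word, so this key would not work as-is for names that themselves contain spaces.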

import pandas as pd

df1 = pd.DataFrame({"user": ["Peter (1)", "Peter (2)", "Peter (3)", "John (1)", "John (2)", "John (3)", "Johnny (1)", "Johnny (2)"], "value": [1, 3, 3, 1, 6, 3, 4, 1]})

df1 = df1.sort_values(by='value', ascending=False)

cols = df1.columns.tolist()
# Strip the trailing " (n)" to get the bare user name
df1['name'] = df1['user'].replace(r'\s\(\d\)', '', regex=True)
grp = df1.groupby('name')
# Group keys are sorted by default: John, Johnny, Peter
dataframes = [grp.get_group(x)[cols] for x in grp.groups]

df2, df3 = dataframes[:2]  # as mentioned, we are interested just in first two users

df2:

       user  value
4  John (2)      6
5  John (3)      3
3  John (1)      1

df3:

         user  value
6  Johnny (1)      4
7  Johnny (2)      1
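
One thing worth knowing about the approach above: groupby sorts the group keys by default, so "the first two users" means the first two names alphabetically. If you want the first two users in order of appearance in the sorted dataframe instead, a sketch along these lines may help. The variable names `name` and `frames`, the slightly looser regex, and the stable sort are my own assumptions, not part of the answer.

```python
import pandas as pd

# Sample data copied from the example on this page.
df1 = pd.DataFrame({
    "user": ["Peter (1)", "Peter (2)", "Peter (3)", "John (1)",
             "John (2)", "John (3)", "Johnny (1)", "Johnny (2)"],
    "value": [1, 3, 3, 1, 6, 3, 4, 1],
})
df1 = df1.sort_values(by="value", ascending=False, kind="mergesort")

# Bare user name: drop a trailing "(n)" and any whitespace before it.
name = df1["user"].str.replace(r"\s*\(\d+\)$", "", regex=True)

# sort=False yields the groups in order of first appearance in df1,
# rather than in alphabetical key order.
frames = [grp for _, grp in df1.groupby(name, sort=False)]

df2, df3 = frames[:2]
print(df2)
print(df3)
```

With this sample data both orderings happen to give John and Johnny, but they can differ on other inputs.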

You can take the user value in the first row, split it on '(' and drop the last piece (this still works if the user name itself contains a parenthesis), and then search the whole user column for that value. For example:

firstIndexUser = df1['user'].str.split('(').str[:-1].str.join('(').iloc[0]

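firstIndexUser now holds 'John ' (the split leaves a trailing space, which is harmless here because the same expression is applied on both sides of the comparison). You can now compare it against the entire column to get df2: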

df2 = df1[df1['user'].str.split('(').str[:-1].str.join('(')==firstIndexUser]

The output looks like this:

>>> df2
       user  value
0  John (2)      6
4  John (3)      3
6  John (1)      1

If you want, you can reset the index of df2:

>>> df2.reset_index(drop=True, inplace=True)
>>> df2
       user  value
0  John (2)      6
1  John (3)      3
2  John (1)      1

You can follow a similar approach for df3.
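
As a rough sketch of how that could look end to end, the snippet below derives both df2 and df3. It assumes the dataframe has been sorted by value and re-indexed, which is what the output above suggests. The names `base`, `first_user`, `rest` and `second_user` are hypothetical, and using rsplit with strip instead of split/join is my own variation on the answer.

```python
import pandas as pd

# Sample data copied from the example on this page.
df1 = pd.DataFrame({
    "user": ["Peter (1)", "Peter (2)", "Peter (3)", "John (1)",
             "John (2)", "John (3)", "Johnny (1)", "Johnny (2)"],
    "value": [1, 3, 3, 1, 6, 3, 4, 1],
})
# Assumed preparation: stable descending sort, then a fresh 0..n-1 index.
df1 = (df1.sort_values(by="value", ascending=False, kind="mergesort")
          .reset_index(drop=True))

# Cut at the LAST "(" so a name containing parentheses stays intact,
# then strip the trailing space: "John (2)" -> "John".
base = df1["user"].str.rsplit("(", n=1).str[0].str.strip()

# df2: every row belonging to the user found in the first row.
first_user = base.iloc[0]
df2 = df1[base == first_user].reset_index(drop=True)

# df3: every row belonging to the next user that is not first_user.
rest = base[base != first_user]
second_user = rest.iloc[0]
df3 = df1[base == second_user].reset_index(drop=True)

print(df2)
print(df3)
```

Stripping the whitespace makes the stored name compare equal to a plain 'John', which is convenient if you reuse it elsewhere.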

Tags: Python, Pandas, Dataframe
