Filtering rows from dataframe based on the values of the previous rows

You can't get away from looping through each row, because each comparison depends on the last row that was kept.

Tips

Avoid creating new objects that are expensive to build for each row
Use memory-efficient iteration
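As a rough illustration of the first tip, here is a small sketch of what each iteration style hands back per row. The frame below is hypothetical data, not the frame from the question. `DataFrame.iterrows()` builds a new Series for every row, while `Series.items()` yields plain (label, value) pairs.

```python
import pandas as pd

# Hypothetical data, only for illustration
df = pd.DataFrame({"A": [1000, 1001, 10]})

# iterrows() yields (label, Series): one new Series object per row
row_label, row = next(df.iterrows())

# items() on a single column yields (label, scalar) pairs
item_label, value = next(df["A"].items())

row_is_series = isinstance(row, pd.Series)
value_is_series = isinstance(value, pd.Series)
print(row_is_series, value_is_series)
```

Iterating over the one column that is needed keeps the per-row cost to a scalar comparison.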

I'd use a generator.

I'll pass a Series to a function and yield the index values of the rows that satisfy the condition.

def f(s):
    it = s.items()                   # `iteritems` was removed in pandas 2.0
    i, v = next(it)
    yield i                          # Yield the first one
    for j, x in it:
        if .5 * v <= x <= 1.5 * v:   # Within 50% of the last kept value
            yield j                  # Yield the ones that satisfy
            v = x                    # Update the comparative value

df.loc[list(f(df.A))]                # Use `loc` with index values
                                     # yielded by my generator

       A
1   1000
2   1000
3   1001
4   1001
6   1000
7   1010
11   999
14  1000
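For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. The data is made up for illustration (it is not the frame from the question), the name `keep_within_half` is hypothetical, and the guard for an empty Series is an addition that the generator above does not have.

```python
import pandas as pd


def keep_within_half(s):
    """Yield index labels whose value is within 50% of the last kept value."""
    it = s.items()
    try:
        i, v = next(it)              # The first row is always kept
    except StopIteration:
        return                       # Empty Series: nothing to yield
    yield i
    for j, x in it:
        if 0.5 * v <= x <= 1.5 * v:
            yield j
            v = x                    # Later rows are compared to this one


# Hypothetical data, only for illustration
df = pd.DataFrame({"A": [1000, 1001, 10, 1000, 3000, 1400, 2000]},
                  index=range(1, 8))

kept = list(keep_within_half(df["A"]))
result = df.loc[kept]
print(result)
```

In this made-up frame, 2000 is kept even though it is double the first value, because it is compared against 1400, the last value that was kept. That dependence on the previously kept row is why the loop is hard to avoid.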
