Pandas dataframe.query method syntax
@x.name - the @ prefix tells .query() that x is an external object (one that doesn't belong to the DataFrame on which the query() method was called). In this case x is a DataFrame. It could be a scalar value as well.
I hope this small demonstration will help you to understand it:
    In [79]: d1
    Out[79]:
       a  b  c
    0  1  2  3
    1  4  5  6
    2  7  8  9

    In [80]: d2
    Out[80]:
       a   x
    0  1  10
    1  7  11

    In [81]: d1.query("a in @d2.a")
    Out[81]:
       a  b  c
    0  1  2  3
    2  7  8  9

    In [82]: d1.query("c < @d2.a")
    Out[82]:
       a  b  c
    1  4  5  6
Scalar x:

    In [83]: x = 9

    In [84]: d1.query("c == @x")
    Out[84]:
       a  b  c
    2  7  8  9
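If you want to run the demonstration yourself, here is a minimal sketch that rebuilds d1 and d2 from the printed output above. The names in_d2, in_d2_mask, equals_x and equals_x_mask are mine, added only for illustration; the boolean-mask lines are what I'd expect the two query strings to be equivalent to.

```python
import pandas as pd

# Rebuild the two frames shown in the session above
d1 = pd.DataFrame({"a": [1, 4, 7], "b": [2, 5, 8], "c": [3, 6, 9]})
d2 = pd.DataFrame({"a": [1, 7], "x": [10, 11]})

# @d2.a refers to column a of the external DataFrame d2,
# while the bare name a refers to a column of d1
in_d2 = d1.query("a in @d2.a")          # rows with index 0 and 2
in_d2_mask = d1[d1["a"].isin(d2["a"])]  # same selection without query

# @x refers to the external scalar x
x = 9
equals_x = d1.query("c == @x")          # row with index 2
equals_x_mask = d1[d1["c"] == x]        # same selection without query

print(in_d2)
print(equals_x)
```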
Everything @MaxU said is perfect!
I wanted to add some context to the specific problem that this was applied to.
find_match

This is a helper function that is passed to dfWeeks.apply. Two things to note:

- find_match takes a single argument x. This will be a single row of dfWeeks.
    - Each row is a pd.Series object and each row will be passed through this function. This is the nature of using apply.
    - When apply passes this row to the helper function, the row has a name attribute that is equal to the index value for that row in the dataframe. In this case, I know that the index value is a pd.Timestamp and I'll use it to do the comparing I need to do.
- find_match references dfDays, which is outside the scope of find_match itself.
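To see what apply hands to the helper, here is a minimal sketch. The dfWeeks below is made-up data and show_row is a hypothetical helper; neither comes from the question. It only records the type of each row and its name attribute.

```python
import pandas as pd

# Made-up weekly frame with a DatetimeIndex (an assumption, not the OP's data)
dfWeeks = pd.DataFrame(
    {"target": [10.0, 12.0]},
    index=pd.to_datetime(["2020-01-05", "2020-01-12"]),
)

seen = []

def show_row(x):
    """Hypothetical helper: record what apply passes in for each row."""
    seen.append((type(x).__name__, x.name))
    return x.name

# axis=1 means the function is called with one row at a time
names = dfWeeks.apply(show_row, axis=1)

print(seen[0])   # type of the row and its index label
print(names)
```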
I didn't have to use query... I like using query. It is my opinion that it makes some code prettier. The following function, as provided by the OP, could've been written differently:
    def find_match(x):
        """Original"""
        match = dfDays.query('index > @x.name & price >= @x.target')
        if not match.empty:
            return match.index[0]

    dfWeeks.assign(target_hit=dfWeeks.apply(find_match, axis=1))
find_match_alt

Or we could've done this, which may help to explain what the query string is doing above:
    def find_match_alt(x):
        """Alternative to OP's"""
        date_is_afterwards = dfDays.index > x.name
        price_target_is_met = dfDays.price >= x.target
        both_are_true = price_target_is_met & date_is_afterwards
        if (both_are_true).any():
            return dfDays[both_are_true].index[0]

    dfWeeks.assign(target_hit=dfWeeks.apply(find_match_alt, axis=1))
Comparing these two functions should give good perspective.
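For anyone who wants to try the comparison, here is a self-contained sketch. The dfDays and dfWeeks values are invented for illustration (the OP's data isn't reproduced here), and hit_query / hit_mask are names I made up. On this toy data I'd expect both versions to pick the same date for the first week and find nothing for the second.

```python
import pandas as pd

# Invented daily prices (assumption: a price column on a DatetimeIndex)
dfDays = pd.DataFrame(
    {"price": [10, 11, 12, 13, 12, 14]},
    index=pd.date_range("2020-01-01", periods=6, freq="D"),
)

# Invented weekly targets; the second target is never reached
dfWeeks = pd.DataFrame(
    {"target": [12, 20]},
    index=pd.to_datetime(["2020-01-02", "2020-01-04"]),
)

def find_match(x):
    """Query version: first day after x.name whose price meets x.target."""
    match = dfDays.query('index > @x.name & price >= @x.target')
    if not match.empty:
        return match.index[0]

def find_match_alt(x):
    """Boolean-mask version of the same lookup."""
    date_is_afterwards = dfDays.index > x.name
    price_target_is_met = dfDays.price >= x.target
    both_are_true = price_target_is_met & date_is_afterwards
    if both_are_true.any():
        return dfDays[both_are_true].index[0]

hit_query = dfWeeks.apply(find_match, axis=1)
hit_mask = dfWeeks.apply(find_match_alt, axis=1)

print(dfWeeks.assign(target_hit=hit_query))
print(dfWeeks.assign(target_hit=hit_mask))
```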