Binding list to params in Pandas read_sql_query with other params
WARNING! Although my proposed solution here works, it is prone to SQL injection attacks. Therefore, it should never be used directly in backend code! It is only safe for offline analysis.
If you're using Python 3.6+ you could also use a formatted string literal for your query (cf. https://docs.python.org/3/whatsnew/3.6.html#whatsnew36-pep498)
start, end = 201601, 201603
selected_members = (111, 222, 333, 444, 555)  # must be a tuple, so it renders as (111, 222, ...)
query = f"""
SELECT member_id, yearmonth FROM queried_table
WHERE yearmonth BETWEEN {start} AND {end}
AND member_id IN {selected_members}
"""
df = pd.read_sql_query(query, db2conn)
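One caveat with the tuple interpolation above: the f-string inserts Python's repr of the tuple, so a one-element tuple comes out with a trailing comma, which most SQL dialects will likely reject. A quick check using only the standard library (the variable names here are just for illustration):

```python
# The f-string embeds repr(tuple), which only looks like a SQL list
# when the tuple has two or more elements.
many = (111, 222, 333)
single = (111,)

clause_many = f"member_id IN {many}"
clause_single = f"member_id IN {single}"

print(clause_many)    # member_id IN (111, 222, 333)
print(clause_single)  # member_id IN (111,)
```

If a single member is possible, the IN list would need to be built explicitly rather than relying on the tuple's repr.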
Break this up into three parts to help isolate the problem and improve readability:
- Build the SQL string
- Set parameter values
- Execute pandas.read_sql_query
Build SQL
First ensure the ? placeholders are being set correctly. Use str.format with str.join and len to dynamically fill in the ?s based on the length of member_list. The examples below assume member_list has 3 elements.
Example
member_list = (1,2,3)
sql = """select member_id, yearmonth
from queried_table
where yearmonth between {0} and {0}
and member_id in ({1})"""
sql = sql.format('?', ','.join('?' * len(member_list)))
print(sql)
Returns
select member_id, yearmonth
from queried_table
where yearmonth between ? and ?
and member_id in (?,?,?)
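The same idea can be wrapped in a small function so the placeholder count always follows the list length. This is only a sketch: build_sql is a hypothetical helper name, and the guard against an empty list is an added assumption, since "in ()" is not valid in most SQL dialects.

```python
def build_sql(member_count):
    """Return the query text with one ? per member id (hypothetical helper)."""
    if member_count < 1:
        # Assumption: an empty IN list should be rejected up front
        raise ValueError("need at least one member id")
    placeholders = ",".join("?" * member_count)
    return (
        "select member_id, yearmonth\n"
        "from queried_table\n"
        "where yearmonth between ? and ?\n"
        f"and member_id in ({placeholders})"
    )

sql = build_sql(3)
print(sql)
```

Only the number of ? marks depends on the input here; the values themselves never enter the SQL string.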
Set Parameter Values
Now ensure the parameter values are organized into a flat tuple.
Example
# generator to flatten values of irregular nested sequences,
# modified from answers http://stackoverflow.com/questions/952914/making-a-flat-list-out-of-list-of-lists-in-python
def flatten(l):
    for el in l:
        try:
            # el is iterable: recurse into it
            yield from flatten(el)
        except TypeError:
            # el is a scalar (e.g. an int): emit it as-is
            yield el
params = tuple(flatten((201601, 201603, member_list)))
print(params)
Returns
(201601, 201603, 1, 2, 3)
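When the nesting is only one level deep, as it is here, iterable unpacking may be a simpler way to get the same flat tuple. A minimal sketch, assuming member_list is a plain sequence of ids (start and end are illustrative names):

```python
member_list = (1, 2, 3)
start, end = 201601, 201603

# Unpack the member ids directly after the two range bounds
params = (start, end, *member_list)
print(params)  # (201601, 201603, 1, 2, 3)
```

This also avoids a limitation of the recursive generator: a str value would recurse without end, because each character of a string is itself an iterable string.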
Execute
Finally, bring the sql and params values together in the read_sql_query call.
# params must be passed by keyword; the third positional argument is index_col
df = pd.read_sql_query(sql, db2conn, params=params)
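To check the three steps end to end without a DB2 connection, something like the following could be used. It is a sketch under stated assumptions: an in-memory SQLite database stands in for db2conn (sqlite3 also uses ? placeholders), the table contents are made up for the test, and the query runs through the plain DB-API cursor rather than pandas.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for db2conn
conn = sqlite3.connect(":memory:")
conn.execute("create table queried_table (member_id integer, yearmonth integer)")
conn.executemany(
    "insert into queried_table values (?, ?)",
    [(1, 201601), (2, 201602), (3, 201604), (4, 201603)],  # made-up rows
)

member_list = (1, 2, 3)

# Step 1: build the SQL with one ? per member id
sql = """select member_id, yearmonth
from queried_table
where yearmonth between ? and ?
and member_id in ({0})
order by member_id""".format(",".join("?" * len(member_list)))

# Step 2: flat tuple of values, in placeholder order
params = (201601, 201603, *member_list)

# Step 3: execute with bound parameters
rows = conn.execute(sql, params).fetchall()
print(rows)  # [(1, 201601), (2, 201602)]
conn.close()
```

Member 3 is filtered out by the date range and member 4 by the IN list. With pandas, the same sql and params would be passed as pd.read_sql_query(sql, conn, params=params).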