Reading specific partitions from a partitioned parquet dataset with pyarrow
As of pyarrow version 0.10.0 you can use the filters kwarg
to do the query. In your case it would look something like this:
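import pyarrow.parquet as pq
# Keep only the partitions whose part2 key equals 'True'
dataset = pq.ParquetDataset('path-to-your-dataset', filters=[('part2', '=', 'True')])
# Read the selected partitions into a single pyarrow Table
table = dataset.read()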
Question: How do I read specific partitions from a partitioned parquet dataset with pyarrow?
Answer: You can't right now.
Can you create an Apache Arrow JIRA requesting this feature on https://issues.apache.org/jira?
This is something that we should be able to support in the pyarrow API, but it will require someone to implement it. Thank you.