Load a parquet file to S3 directly from a DataFrame: code example
Example 1: read parquet from s3 and convert to dataframe
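import pyarrow.parquet as pq
import s3fs
# '<s3_path_to_folder_or_file>', 'some_value' and some_number are placeholders; replace them with real values
# filters keep only rows where colA == 'some_value' and colB >= some_number
dataset = pq.ParquetDataset('s3://<s3_path_to_folder_or_file>',
    filesystem=s3fs.S3FileSystem(),
    filters=[('colA', '=', 'some_value'), ('colB', '>=', some_number)])
table = dataset.read()
# convert the Arrow table to a pandas DataFrame
df = table.to_pandas()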
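Example 2: pandas write dataframe to parquet on s3
# df is an existing pandas DataFrame, e.g. the one built in Example 1
# pandas needs s3fs installed to write to s3:// URLs
s3_url = 's3://bucket/folder/bucket.parquet.gzip'
df.to_parquet(s3_url, compression='gzip')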