Apache Spark reads for S3: can't pickle thread.lock objects

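Your s3_client isn't serialisable. Spark pickles the function you pass to flatMap, together with everything it references, in order to ship it to the executors. The client holds a thread lock internally, and locks cannot be pickled, hence the error.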
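The failure can be reproduced without Spark or S3. The following is only a minimal sketch: `FakeClient` and `try_pickle` are hypothetical names made up for illustration, and `FakeClient` merely assumes that the real client keeps a thread lock somewhere inside it, as the error message suggests.

```python
import pickle
import threading


class FakeClient:
    """Hypothetical stand-in for s3_client: it holds a thread lock."""

    def __init__(self):
        self._lock = threading.Lock()


def try_pickle(obj):
    """Return None if obj pickles cleanly, otherwise the error text."""
    try:
        pickle.dumps(obj)
        return None
    except (TypeError, pickle.PicklingError) as exc:
        return str(exc)


# An object that carries a lock cannot be serialised.
client_error = try_pickle(FakeClient())

# Plain data (for example the settings needed to build a client) can be.
config_error = try_pickle({"bucket": "example-bucket"})

print("client:", client_error)
print("config:", config_error)
```

The practical consequence is to send only plain data to the executors and to construct the client on the executor itself.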

Use mapPartitions instead of flatMap, and initialise s3_client inside the function you pass to it, so that the client is created once per partition rather than once per record. That will: