PySpark: Convert DataFrame to RDD[string]

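A PySpark Row is a subclass of tuple and can be used as one. In Spark 2.0 and later a DataFrame no longer has its own map method, so go through data.rdd first. After that, all you need is a simple map (or flatMap, if you also want to flatten the rows) with list:

data.rdd.map(list)

or, if the columns have different types and you want every value as a string:

data.rdd.map(lambda row: [str(c) for c in row])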
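Since the title asks for RDD[string], it may help to see the per-row conversion on its own. The sketch below is only an illustration: `row_to_strings`, `row_to_line`, the `sep` parameter and the `FakeRow` stand-in are hypothetical names that are not part of PySpark, and a namedtuple is used in place of a real Row so the snippet runs without a Spark session.

```python
from collections import namedtuple


def row_to_strings(row):
    """Convert every field of a tuple-like row to str (hypothetical helper)."""
    return [str(c) for c in row]


def row_to_line(row, sep=","):
    """Join the stringified fields into one string per row (hypothetical helper)."""
    return sep.join(row_to_strings(row))


# Stand-in for pyspark.sql.Row, which is also a tuple subclass.
FakeRow = namedtuple("FakeRow", ["id", "name", "score"])
sample = FakeRow(1, "alice", 3.5)

print(row_to_strings(sample))
print(row_to_line(sample))

# Assuming `data` is a DataFrame, the same functions would be applied as:
#   data.rdd.map(row_to_strings)   # RDD whose elements are lists of strings
#   data.rdd.map(row_to_line)      # RDD whose elements are single strings
```

Note that str() turns a null field (None in Python) into the text "None", so handle nulls explicitly if that is not what you want.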
