PySpark: Convert DataFrame to RDD[string]
A PySpark Row is just a tuple and can be used as such. All you need here is a simple map (or flatMap, if you want to flatten the rows as well) with list:
data.rdd.map(list)
Or, if the columns have different types and you want every value as a string:
data.rdd.map(lambda row: [str(c) for c in row])
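To see what these per-row functions produce without starting a Spark session, here is a minimal sketch that runs them over plain Python tuples. FakeRow is a hypothetical stand-in for pyspark.sql.Row (which is a tuple subclass), the built-in map stands in for the RDD map, and row_to_line is an assumed helper (not part of the original answer) for the case where you want one string per row rather than a list of strings.

```python
from collections import namedtuple

# Hypothetical stand-in for pyspark.sql.Row, which is a tuple subclass.
# It is used only so this sketch runs without a Spark session.
FakeRow = namedtuple("FakeRow", ["name", "age", "score"])


def row_to_strings(row):
    """Convert every field of a row to str, mirroring the lambda above."""
    return [str(c) for c in row]


def row_to_line(row, sep=","):
    """Assumed helper: join the stringified fields into one string per row."""
    return sep.join(row_to_strings(row))


rows = [FakeRow("alice", 30, 1.5), FakeRow("bob", None, 2.0)]

# Equivalent of map(list): one list per row, original types kept.
as_lists = list(map(list, rows))

# Equivalent of the str-converting map: one list of strings per row.
as_strings = list(map(row_to_strings, rows))

# Equivalent of flatMap(list): all fields of all rows in one flat sequence.
flattened = [c for row in rows for c in row]

# One string per row, which is the closest match to RDD[string].
as_lines = list(map(row_to_line, rows))

print(as_lines)
```

With a real DataFrame the same functions would be passed to the RDD, for example data.rdd.map(row_to_line). Note that None becomes the string "None" under str, so null handling may need its own rule.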