How to save result of printSchema to a file in PySpark
You need treeString (which, for some reason, I couldn't find in the Python API), so call it through the underlying Java DataFrame:
# v will be a string
v = df._jdf.schema().treeString()
You can convert it to an RDD and use saveAsTextFile:
sc.parallelize([v]).saveAsTextFile(...)
Or use Python's own file API to write the string to a file.
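A minimal sketch of the plain-Python route, assuming the file should end up on the driver's local filesystem. The sample string and the output path are made up for illustration. In a real session v would come from df._jdf.schema().treeString() as above.

```python
import os
import tempfile

# Stand-in for: v = df._jdf.schema().treeString()
# (the sample text below is invented so the snippet runs without Spark)
v = "root\n |-- id: long (nullable = true)\n"

# Hypothetical output location; replace with the path you actually want
out_path = os.path.join(tempfile.gettempdir(), "schema.txt")

# Write the schema string to a single local file
with open(out_path, "w", encoding="utf-8") as f:
    f.write(v)
```

This gives you one ordinary file, whereas saveAsTextFile writes a directory of part files, so the plain-Python version is probably the simpler choice when you only want the schema text on the driver.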