How to parse a CSV that uses ^A (i.e. \001) as the delimiter with spark-csv?
The spark-csv GitHub page documents a delimiter option (as you also noted).
Pass the control character as a Unicode escape, like this:
    val df = sqlContext.read
      .format("com.databricks.spark.csv")
      .option("header", "true")       // Use first line of all files as header
      .option("inferSchema", "true")  // Automatically infer data types
      .option("delimiter", "\u0001")  // ^A (Ctrl-A, octal 001) as the field separator
      .load("cars.csv")
With Spark 2.x, where CSV support is built into the DataFrame reader, use the sep option:
    val df = spark.read
      .option("sep", "\u0001")  // ^A (Ctrl-A, octal 001) as the field separator
      .csv("path_to_csv_files")
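If you want to check the behaviour end to end, here is a minimal sketch that writes a tiny ^A-delimited file and reads it back with the sep option. It assumes Spark 2.x or later on the classpath and a local master. The object name CtrlADelimiterExample, the helpers writeSample and readCtrlA, and the sample rows are made up for illustration and are not part of Spark or spark-csv.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.Files

import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical helper object, for illustration only.
object CtrlADelimiterExample {
  // ^A is the control character with code point 1.
  val Sep: String = "\u0001"

  // Write a small sample file (made-up data) and return its URI as a string.
  def writeSample(): String = {
    val path = Files.createTempFile("cars", ".csv")
    val lines = Seq(
      Seq("make", "year").mkString(Sep),
      Seq("Tesla", "2012").mkString(Sep),
      Seq("Ford", "1997").mkString(Sep)
    )
    Files.write(path, lines.mkString("\n").getBytes(StandardCharsets.UTF_8))
    path.toUri.toString
  }

  // Read a ^A-delimited file with a header row using the built-in CSV reader.
  def readCtrlA(spark: SparkSession, path: String): DataFrame =
    spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .option("sep", Sep)
      .csv(path)

  def main(args: Array[String]): Unit = {
    // Assumption: a local master is fine for a quick check.
    val spark = SparkSession.builder()
      .appName("ctrl-a-delimiter")
      .master("local[*]")
      .getOrCreate()

    val df = readCtrlA(spark, writeSample())
    df.printSchema()
    df.show()

    spark.stop()
  }
}
```

If the columns come back split correctly (here, make and year), the delimiter is being picked up. If everything lands in a single column, the file is probably using a different separator than ^A.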