Explicit cast reading .csv with case class Spark 2.1.0
You just need to explicitly cast your field to a Double:
import org.apache.spark.sql.types.DoubleType
import spark.implicits._   // enables the 'UnitPrice symbol syntax and the Encoder for OrderDetails

val orderDetails = spark.read
  .option("header", "true")
  .csv(inputFiles + "NW-Order-Details.csv")
  .withColumn("unitPrice", 'UnitPrice.cast(DoubleType))  // cast the string column to Double
  .as[OrderDetails]
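To confirm the cast took effect, you can inspect the resulting Dataset before using it further (a minimal check, assuming the read above succeeded):

orderDetails.printSchema()   // unitPrice should now be reported as double
orderDetails.show(5)         // peek at a few rows

If Spark also complains about up-casting the other numeric fields, the same cast pattern applies to those columns before calling as[OrderDetails].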
On a side note, by Scala (and Java) convention, your case class constructor parameters should be lower camel case:
case class OrderDetails(orderID: String,
                        productID: String,
                        unitPrice: Double,
                        qty: Int,
                        discount: Double)
If we want to change the data type of multiple columns, chaining withColumn calls quickly gets ugly. A cleaner way is to apply a schema when reading the data:
- Get the case class schema using Encoders, as shown below:
import org.apache.spark.sql.Encoders
val caseClassSchema = Encoders.product[CaseClass].schema
- Apply this schema while reading the data, then finish the read with the usual csv / as calls (a complete sketch follows below):
val data = spark.read.schema(caseClassSchema)
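Putting both steps together with the OrderDetails case class and the same file as above, a minimal sketch could look like this (one assumption: the case class fields are declared in the same order as the CSV columns, since Spark uses the supplied schema rather than the header names when an explicit schema is given):

import org.apache.spark.sql.Encoders
import spark.implicits._   // Encoder needed for the typed Dataset

// Derive the schema from the case class instead of casting columns one by one
val orderDetailsSchema = Encoders.product[OrderDetails].schema

val orderDetails = spark.read
  .option("header", "true")            // header row is skipped; names and types come from the schema
  .schema(orderDetailsSchema)
  .csv(inputFiles + "NW-Order-Details.csv")
  .as[OrderDetails]

With this approach unitPrice, qty and discount are typed at read time, so no withColumn casts are needed.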