Explicit cast reading .csv with case class Spark 2.1.0
You just need to explicitly cast your field to a Double:
import org.apache.spark.sql.types.DoubleType
import spark.implicits._   // enables the 'UnitPrice symbol syntax and the Encoder for OrderDetails

val orderDetails = spark.read
  .option("header", "true")
  .csv(inputFiles + "NW-Order-Details.csv")
  .withColumn("unitPrice", 'UnitPrice.cast(DoubleType))  // cast the string column to Double
  .as[OrderDetails]
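To confirm the cast took effect, you can inspect the resulting Dataset before using it further (a minimal check, assuming the read above succeeded):

orderDetails.printSchema()   // unitPrice should now be reported as double
orderDetails.show(5)         // peek at a few rows

If Spark also complains about up-casting the other numeric fields, the same cast pattern applies to those columns before calling as[OrderDetails].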
On a side note, by Scala (and Java) convention, your case class constructor parameters should be lower camel case:
case class OrderDetails(orderID: String,
                        productID: String,
                        unitPrice: Double,
                        qty: Int,
                        discount: Double)
If we want to change the data type of multiple columns, chaining withColumn calls quickly gets ugly. A cleaner way is to apply a schema when reading the data:
- Get the case class schema using Encoders, as shown below:
import org.apache.spark.sql.Encoders
val caseClassSchema = Encoders.product[CaseClass].schema
- Apply this schema while reading the data, then finish the read with the usual csv / as calls (a complete sketch follows below):
val data = spark.read.schema(caseClassSchema)
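Putting both steps together with the OrderDetails case class and the same file as above, a minimal sketch could look like this (one assumption: the case class fields are declared in the same order as the CSV columns, since Spark uses the supplied schema rather than the header names when an explicit schema is given):

import org.apache.spark.sql.Encoders
import spark.implicits._   // Encoder needed for the typed Dataset

// Derive the schema from the case class instead of casting columns one by one
val orderDetailsSchema = Encoders.product[OrderDetails].schema

val orderDetails = spark.read
  .option("header", "true")            // header row is skipped; names and types come from the schema
  .schema(orderDetailsSchema)
  .csv(inputFiles + "NW-Order-Details.csv")
  .as[OrderDetails]

With this approach unitPrice, qty and discount are typed at read time, so no withColumn casts are needed.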