Clone/Deep-Copy a Spark DataFrame
DataFrames are immutable, so you don't need to deep-copy them. You can reuse the same DataFrame as many times as you like: every operation creates a new DataFrame and leaves the original unmodified.
For example:
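import spark.implicits._ // assumes a SparkSession named spark, as in spark-shell
val df = List(1, 2, 3).toDF("id")
val df1 = df.as("df1") // second DataFrame: df under the alias "df1"
val df2 = df.as("df2") // third DataFrame: df under the alias "df2"
val joined = df1.join(df2, $"df1.id" === $"df2.id") // fourth DataFrame; df is still unmodified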
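This may look like a waste of resources, but because the data inside a DataFrame is immutable as well, all four DataFrames can share references to the same underlying objects instead of copying them. Transformations such as as and join are also evaluated lazily, so defining a new DataFrame does not by itself duplicate any rows.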
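To make this concrete, here is a minimal, self-contained sketch. It assumes a local SparkSession created only for illustration; the object name CopyDataFrameSketch and the variable names doubled and copy are hypothetical. It first checks that a transformation leaves the original DataFrame untouched, then shows one possible way to get a separate DataFrame object if you want one anyway, by rebuilding it from the original's RDD and schema. This is one option, not the only way to do it.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object CopyDataFrameSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: a throwaway local session, for illustration only.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("copy-dataframe-sketch")
      .getOrCreate()
    import spark.implicits._

    val df = List(1, 2, 3).toDF("id")

    // A transformation returns a new DataFrame; df itself is not changed.
    val doubled = df.withColumn("id", $"id" * 2)
    assert(df.as[Int].collect().sorted.sameElements(Array(1, 2, 3)))
    assert(doubled.as[Int].collect().sorted.sameElements(Array(2, 4, 6)))

    // If a distinct DataFrame object is still wanted, rebuild one
    // from the original's RDD and schema.
    val copy: DataFrame = spark.createDataFrame(df.rdd, df.schema)
    assert(copy.schema == df.schema)
    assert(copy.count() == df.count())

    spark.stop()
  }
}
```

In most cases the explicit copy is unnecessary: passing df around, or aliasing it with as, is enough, because nothing can modify it in place.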