spark dataframe to pandas dataframe code example
Example 1: convert pandas dataframe to spark dataframe
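import pandas as pd
from pyspark.sql import SparkSession
# Placeholder: replace with the path to your CSV file
filename = '<path to file>'
# Start (or reuse) a Spark session
spark = SparkSession.builder.appName('pandasToSpark').getOrCreate()
# Read the CSV into pandas, then convert it to a Spark DataFrame
pandas_df = pd.read_csv(filename)
spark_df = spark.createDataFrame(pandas_df)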
Example 2: create spark dataframe from pandas
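import numpy as np
import pandas as pd
from pyspark.sql import SparkSession
spark = SparkSession.builder.getOrCreate()
# Enable Arrow-based transfers (Spark 3.0+ key; older releases use "spark.sql.execution.arrow.enabled")
spark.conf.set("spark.sql.execution.arrow.pyspark.enabled", "true")
# Build a 100x3 pandas DataFrame of random values
pdf = pd.DataFrame(np.random.rand(100, 3))
# Convert it to a Spark DataFrame
df = spark.createDataFrame(pdf)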
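The examples above go from pandas to Spark. For the opposite direction, a Spark DataFrame exposes `toPandas()`, which collects every row into driver memory. The sketch below wraps that call and optionally caps the row count first. The helper name `spark_to_pandas` and its `max_rows` parameter are assumptions made for illustration, not part of PySpark.

```python
def spark_to_pandas(spark_df, max_rows=None):
    """Convert a Spark DataFrame to a pandas DataFrame.

    `spark_to_pandas` and `max_rows` are hypothetical names for this sketch.
    """
    if max_rows is not None:
        # limit() is applied in Spark, before any rows reach the driver
        spark_df = spark_df.limit(max_rows)
    # toPandas() collects the (possibly limited) result into driver memory
    return spark_df.toPandas()
```

With the `df` from Example 2, `spark_to_pandas(df, max_rows=10)` would return a pandas DataFrame holding at most 10 rows. Capping the size matters because the whole result has to fit in the driver's memory.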