Select columns in PySpark dataframe
To keep only the columns whose names appear in a given list, filter df.columns and pass the result to select:
df.select([c for c in df.columns if c in ['_2','_4','_5']]).show()
First two columns and the first 5 rows (take returns a list of Row objects rather than printing them):
df.select(df.columns[:2]).take(5)