PySpark RDD example

Example: parallelize and foreach in PySpark

----------------------------------------foreach.py---------------------------------------
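from pyspark import SparkContext

# Run Spark locally with a single worker thread
sc = SparkContext("local", "ForEach app")

# Distribute a Python list as an RDD
words = sc.parallelize([
    "scala",
    "java",
    "hadoop",
    "spark",
    "akka",
    "spark vs hadoop",
    "pyspark",
    "pyspark and spark",
])

def f(x):
    print(x)

# foreach is an action that returns None, so its result is not assigned
words.foreach(f)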
----------------------------------------foreach.py---------------------------------------
Output (printed to the console in local mode):

scala
java
hadoop
spark
akka
spark vs hadoop
pyspark
pyspark and spark
