Return a new RDD containing only the elements that satisfy a predicate.
Example:
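from pyspark import SparkContext

sc = SparkContext.getOrCreate()  # reuse the active context, or start a local one
rdd = sc.parallelize([1, 2, 3, 4, 5])
rdd.filter(lambda x: x % 2 == 0).collect()  # keep only the even numbers
# [2, 4]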
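
As a rough mental model, filtering can be pictured as applying the predicate to each partition independently and lazily, with the results combined only when an action such as collect() runs. The plain-Python sketch below illustrates that idea without Spark. The two-way partition split and the names filter_partition and is_even are assumptions made for illustration, not part of the Spark API.

```python
from typing import Callable, Iterable, Iterator, List


def filter_partition(
    predicate: Callable[[int], bool], partition: Iterable[int]
) -> Iterator[int]:
    """Lazily yield the elements of one partition that satisfy the predicate."""
    return (x for x in partition if predicate(x))


def is_even(x: int) -> bool:
    """Predicate used in the example above."""
    return x % 2 == 0


# Hypothetical split of [1, 2, 3, 4, 5] into two partitions.
partitions: List[List[int]] = [[1, 2], [3, 4, 5]]

# Build one lazy filter per partition; nothing is evaluated yet.
filtered = [filter_partition(is_even, p) for p in partitions]

# Analogous to collect(): consume every partition and concatenate in order.
result = [x for part in filtered for x in part]
print(result)  # [2, 4]
```

Because the predicate sees one element at a time, it should be a pure function of that element: it cannot rely on neighbouring elements or on the order in which partitions are processed.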