Find out the partition no/id
Indeed, mapPartitionsWithIndex will give you an iterator and the partition index. (This isn't the same as reduce, of course, but you could combine its result with aggregate.)
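As a rough sketch of that idea: the snippet below sums each partition with mapPartitionsWithIndex and then folds the (index, sum) pairs into a single map with aggregate. The object name, the helper sumsByPartition, the sample data and the local[2] master are all made up for illustration, and it assumes Spark is on the classpath.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

// Hypothetical example object; the names are illustrative only.
object PartitionIndexSketch {

  // Returns a map from partition index to the sum of that partition's elements.
  def sumsByPartition(rdd: RDD[Int]): Map[Int, Int] = {
    // Emit exactly one (partitionIndex, partialSum) pair per partition.
    val perPartition = rdd.mapPartitionsWithIndex { (index, iter) =>
      Iterator(index -> iter.sum)
    }
    // Fold the pairs into one Map: seqOp adds a pair, combOp merges two maps.
    perPartition.aggregate(Map.empty[Int, Int])(
      (acc, pair) => acc + pair,
      (left, right) => left ++ right
    )
  }

  def main(args: Array[String]): Unit = {
    // Local master chosen only so the sketch runs without a cluster.
    val conf = new SparkConf().setAppName("partition-index-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val rdd = sc.parallelize(1 to 8, numSlices = 4)
      sumsByPartition(rdd).toSeq.sortBy(_._1).foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

Emitting one pair per partition keeps the amount of data passed to aggregate small; a plain collect() would work just as well here.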
You can also use TaskContext.getPartitionId(), e.g., in lieu of the presently missing foreachPartitionWithIndex():
https://github.com/apache/spark/pull/5927#issuecomment-99697229
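A minimal sketch of that workaround, assuming a Spark version that exposes the static TaskContext.getPartitionId(): inside foreachPartition the closure only receives the iterator, so the partition id is read from the task context instead. The object name, the helper logPartitions and the sample data are hypothetical.

```scala
import org.apache.spark.{SparkConf, SparkContext, TaskContext}
import org.apache.spark.rdd.RDD

// Hypothetical example object; the names are illustrative only.
object ForeachPartitionSketch {

  // Prints one line per partition, using the id taken from the running task.
  def logPartitions(rdd: RDD[Int]): Unit = {
    rdd.foreachPartition { iter =>
      // Must be called inside the task, i.e. on the executor side.
      val partitionId = TaskContext.getPartitionId()
      println(s"partition $partitionId holds ${iter.size} element(s)")
    }
  }

  def main(args: Array[String]): Unit = {
    // Local master chosen only so the sketch runs without a cluster.
    val conf = new SparkConf().setAppName("foreach-partition-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      logPartitions(sc.parallelize(1 to 8, numSlices = 4))
    } finally {
      sc.stop()
    }
  }
}
```

On a real cluster the println output ends up in the executor logs rather than on the driver, so this is mainly useful for side effects such as writing each partition to its own file or connection.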