setup and cleanup methods of Mapper/Reducer in Hadoop MapReduce

One clarification is helpful. The setup() and cleanup() methods handle initialization and cleanup at the task level. Within a task, setup() is called exactly once, then map() [or reduce()] is called once for each input record [or key], and finally cleanup() is called exactly once before the task exits.
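The call order described above can be illustrated with a small plain-Java sketch. This is only a simulation: TaskLifecycleSketch and its methods are hypothetical stand-ins that take no Context argument, not the real Hadoop Mapper API, and the "split" is just an in-memory list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a single task; NOT the real Hadoop Mapper class.
class TaskLifecycleSketch {
    // Records the order in which the lifecycle hooks fire.
    final List<String> calls = new ArrayList<>();

    void setup() { calls.add("setup"); }

    void map(String record) { calls.add("map(" + record + ")"); }

    void cleanup() { calls.add("cleanup"); }

    // Same shape as a task's run(): one setup, one map per record, one cleanup.
    void run(List<String> split) {
        setup();
        try {
            for (String record : split) {
                map(record);
            }
        } finally {
            cleanup();
        }
    }

    public static void main(String[] args) {
        // Two independent tasks, each with its own (assumed) input split.
        TaskLifecycleSketch first = new TaskLifecycleSketch();
        first.run(Arrays.asList("a", "b", "c"));
        TaskLifecycleSketch second = new TaskLifecycleSketch();
        second.run(Arrays.asList("d"));

        System.out.println(first.calls);  // [setup, map(a), map(b), map(c), cleanup]
        System.out.println(second.calls); // [setup, map(d), cleanup]
    }
}
```

Each task object gets its own setup/cleanup pair, no matter how many records it processes.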


They are called once per task, so if you have 20 mappers running, setup() and cleanup() will each be called 20 times, once in each mapper.

One gotcha is that the standard run() method for both Mapper and Reducer does not catch exceptions around the map() / reduce() calls, so if one of these methods throws an exception, cleanup() will not be called.

2020 Edit: As noted in the comments, this statement from 2012 (Hadoop 0.20) is no longer true: cleanup() is now called from a finally block, so it runs even if map() or reduce() throws.
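To make the difference concrete, here is a minimal plain-Java sketch of the two shapes of run(). Everything in it is hypothetical (the class, the method names, and the empty string standing in for a bad record); it simulates the control flow only and does not use the Hadoop API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a task; names are illustrative, not Hadoop API.
class CleanupOnFailureSketch {
    final List<String> calls = new ArrayList<>();

    void setup() { calls.add("setup"); }

    // Fails on an empty record to simulate an exception inside map().
    void map(String record) {
        if (record.isEmpty()) {
            throw new IllegalStateException("bad record");
        }
        calls.add("map");
    }

    void cleanup() { calls.add("cleanup"); }

    // Older shape: no try/finally, so a failure in map() skips cleanup().
    void runWithoutFinally(List<String> split) {
        setup();
        for (String record : split) {
            map(record);
        }
        cleanup();
    }

    // Newer shape: cleanup() sits in a finally block and always runs.
    void runWithFinally(List<String> split) {
        setup();
        try {
            for (String record : split) {
                map(record);
            }
        } finally {
            cleanup();
        }
    }

    // Runs one task over a split whose second record makes map() throw.
    static List<String> attempt(boolean useFinally) {
        CleanupOnFailureSketch task = new CleanupOnFailureSketch();
        List<String> split = Arrays.asList("ok", "");
        try {
            if (useFinally) {
                task.runWithFinally(split);
            } else {
                task.runWithoutFinally(split);
            }
        } catch (IllegalStateException e) {
            task.calls.add("failed");
        }
        return task.calls;
    }

    public static void main(String[] args) {
        System.out.println(attempt(false)); // [setup, map, failed]
        System.out.println(attempt(true));  // [setup, map, cleanup, failed]
    }
}
```

If you are on a very old release and rely on cleanup() to release resources, it may be worth checking the run() method of your Hadoop version to see which shape it has.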


They are called once per Mapper task or Reducer task. Here is the Hadoop code for the Reducer's run() method:

public void run(Context context) throws IOException, InterruptedException {
    setup(context);
    try {
        while (context.nextKey()) {
            reduce(context.getCurrentKey(), context.getValues(), context);
        }
    } finally {
        cleanup(context);
    }
}