Hadoop: How to unit test FileSystem
If you're using Hadoop 2.0.0 or above, consider using hadoop-minicluster:
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-minicluster</artifactId>
    <version>2.5.0</version>
    <scope>test</scope>
</dependency>
With it, you can spin up a temporary HDFS cluster on your local machine and run your tests against it. A setUp method may look like this:
import java.nio.file.Files;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.MiniDFSCluster;

// Root the mini cluster's storage in a fresh temporary directory
baseDir = Files.createTempDirectory("test_hdfs").toFile().getAbsoluteFile();
Configuration conf = new Configuration();
conf.set(MiniDFSCluster.HDFS_MINIDFS_BASEDIR, baseDir.getAbsolutePath());
MiniDFSCluster.Builder builder = new MiniDFSCluster.Builder(conf);
hdfsCluster = builder.build();
// The NameNode binds an ephemeral port, so derive the URI from the cluster
String hdfsURI = "hdfs://localhost:" + hdfsCluster.getNameNodePort() + "/";
DistributedFileSystem fileSystem = hdfsCluster.getFileSystem();
And in a tearDown method you should shut down the mini HDFS cluster and remove the temporary directory:
hdfsCluster.shutdown();
FileUtil.fullyDelete(baseDir);
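Between setUp and tearDown, tests can then exercise real HDFS semantics through the cluster's file system. A minimal sketch, assuming fileSystem is kept as a field rather than a local variable:

import static org.junit.Assert.assertTrue;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.Path;
import org.junit.Test;

@Test
public void writesAndFindsFile() throws Exception {
    Path path = new Path("/test/greeting.txt");
    // create() goes through the mini cluster's real HDFS write pipeline
    try (FSDataOutputStream out = fileSystem.create(path)) {
        out.writeUTF("hello");
    }
    assertTrue(fileSystem.exists(path));
}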
Why not use a mocking framework like Mockito or PowerMock to mock your interactions with the FileSystem? Your unit tests should not depend on an actual FileSystem; they should just verify that your code interacts with the FileSystem correctly.
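For example, a class that receives a FileSystem as a dependency can be tested entirely against a mock. A minimal sketch with Mockito, where MyHdfsClient is a hypothetical class under test (the surrounding test method must declare throws Exception, since FileSystem methods throw IOException):

import static org.mockito.Mockito.*;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// No real file system is touched; FileSystem is an abstract class Mockito can mock
FileSystem fs = mock(FileSystem.class);
when(fs.exists(new Path("/data/input"))).thenReturn(true);

MyHdfsClient client = new MyHdfsClient(fs); // hypothetical class under test
client.run();

// Verify the code under test made the expected call
verify(fs).exists(new Path("/data/input"));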
Take a look at the hadoop-test jar:
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-test</artifactId>
    <version>0.20.205.0</version>
</dependency>
It has classes for setting up a MiniDFSCluster and a MiniMRCluster, so you can test without a running Hadoop cluster.
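A minimal sketch of the older API, assuming the 0.20-era MiniDFSCluster constructor (one DataNode, format on startup, default rack topology):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hdfs.MiniDFSCluster;

Configuration conf = new Configuration();
MiniDFSCluster cluster = new MiniDFSCluster(conf, 1, true, null);
FileSystem fs = cluster.getFileSystem();
// ... exercise fs in your tests ...
cluster.shutdown();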
One possible way would be to use JUnit 4.7's TemporaryFolder rule.
See: http://www.infoq.com/news/2009/07/junit-4.7-rules or http://weblogs.java.net/blog/johnsmart/archive/2009/09/29/working-temporary-files-junit-47.
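This only covers code written against the generic FileSystem API, backed by the local file system rather than HDFS. A minimal sketch, pointing a LocalFileSystem at the rule's folder:

import static org.junit.Assert.assertTrue;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.junit.Rule;
import org.junit.Test;
import org.junit.rules.TemporaryFolder;

public class LocalFileSystemTest {
    @Rule
    public TemporaryFolder tmp = new TemporaryFolder(); // deleted automatically after each test

    @Test
    public void writesFile() throws Exception {
        FileSystem fs = FileSystem.getLocal(new Configuration());
        Path out = new Path(tmp.getRoot().getAbsolutePath(), "out.txt");
        fs.create(out).close();
        assertTrue(fs.exists(out));
    }
}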