Cannot Read a file from HDFS using Spark

This will work:

val textFile = sc.textFile("hdfs://localhost:9000/user/input.txt")

Here, localhost:9000 is the value of the fs.defaultFS parameter in Hadoop's core-site.xml config file.

Here is the solution: pass the full URL, including the NameNode address nn1home:8020, to sc.textFile.


How did I find out nn1home:8020?

Just search for the file core-site.xml and look for the XML property named fs.defaultFS.
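If you would rather not read the file by eye, here is a minimal sketch that pulls a property out of Hadoop-style configuration XML using only the JDK's XML parser. The helper names hadoopProperty and childText are made up for illustration, and the sample XML is an assumption: it only mirrors the usual core-site.xml layout with the nn1home:8020 address from this thread, so substitute the contents of your own file.

```scala
import java.io.ByteArrayInputStream
import javax.xml.parsers.DocumentBuilderFactory
import org.w3c.dom.Element

// Text of the first child element with the given tag, if there is one.
def childText(e: Element, tag: String): Option[String] =
  Option(e.getElementsByTagName(tag).item(0)).map(_.getTextContent.trim)

// Hypothetical helper: value of a named <property> in Hadoop-style config XML.
def hadoopProperty(xml: String, key: String): Option[String] = {
  val doc = DocumentBuilderFactory.newInstance()
    .newDocumentBuilder()
    .parse(new ByteArrayInputStream(xml.getBytes("UTF-8")))
  val props = doc.getElementsByTagName("property")
  (0 until props.getLength)
    .map(i => props.item(i).asInstanceOf[Element])
    .find(p => childText(p, "name").contains(key))
    .flatMap(p => childText(p, "value"))
}

// Sample content only (assumed layout); replace with your real core-site.xml.
val sampleCoreSite =
  """<configuration>
    |  <property>
    |    <name>fs.defaultFS</name>
    |    <value>hdfs://nn1home:8020</value>
    |  </property>
    |</configuration>""".stripMargin

val defaultFs = hadoopProperty(sampleCoreSite, "fs.defaultFS")
println(defaultFs)
```

In spark-shell, sc.hadoopConfiguration.get("fs.defaultFS") should give the same value, provided Spark is picking up your Hadoop configuration directory.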

You are not passing a proper URL string. A complete URL has these parts:

  • hdfs:// - protocol type
  • localhost - host name or IP address (may be different for you)
  • 54310 - port number
  • /input/war-and-peace.txt - Complete path to the file you want to load.

Finally, the URL should look like this: hdfs://localhost:54310/input/war-and-peace.txt
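As a quick sanity check, the parts listed above can be assembled with java.net.URI. This is only a sketch: the host and port are the example values from the list and will likely differ on your cluster, and the variable names are chosen for illustration.

```scala
import java.net.URI

// Example values from the list above; adjust host and port for your cluster.
val scheme = "hdfs"
val host   = "localhost"
val port   = 54310
val path   = "/input/war-and-peace.txt"

// Constructor arguments: scheme, userInfo, host, port, path, query, fragment.
val url = new URI(scheme, null, host, port, path, null, null).toString
println(url)

// In spark-shell, where sc is the SparkContext:
// val textFile = sc.textFile(url)
```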


If you want to use sc.textFile("hdfs://..."), you need to give the full (absolute) path, including the host and port; in your example that would start with "nn1home:8020/..".

If you want to make it simple, then just use sc.textFile("hdfs:/input/war-and-peace.txt")

That's only one / after hdfs:, so the URL names no host and the default file system from fs.defaultFS is used.
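To see why the number of slashes matters, here is a small sketch that parses both forms with java.net.URI. The two-slash URL without a host is a hypothetical example of the mistake, not something taken from the question.

```scala
import java.net.URI

// Two slashes: the first segment after hdfs:// is parsed as the host.
val twoSlashes = new URI("hdfs://input/war-and-peace.txt")

// One slash: there is no host at all, only an absolute path.
val oneSlash = new URI("hdfs:/input/war-and-peace.txt")

println(s"host=${twoSlashes.getHost}, path=${twoSlashes.getPath}")
println(s"host=${oneSlash.getHost}, path=${oneSlash.getPath}")
```

With two slashes and no host, "input" ends up being treated as the NameNode address and the path loses its first directory, which is one common way to end up with an unreadable file.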