Cannot Read a file from HDFS using Spark
This will work:
val textFile = sc.textFile("hdfs://localhost:9000/user/input.txt")
Here, you can take localhost:9000 from the value of the fs.defaultFS property in Hadoop's core-site.xml config file.
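Putting those pieces together, building the URL is just string concatenation. Here is a minimal sketch in plain Scala, assuming a hypothetical fs.defaultFS value and file path (yours will likely differ):

```scala
// Value copied from the fs.defaultFS property in core-site.xml
// (hypothetical example -- check your own config)
val defaultFs = "hdfs://localhost:9000"

// Absolute path of the file inside HDFS (assumed example path)
val filePath = "/user/input.txt"

// The full URL to pass to sc.textFile
val url = s"$defaultFs$filePath"
// then: val textFile = sc.textFile(url)
```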
Here is the solution:
sc.textFile("hdfs://nn1home:8020/input/war-and-peace.txt")
How did I find out nn1home:8020? Just search for the file core-site.xml and look for the XML element fs.defaultFS.
You are not passing a proper URL string.
hdfs:// - protocol type
localhost - ip address (may be different for you, e.g. 127.56.78.4)
54310 - port number
/input/war-and-peace.txt - complete path to the file you want to load.
Finally, the URL should look like this:
hdfs://localhost:54310/input/war-and-peace.txt
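If you want to sanity-check the pieces of such a URL, you can decompose it with java.net.URI in plain Scala (no Spark needed). This sketch uses the example URL above:

```scala
import java.net.URI

// The example HDFS URL from above
val uri = new URI("hdfs://localhost:54310/input/war-and-peace.txt")

val scheme = uri.getScheme // protocol type
val host   = uri.getHost   // ip address / hostname
val port   = uri.getPort   // port number
val path   = uri.getPath   // complete path to the file
```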
If you want to use sc.textFile("hdfs://...") you need to give the full (absolute) path; in your example, that would be "nn1home:8020/..".
If you want to make it simple, then just use sc.textFile("hdfs:/input/war-and-peace.txt")
Note that is only one /.