Hadoop fs lookup for block size?
It seems hadoop fs doesn't have an option for this, but hadoop fsck does.
You can try this:
$HADOOP_HOME/bin/hadoop fsck /path/to/file -files -blocks
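If you only want the number of blocks, you can filter that listing; this is just a sketch and assumes the block IDs in the fsck output carry the usual blk_ prefix:

# count the blocks fsck reports for the file by filtering on the block-id prefix
$HADOOP_HOME/bin/hadoop fsck /path/to/file -files -blocks | grep -c "blk_"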
Try the code below:
path=hdfs://a/b/c
size=`hdfs dfs -count ${path} | awk '{print $3}'`
echo $size
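Note that the third column of hdfs dfs -count is the content size in bytes, not the block size. If what you actually want is the block count, a rough sketch (assuming ${path} points to a single file, and using the -stat trick shown further down) would be:

# round the content size up by the file's block size to estimate the block count
blocksize=`hadoop fs -stat %o ${path}`
echo $(( (size + blocksize - 1) / blocksize ))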
I think it should be doable with:
hadoop fsck /filename -blocks
but I get "Connection refused".
The fsck commands in the other answers list the blocks and allow you to see the number of blocks. However, to see the actual block size in bytes with no extra cruft, do:
hadoop fs -stat %o /filename
The default block size can be found with:
hdfs getconf -confKey dfs.blocksize
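Combining the two, here is a quick sketch for checking whether a file deviates from the default; it assumes dfs.blocksize is configured in plain bytes (see the units note below):

file_bs=`hadoop fs -stat %o /filename`
default_bs=`hdfs getconf -confKey dfs.blocksize`
# the string comparison only works if the config value is written in bytes
[ "$file_bs" = "$default_bs" ] && echo "default block size" || echo "custom block size: $file_bs bytes"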
Details about units
The units for the block size are not documented in the hadoop fs -stat command; however, looking at the source line and the docs for the method it calls, we can see it uses bytes and cannot report block sizes over about 9 exabytes.
The units for the hdfs getconf command may not be bytes. It returns whatever string is being used for dfs.blocksize in the configuration file. (This can be seen in the source for the final function and its indirect caller.)
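If you need that value in bytes regardless of how it was written, here is a crude normalization sketch (the k/m/g suffix handling is an assumption about how the cluster's config is written):

raw=`hdfs getconf -confKey dfs.blocksize`
# dfs.blocksize may be given as plain bytes (e.g. 134217728) or with a size suffix (e.g. 128m)
case "$raw" in
  *[kK]) bytes=$(( ${raw%?} * 1024 )) ;;
  *[mM]) bytes=$(( ${raw%?} * 1024 * 1024 )) ;;
  *[gG]) bytes=$(( ${raw%?} * 1024 * 1024 * 1024 )) ;;
  *)     bytes=$raw ;;
esac
echo $bytes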