How to create an IPFS compatible multihash

A file in IPFS is 'transformed' into a UnixFS file, which is a representation of files in a DAG. In your example, you are hashing multihash.txt directly with sha2-256, but what happens inside IPFS is:

  • file gets chunked into 256KiB pieces
  • each chunk goes into a DAG node inside a UnixFS protobuf (https://github.com/ipfs/js-ipfs-unixfs)
  • a DAG is created with links to all the chunks.
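
The first step in the list above can be sketched in a few lines of Node.js. This is only an illustration of fixed-size chunking, assuming the 256 KiB size mentioned above. The `chunkBuffer` helper is a hypothetical name, and the real IPFS importer also handles DAG layout and other chunking strategies.

```javascript
// Split a Buffer into fixed-size chunks (hypothetical helper, not an IPFS API).
const CHUNK_SIZE = 256 * 1024 // 256 KiB, the chunk size mentioned above

function chunkBuffer (data, chunkSize = CHUNK_SIZE) {
  const chunks = []
  for (let offset = 0; offset < data.length; offset += chunkSize) {
    // subarray() returns a view on the same memory, so no bytes are copied
    chunks.push(data.subarray(offset, offset + chunkSize))
  }
  return chunks
}

// A 600 KiB buffer yields two full chunks and one 88 KiB remainder
const chunks = chunkBuffer(Buffer.alloc(600 * 1024))
console.log(chunks.map(c => c.length)) // [ 262144, 262144, 90112 ]
```

A file smaller than one chunk ends up as a single chunk, which is why small files map to a single DAG node.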

IPFS uses multihash where the format is the following:

base58(<varint hash function code><varint digest size in bytes><hash function output>)

The list of hash function codes can be found in this table.
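
Both the function code and the digest size in the format above are varints. A minimal sketch of unsigned varint encoding (7 bits per byte, least significant group first, high bit set on every byte except the last) could look like the following. `encodeVarint` is a hypothetical helper name, not part of any multihash library.

```javascript
// Unsigned varint encoder: 7 payload bits per byte, continuation bit 0x80.
function encodeVarint (value) {
  if (!Number.isInteger(value) || value < 0) {
    throw new RangeError('value must be a non-negative integer')
  }
  const bytes = []
  while (value >= 0x80) {
    bytes.push((value & 0x7f) | 0x80) // low 7 bits plus continuation bit
    value = Math.floor(value / 128)   // division avoids 32-bit shift truncation
  }
  bytes.push(value) // final byte has the high bit clear
  return Buffer.from(bytes)
}

console.log(encodeVarint(0x12).toString('hex')) // 12
console.log(encodeVarint(0x20).toString('hex')) // 20
console.log(encodeVarint(300).toString('hex'))  // ac02
```

Values below 0x80 encode to a single identical byte, which is why the sha2-256 code (0x12) and size (0x20) can be written directly as the bytes 12 and 20.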

Here's some pseudocode of the process using SHA2-256 as the hashing function.

sha2-256   size  sha2-256("hello world")
0x12       0x20  0xb94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9

Concatenating those three items will produce

1220b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9

You then encode that to base58:

QmaozNR7DZHQK1ZcU9p7QdrshMvXqWK6gpu5rmrkPdT3L4

Here's an example of what is essentially a multihash implementation in JavaScript:


    const crypto = require('crypto')
    const bs58 = require('bs58')
    
    const data = 'hello world'
    
    const hashFunction = Buffer.from('12', 'hex') // 0x12 is the code for sha2-256
    
    const digest = crypto.createHash('sha256').update(data).digest()
    
    console.log(digest.toString('hex')) // b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9
    
    const digestSize = Buffer.from([digest.byteLength]) // single byte; fine while the size is below 0x80
    
    console.log(digestSize.toString('hex')) // 20
    
    const combined = Buffer.concat([hashFunction, digestSize, digest])
    
    console.log(combined.toString('hex')) // 1220b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9
    
    const multihash = bs58.encode(combined)
    
    console.log(multihash.toString()) // QmaozNR7DZHQK1ZcU9p7QdrshMvXqWK6gpu5rmrkPdT3L4
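
Going the other way can be a useful sanity check. The sketch below decodes a base58 multihash back into its three parts without any dependencies. The `base58Decode` and `decodeMultihash` helpers are hypothetical names written for this illustration, and the parser assumes the code and size each fit in one varint byte, which holds for sha2-256.

```javascript
const crypto = require('crypto')

// Bitcoin-style base58 alphabet (no 0, O, I or l)
const ALPHABET = '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz'

// Minimal base58 decoder using BigInt (hypothetical helper).
function base58Decode (str) {
  let n = 0n
  for (const ch of str) {
    const idx = ALPHABET.indexOf(ch)
    if (idx === -1) throw new Error(`invalid base58 character: ${ch}`)
    n = n * 58n + BigInt(idx)
  }
  let hex = n.toString(16)
  if (hex.length % 2) hex = '0' + hex
  const body = n === 0n ? Buffer.alloc(0) : Buffer.from(hex, 'hex')
  // Each leading '1' stands for one leading zero byte
  let zeros = 0
  while (zeros < str.length && str[zeros] === '1') zeros++
  return Buffer.concat([Buffer.alloc(zeros), body])
}

// Split a multihash into code, size and digest.
// Assumes single-byte varints (code and size both below 0x80).
function decodeMultihash (b58) {
  const bytes = base58Decode(b58)
  return { code: bytes[0], size: bytes[1], digest: bytes.subarray(2) }
}

const { code, size, digest } = decodeMultihash('QmaozNR7DZHQK1ZcU9p7QdrshMvXqWK6gpu5rmrkPdT3L4')
const expected = crypto.createHash('sha256').update('hello world').digest()

console.log(code.toString(16), size.toString(16)) // 12 20
console.log(digest.equals(expected)) // whether the digest matches sha2-256("hello world")
```

The 0x12 0x20 prefix is also why every sha2-256 multihash in base58 starts with "Qm".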

There's a CLI you can use to generate multihashes:

    $ go get github.com/multiformats/go-multihash/multihash
    $ echo -n "hello world" | multihash -a sha2-256
    QmaozNR7DZHQK1ZcU9p7QdrshMvXqWK6gpu5rmrkPdT3L4

As @David stated, a file in IPFS is "transformed" into a UnixFS "file", which is a representation of files in a DAG. So when you use ipfs add to upload a file to IPFS, the data is wrapped in metadata, which gives you a different result when you multihash it.

For example:

    $ echo -n "hello world" | ipfs add -Q
    Qmf412jQZiuVUtdgnB36FXFX7xg5V6KEbSJ4dpQuhkLyfD

Here's an example in Node.js of how to generate the exact same multihash as ipfs add:

    const Unixfs = require('ipfs-unixfs')
    const {DAGNode} = require('ipld-dag-pb')
    
    const data = Buffer.from('hello world', 'ascii')
    const unixFs = new Unixfs('file', data)
    
    DAGNode.create(unixFs.marshal(), (err, dagNode) => {
      if (err) return console.error(err)
    
      console.log(dagNode.toJSON().multihash) // Qmf412jQZiuVUtdgnB36FXFX7xg5V6KEbSJ4dpQuhkLyfD
    })
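
To see what that metadata wrapper looks like at the byte level, here is a dependency-free sketch that hand-encodes the two protobuf layers for a small file and then multihashes the result. It rests on several assumptions: the UnixFS Data message uses field 1 for Type (2 = File), field 2 for Data and field 3 for filesize; the dag-pb node uses field 1 for Data; and ipfs add runs with its defaults for a single-chunk file. `wrapSmallFile` and `base58Encode` are hypothetical helper names.

```javascript
const crypto = require('crypto')

const ALPHABET = '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz'

// Minimal base58 encoder using BigInt (hypothetical helper).
function base58Encode (bytes) {
  let n = bytes.length ? BigInt('0x' + bytes.toString('hex')) : 0n
  let out = ''
  while (n > 0n) {
    out = ALPHABET[Number(n % 58n)] + out
    n /= 58n
  }
  // Each leading zero byte is written as '1'
  for (let i = 0; i < bytes.length && bytes[i] === 0; i++) out = '1' + out
  return out
}

// Hand-encode the UnixFS and dag-pb wrappers for a tiny single-chunk file.
// Every length must fit in a one-byte varint, so the data is capped at 121 bytes.
function wrapSmallFile (data) {
  if (data.length + 6 > 127) throw new RangeError('sketch handles at most 121 bytes')
  const unixfs = Buffer.concat([
    Buffer.from([0x08, 0x02]),        // field 1 (Type), varint: 2 = File
    Buffer.from([0x12, data.length]), // field 2 (Data), length-delimited
    data,
    Buffer.from([0x18, data.length])  // field 3 (filesize), varint
  ])
  // dag-pb node without links: field 1 (Data), length-delimited
  return Buffer.concat([Buffer.from([0x0a, unixfs.length]), unixfs])
}

const node = wrapSmallFile(Buffer.from('hello world'))
const digest = crypto.createHash('sha256').update(node).digest()
const multihash = base58Encode(Buffer.concat([Buffer.from([0x12, 0x20]), digest]))

console.log(node.toString('hex')) // 0a110802120b68656c6c6f20776f726c64180b
console.log(multihash) // compare with the `ipfs add -Q` output above
```

If the assumptions hold, the printed multihash should line up with what ipfs add reports for the same input. Larger files need real varints and links between chunks, which is what the libraries above take care of.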
