Using map/reduce for mapping the properties in a collection

With Gates VP's and Kristina's answers as inspiration, I created an open source tool called Variety which does exactly this: https://github.com/variety/variety

Hopefully you'll find it to be useful. Let me know if you have questions, or any issues using it.


I solved problem #2 from Gates's answer, where keys such as data.0, data.1, and data.2 were returned. Even though these are valid keys, as that answer points out, I wanted to get rid of them for presentation purposes. I solved it with a quick edit to the m_sub function, shown below.

const m_sub = function (base, value) {
    for (var key in value) {
        // Skip _id and numeric keys such as array indexes ("0", "1", ...).
        if (key != "_id" && isNaN(key)) {
            emit(base + "." + key, null);
            if (isArray(value[key]) || typeof value[key] == 'object') {
                m_sub(base + "." + key, value[key]);
            }
        }
    }
};

This version also includes Gates's fix for problem #1. The only change I made is in the first if statement, where I changed this:

if(key != "_id")

To this using the isNaN(x) function:

if(key != "_id" && isNaN(key))
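One caveat worth checking: isNaN(key) converts the key to a number before testing it, so some keys that are not array indexes also count as numeric and get skipped. The following is a minimal sketch that runs in plain Node.js without MongoDB. The sample keys are made up, and isArrayIndex is a hypothetical stricter helper, not part of the answer above.

```javascript
// isNaN() coerces its argument to a number first, so a key is treated as
// "numeric" whenever Number(key) is not NaN.
const keys = ["0", "12", "foo", "", "1e3", "0x10", "3rd"];

// Hypothetical stricter check: accept only plain decimal digit strings.
const isArrayIndex = function (key) {
  return /^\d+$/.test(key);
};

const report = keys.map(function (key) {
  return {
    key: key,
    skippedByIsNaN: !isNaN(key),      // true means the filter above drops it
    looksLikeIndex: isArrayIndex(key) // true only for "0", "12", ...
  };
});

console.log(report);
```

If your documents can contain keys like the empty string or "1e3", a digits-only test along these lines may be the safer filter.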

Hope this helps someone, and if there is a problem with this solution please give feedback!


OK, this is a little more complex because you'll need to use some recursion.

To make the recursion happen, you'll need to be able to store some functions on the server.

Step 1: define some functions and put them server-side

isArray = function (v) {
  return v && typeof v === 'object' && typeof v.length === 'number' && !(v.propertyIsEnumerable('length'));
}

m_sub = function(base, value){
  for(var key in value) {
    emit(base + "." + key, null);
    if( isArray(value[key]) || typeof value[key] == 'object'){
      m_sub(base + "." + key, value[key]);
    }
  }
}

db.system.js.save( { _id : "isArray", value : isArray } );
db.system.js.save( { _id : "m_sub", value : m_sub } );

Step 2: define the map and reduce functions

map = function(){
  for(var key in this) {
    emit(key, null);
    if( isArray(this[key]) || typeof this[key] == 'object'){
      m_sub(key, this[key]);
    }
  }
}

reduce = function(key, stuff){ return null; }

Step 3: run the map reduce and look at results

mr = db.runCommand({"mapreduce" : "things", "map" : map, "reduce" : reduce,"out": "things" + "_keys"});
db[mr.result].distinct("_id");

The results you'll get are:

["_id", "_id.isObjectId", "_id.str", "_id.tojson", "egg", "egg.0", "foo", "foo.bar", "foo.bar.baaaar", "hello", "type", "type.0", "type.1"]
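If you want to experiment with the traversal without a MongoDB server, here is a rough sketch that runs in plain Node.js. It assumes a stand-in emit that just records keys in a Set, a mapDocument function that takes the document as an argument instead of reading `this`, and a made-up sample document with no _id (ObjectId only exists in the shell). None of those three are part of the steps above.

```javascript
// Stand-in for MongoDB's emit(): record each emitted key once.
const emitted = new Set();
const emit = function (key, value) {
  emitted.add(key);
};

const isArray = function (v) {
  return Boolean(v) && typeof v === "object" && typeof v.length === "number" &&
    !v.propertyIsEnumerable("length");
};

// Recursively emit "base.key" for every key of a nested object or array.
const m_sub = function (base, value) {
  for (const key in value) {
    emit(base + "." + key, null);
    if (isArray(value[key]) || typeof value[key] === "object") {
      m_sub(base + "." + key, value[key]);
    }
  }
};

// Same body as the map function, with `this` replaced by a parameter.
const mapDocument = function (doc) {
  for (const key in doc) {
    emit(key, null);
    if (isArray(doc[key]) || typeof doc[key] === "object") {
      m_sub(key, doc[key]);
    }
  }
};

// Made-up sample document.
mapDocument({ foo: { bar: { baaaar: 1 } }, egg: [1], type: ["a", "b"], hello: "world" });

const keys = Array.from(emitted).sort();
console.log(keys);
```

The Set plays the role of the reduce step plus distinct("_id"): each key ends up in the output once, however many documents emit it.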

There are two obvious problems here; we're picking up some unexpected fields:

1. the _id data (_id.isObjectId, _id.str, _id.tojson)
2. the numeric array indexes (egg.0, type.0, type.1)

Step 4: Some possible fixes

For problem #1 the fix is relatively easy. Just modify the map function. Change this:

emit(base + "." + key, null); if( isArray...

to this:

if(key != "_id") { emit(base + "." + key, null); if( isArray... }

Problem #2 is a little more dicey. You wanted all keys, and technically "egg.0" is a valid key. You can modify m_sub to ignore such numeric keys, but it's also easy to see a situation where this backfires. Say you have an associative array inside of a regular array; then you want that "0" to appear. I'll leave the rest of that solution up to you.
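To make that trade-off concrete, here is a minimal sketch that runs in plain Node.js. collectKeys, its "skip" and "collapse" modes, and the sample document are all hypothetical names made up for illustration. "skip" mimics the effect of ignoring numeric keys entirely, while "collapse" is one possible alternative that drops the index from the path but still descends into the array element.

```javascript
// Hypothetical helper: walk a value and collect dotted key paths into `out`.
//   mode "skip":     ignore array indexes entirely (and everything below them)
//   mode "collapse": leave the index out of the path but keep descending
const collectKeys = function (value, base, mode, out) {
  for (const key in value) {
    const child = value[key];
    const isIndex = Array.isArray(value) && /^\d+$/.test(key);
    if (isIndex && mode === "skip") {
      continue;
    }
    // For an array index in "collapse" mode, the path stays at the array's own path.
    const path = isIndex ? base : (base ? base + "." + key : key);
    if (!isIndex) {
      out.add(path);
    }
    if (child !== null && typeof child === "object") {
      collectKeys(child, path, mode, out);
    }
  }
  return out;
};

// Made-up document: objects ("associative arrays") inside a regular array.
const doc = { items: [{ name: "a" }, { price: 2 }], tags: ["x"] };

const skipped = Array.from(collectKeys(doc, "", "skip", new Set())).sort();
const collapsed = Array.from(collectKeys(doc, "", "collapse", new Set())).sort();

console.log(skipped);   // keys nested inside array elements are lost
console.log(collapsed); // nested keys are kept, without the index in the path
```

Whether "items.name" or "items.0.name" is the right thing to report depends on how you plan to use the key list, so treat this as a starting point rather than a fix.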