Using map/reduce for mapping the properties in a collection
With Gates VP's and Kristina's answers as inspiration, I created an open source tool called Variety which does exactly this: https://github.com/variety/variety
Hopefully you'll find it to be useful. Let me know if you have questions, or any issues using it.
I solved problem #2 as stated by Gates, where keys such as data.0, data.1 and data.2 were returned. Even though these are valid keys, as Gates points out, I wanted to get rid of them for presentation purposes. A quick edit to the m_sub function, shown below, takes care of it.
const m_sub = function (base, value) {
    for (var key in value) {
        // Skip _id and numeric keys such as array indices ("0", "1", ...).
        if (key != "_id" && isNaN(key)) {
            emit(base + "." + key, null);
            if (isArray(value[key]) || typeof value[key] == 'object') {
                m_sub(base + "." + key, value[key]);
            }
        }
    }
};
This version also includes Gates' fix for problem #1. The only other change is in the first if-statement, where I changed this:
if(key != "_id")
To this using the isNaN(x) function:
if(key != "_id" && isNaN(key))
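To see which keys the isNaN(key) check filters out, here is a small standalone sketch that runs in Node or any JavaScript shell, no MongoDB needed. The helper name isDroppedKey and the sample keys are made up for illustration.

```javascript
// Hypothetical helper: true when a key would be skipped by the
// `isNaN(key)` part of the condition in m_sub.
function isDroppedKey(key) {
  return !isNaN(key);
}

// Sample property names (invented for this example).
const keys = ["0", "1", "foo", "2abc", "2010", ""];

const dropped = keys.filter(isDroppedKey);
console.log(dropped);
```

Note that isNaN coerces its argument to a number, so a real field that happens to be named "2010" (or an empty-string key) is dropped together with the array indices. If your documents use such field names, this filter will hide them.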
Hope this helps someone, and if there is a problem with this solution please give feedback!
OK, this is a little more complex because you'll need to use some recursion.
To make the recursion happen, you'll need to be able to store some functions on the server.
Step 1: define some functions and put them server-side
isArray = function (v) {
    return v && typeof v === 'object' && typeof v.length === 'number' && !(v.propertyIsEnumerable('length'));
}
m_sub = function(base, value){
    for(var key in value) {
        emit(base + "." + key, null);
        if( isArray(value[key]) || typeof value[key] == 'object'){
            m_sub(base + "." + key, value[key]);
        }
    }
}
db.system.js.save( { _id : "isArray", value : isArray } );
db.system.js.save( { _id : "m_sub", value : m_sub } );
Step 2: define the map and reduce functions
map = function(){
    for(var key in this) {
        emit(key, null);
        if( isArray(this[key]) || typeof this[key] == 'object'){
            m_sub(key, this[key]);
        }
    }
}
reduce = function(key, stuff){ return null; }
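If you want to try the map function and the m_sub helper before storing anything on the server, they can be run outside MongoDB with a stand-in for emit. This is only a sketch: the emitted array, the local emit function and the sample document are assumptions made for the example, and map is invoked with call so that this refers to the document, as it would inside a real map-reduce job.

```javascript
// Stand-in for MongoDB's emit(): just records each emitted key.
const emitted = [];
function emit(key, value) {
  emitted.push(key);
}

function isArray(v) {
  return v && typeof v === 'object' && typeof v.length === 'number' &&
    !(v.propertyIsEnumerable('length'));
}

// Recursively emit "base.key" for every nested property.
function m_sub(base, value) {
  for (var key in value) {
    emit(base + "." + key, null);
    if (isArray(value[key]) || typeof value[key] == 'object') {
      m_sub(base + "." + key, value[key]);
    }
  }
}

// Same body as the map function; `this` is the current document.
function map() {
  for (var key in this) {
    emit(key, null);
    if (isArray(this[key]) || typeof this[key] == 'object') {
      m_sub(key, this[key]);
    }
  }
}

// Sample document (invented for this example).
const doc = { foo: { bar: { baaaar: true } }, egg: [1] };
map.call(doc);
console.log(emitted);
// prints [ 'foo', 'foo.bar', 'foo.bar.baaaar', 'egg', 'egg.0' ]
```

The "egg.0" entry already shows up here, because for...in walks array indices just like object keys.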
Step 3: run the map reduce and look at results
mr = db.runCommand({"mapreduce" : "things", "map" : map, "reduce" : reduce,"out": "things" + "_keys"});
db[mr.result].distinct("_id");
The results you'll get are:
["_id", "_id.isObjectId", "_id.str", "_id.tojson", "egg", "egg.0", "foo", "foo.bar", "foo.bar.baaaar", "hello", "type", "type.0", "type.1"]
There are two obvious problems here, since we're adding some unexpected fields: 1. the _id data (_id.isObjectId, _id.str, _id.tojson) 2. the .0 entries (on egg and type)
Step 4: Some possible fixes
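For problem #1 the fix is relatively easy. Just wrap the emit in a check that skips "_id". In m_sub, change this:
emit(base + "." + key, null); if( isArray...
to this:
if(key != "_id") { emit(base + "." + key, null); if( isArray... }
Apply the same guard around emit(key, null) in the map function, since that is where the top-level _id is visited and recursed into.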
Problem #2 is a little more dicey. You wanted all keys, and technically "egg.0" is a valid key. You can modify m_sub to ignore such numeric keys, but it's easy to see a situation where this backfires: say you have an associative array inside of a regular array; then you want that "0" to appear. I'll leave the rest of that solution up to you.
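One possible way to handle that case, offered as a sketch rather than a definitive fix: do not emit array indices and do not append them to the path, but still descend into the array elements. Skipping numeric keys outright also skips the recursion, so for a document like { data: [ { name: "a" } ] } only "data" would be reported. The function name m_sub_flat, the Set used to collect keys, and the sample document are all made up for this example, and it assumes Array.isArray is available in your JavaScript engine.

```javascript
// Stand-in for MongoDB's emit(); a Set removes duplicate keys here
// (in a real map-reduce job, identical keys are grouped for you).
const emitted = new Set();
function emit(key, value) {
  emitted.add(key);
}

// Hypothetical variant of m_sub: array indices are neither emitted nor
// added to the path, but the elements themselves are still traversed.
function m_sub_flat(base, value) {
  const parentIsArray = Array.isArray(value);
  for (const key in value) {
    const child = value[key];
    const path = parentIsArray ? base : base + "." + key;
    if (!parentIsArray) {
      emit(path, null);
    }
    if (child !== null && typeof child === 'object') {
      m_sub_flat(path, child);
    }
  }
}

// Sample document (invented): objects nested inside a regular array.
const doc = { data: [{ name: "a" }, { name: "b", size: 2 }] };
for (const key in doc) {
  emit(key, null);
  if (doc[key] !== null && typeof doc[key] === 'object') {
    m_sub_flat(key, doc[key]);
  }
}
console.log([...emitted]);
// prints [ 'data', 'data.name', 'data.size' ]
```

Because the check looks at whether the parent is an array instead of whether the key is numeric, an object field that happens to be named "0" or "2010" is still reported.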