How can I flatten double arrays in mongoDB?
You need perform an aggregation operation with two unwind stages and a single group stage. The basic rule being you unwind as many times as the level of nest depth. Here the level of nesting is 2, so we unwind two times.
collection.aggregate([
{$unwind => "$Countries"},
{$unwind => "$Countries"},
{$group => {"_id":"$_id","Countries":{$push => "$Countries"}}}
])
The first $unwind
stage produces the result:
{
"_id" : ObjectId("54a32e0fc2eaf05fc77a5ea4"),
"Countries" : [
"Spain",
"France"
]
}
{
"_id" : ObjectId("54a32e4ec2eaf05fc77a5ea5"),
"Countries" : [
"Spain"
]
}
{
"_id" : ObjectId("54a32e4ec2eaf05fc77a5ea5"),
"Countries" : [
"Russia",
"Egypt"
]
}
The second $unwind
stage further flattens the Countries
array:
{ "_id" : ObjectId("54a32e0fc2eaf05fc77a5ea4"), "Countries" : "Spain" }
{ "_id" : ObjectId("54a32e0fc2eaf05fc77a5ea4"), "Countries" : "France" }
{ "_id" : ObjectId("54a32e4ec2eaf05fc77a5ea5"), "Countries" : "Spain" }
{ "_id" : ObjectId("54a32e4ec2eaf05fc77a5ea5"), "Countries" : "Russia" }
{ "_id" : ObjectId("54a32e4ec2eaf05fc77a5ea5"), "Countries" : "Egypt" }
Now the final $group
stage groups the records based on the _id
,and accumulates the country names in a single array.
{
"_id" : ObjectId("54a32e4ec2eaf05fc77a5ea5"),
"Countries" : [
"Spain",
"Russia",
"Egypt"
]
}
{
"_id" : ObjectId("54a32e0fc2eaf05fc77a5ea4"),
"Countries" : [
"Spain",
"France"
]
}
If you wish to keep other fields in the document then you need to explicitly specify the names of the fields other than the country field,(field1,field2,etc..), using the $first
operator. You can write/overwrite a collection by specifying the name of the collection in the $out
stage.
collection.aggregate([
{$unwind => "$Countries"},
{$unwind => "$Countries"},
{$group => {"_id":"$_id","Countries":{$push => "$Countries"},
"field1":{$first => "$field1"}}},
{$out => "collection"}
])
You need to explicitly specify the fields so that you don't get a redundant Countries
field.
You can use the $$ROOT
system variable to store the entire document, but that would make the Countries
field redundant.One outside the doc
and one inside the doc
.
collection.aggregate([
{$unwind => "$Countries"},
{$unwind => "$Countries"},
{$group => {"_id":"$_id","Countries":{$push => "$Countries"},
"doc":{$first => "$$ROOT"}}},
{$out => "collection"}
])
In Mongo 3.4+ you can use $reduce
to flatten 2d arrays.
db.collection.aggregate(
[
{
$project: {
"countries": {
$reduce: {
input: '$Countries',
initialValue: [],
in: {$concatArrays: ['$$value', '$$this']}
}
}
}
}
]
)
Docs: https://docs.mongodb.com/manual/reference/operator/aggregation/reduce/
Try this:
db.test2.aggregate([
{"$unwind" : "$Countries"},
{"$unwind" : "$Countries"},
{$group : { _id : '$_id', Countries: { $addToSet: "$Countries" }}},
]).result
Your data for Countries are not in a good format, so you may consider to convert them. This is a script to flatten the array in Countries field and save it the origin documents that you can run in a mongo shell:
function flattenArray(inArr) {
var ret = [];
inArr.forEach(function(arr) {
if (arr.constructor.toString().indexOf("Array") > -1) {
ret = ret.concat(flattenArray(arr));
} else {
ret.push(arr);
}
});
return ret;
}
db.collection.find({
'Countries': {
'$exists': true
}
}).forEach(function(doc){
doc.Countries = flattenArray(doc.Countries);
db.collection.save(doc);
});