Does MongoDB's $in clause guarantee order
As noted, the order of the values in the array passed to $in has no effect on the order in which the documents are returned. Results come back in natural order, or in the order of whichever index is selected, as shown.
If you need to preserve this order, then you basically have two options.
So let's say that you were matching on the values of _id in your documents with an array that is going to be passed in to the $in as [ 4, 2, 8 ].
Approach using Aggregate
var list = [ 4, 2, 8 ];
db.collection.aggregate([
    // Match the selected documents by "_id"
    { "$match": {
        "_id": { "$in": [ 4, 2, 8 ] }
    }},
    // Project a "weight" to each document
    { "$project": {
        "weight": { "$cond": [
            { "$eq": [ "$_id", 4 ] },
            1,
            { "$cond": [
                { "$eq": [ "$_id", 2 ] },
                2,
                3
            ]}
        ]}
    }},
    // Sort the results
    { "$sort": { "weight": 1 } }
])
So that would be the expanded form. What basically happens here is that just as the array of values is passed to $in, you also construct a "nested" $cond statement to test the values and assign an appropriate weight. As that "weight" value reflects the order of the elements in the array, you can then pass that value to a sort stage in order to get your results in the required order. Note that $project only returns the fields you name, so list any other fields you need alongside "weight".
Of course you actually "build" the pipeline statement in code, much like this:
var list = [ 4, 2, 8 ];
var stack = [];
// Work backwards so that the innermost "$cond" is built first
for (var i = list.length - 1; i > 0; i--) {
    var rec = {
        "$cond": [
            { "$eq": [ "$_id", list[i-1] ] },
            i
        ]
    };
    if ( stack.length == 0 ) {
        // Innermost condition: the "else" value is the weight of the last element
        rec["$cond"].push( i+1 );
    } else {
        // Otherwise nest the previously built condition as the "else" branch
        var lval = stack.pop();
        rec["$cond"].push( lval );
    }
    stack.push( rec );
}
var pipeline = [
    { "$match": { "_id": { "$in": list } }},
    { "$project": { "weight": stack[0] }},
    { "$sort": { "weight": 1 } }
];
db.collection.aggregate( pipeline );
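The loop above leaves the stack empty when the list has only one element. As a rough sketch of one way around that, the same idea can be wrapped in a helper that tests every value explicitly and gives any other document the final weight. The names buildWeightExpr and buildOrderedPipeline are made up for this example, and the nesting differs slightly from the expanded form above because the last value gets its own $cond.

```javascript
// Hypothetical helper: builds the nested "$cond" weight expression
// for an ordered list of values. Names are illustrative only.
function buildWeightExpr(field, list) {
    // Fallback weight for a document that equals none of the values
    var expr = list.length + 1;
    // Wrap from the last value outwards so the first value is tested first
    for (var i = list.length - 1; i >= 0; i--) {
        expr = { "$cond": [ { "$eq": [ "$" + field, list[i] ] }, i + 1, expr ] };
    }
    return expr;
}

// Hypothetical helper: assembles the match, project and sort stages
function buildOrderedPipeline(field, list) {
    var match = {};
    match[field] = { "$in": list };
    return [
        { "$match": match },
        { "$project": { "weight": buildWeightExpr(field, list) } },
        { "$sort": { "weight": 1 } }
    ];
}

var pipeline = buildOrderedPipeline("_id", [ 4, 2, 8 ]);
```

The result can be passed to db.collection.aggregate( pipeline ) in the same way as before.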
Approach using mapReduce
Of course if that all seems too hefty for your sensibilities then you can do the same thing using mapReduce, which looks simpler but will likely run somewhat slower.
var list = [ 4, 2, 8 ];
db.collection.mapReduce(
    function () {
        // Emit the position of this "_id" in the input list as the key
        var order = inputs.indexOf(this._id);
        emit( order, { doc: this } );
    },
    // Each key is emitted only once, so the reducer is never called
    function() {},
    {
        "out": { "inline": 1 },
        "query": { "_id": { "$in": list } },
        "scope": { "inputs": list } ,
        "finalize": function (key, value) {
            return value.doc;
        }
    }
)
And that basically relies on the emitted "key" values being in the "index order" of how they occur in the input array.
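To make that reliance on key order concrete, here is a small in-memory stand-in for the map, sort-by-key and finalize steps. It is only a sketch that runs in plain JavaScript without a server: simulateOrderedMapReduce and the sample docs are invented for illustration, and it returns the bare documents, not the key/value wrappers a real inline result would contain.

```javascript
// In-memory imitation of the mapReduce flow above (illustrative names only)
function simulateOrderedMapReduce(docs, inputs) {
    var emitted = [];
    docs.forEach(function (doc) {
        // map: the key is the position of "_id" in the input list
        emitted.push({ key: inputs.indexOf(doc._id), value: { doc: doc } });
    });
    // Stand-in for the results coming back ordered by key
    emitted.sort(function (a, b) { return a.key - b.key; });
    // finalize: unwrap the original document
    return emitted.map(function (e) { return e.value.doc; });
}

var docs = [ { _id: 2, n: "b" }, { _id: 8, n: "c" }, { _id: 4, n: "a" } ];
var ordered = simulateOrderedMapReduce(docs, [ 4, 2, 8 ]);
// The "_id" values in "ordered" are 4, 2, 8
```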
So those essentially are your ways of maintaining the order of an input list to an $in condition where you already have that list in a determined order.
Another way, using an aggregation query, is only applicable for MongoDB version >= 3.4 -
The credit goes to this nice blog post.
Example documents to be fetched in this order -
var order = [ "David", "Charlie", "Tess" ];
The query -
var query = [
{$match: {name: {$in: order}}},
{$addFields: {"__order": {$indexOfArray: [order, "$name" ]}}},
{$sort: {"__order": 1}}
];
var result = db.users.aggregate(query);
Another quote from the post explaining these aggregation operators used -
The "$addFields" stage is new in 3.4 and it allows you to "$project" new fields to existing documents without knowing all the other existing fields. The new "$indexOfArray" expression returns position of particular element in a given array.
Basically, the $addFields stage appends a new __order field to every document it processes, and this field holds the position of that document's name in the array we provided. Then we simply sort the documents based on this field.
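If you need this for more than one field, the three stages can be generated by a small function. This is a sketch under assumptions: orderedInPipeline is a made-up name, and the trailing $project stage, which strips the helper field from the output again, is an optional extra that is not part of the blog post's query.

```javascript
// Hypothetical helper: builds the $match / $addFields / $sort stages
// for any field and ordered list of values.
function orderedInPipeline(field, order) {
    var match = {};
    match[field] = { $in: order };
    return [
        { $match: match },
        // Position of the document's value within "order"
        { $addFields: { "__order": { $indexOfArray: [ order, "$" + field ] } } },
        { $sort: { "__order": 1 } },
        // Optional: drop the helper field from the returned documents
        { $project: { "__order": 0 } }
    ];
}

var query = orderedInPipeline("name", [ "David", "Charlie", "Tess" ]);
```

As before, the pipeline would be run with db.users.aggregate(query).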
If you don't want to use aggregate, another solution is to use find and then sort the doc results client-side using Array#sort:
If the $in values are primitive types like numbers you can use an approach like:
var ids = [4, 2, 8, 1, 9, 3, 5, 6];
MyModel.find({ _id: { $in: ids } }).exec(function(err, docs) {
    docs.sort(function(a, b) {
        // Sort docs by the order of their _id values in ids.
        return ids.indexOf(a._id) - ids.indexOf(b._id);
    });
});
If the $in values are non-primitive types like ObjectIds, another approach is required as indexOf compares by reference in that case.
If you're using Node.js 4.x+, you can use Array#findIndex and ObjectID#equals to handle this by changing the sort function to:
docs.sort((a, b) => ids.findIndex(id => a._id.equals(id)) -
ids.findIndex(id => b._id.equals(id)));
Or with any Node.js version, with underscore/lodash's findIndex:
docs.sort(function (a, b) {
    return _.findIndex(ids, function (id) { return a._id.equals(id); }) -
           _.findIndex(ids, function (id) { return b._id.equals(id); });
});
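Each of the sort functions above searches ids again on every comparison, which can get slow for long lists. One possible alternative, sketched here, is to build a lookup of positions once and sort against that. sortByIdOrder is a made-up name, and the sketch assumes that String(id) gives a distinct, stable key for each value (as an ObjectId's hex string does) and that every document's _id is present in ids. It would not distinguish the number 4 from the string "4".

```javascript
// Hypothetical helper: returns a copy of docs sorted by the position
// of each _id in ids, using a lookup table built once up front.
function sortByIdOrder(docs, ids) {
    var position = new Map();
    ids.forEach(function (id, i) {
        var key = String(id);
        // Keep the first position if a value appears more than once
        if (!position.has(key)) {
            position.set(key, i);
        }
    });
    // slice() so that the caller's array is left untouched
    return docs.slice().sort(function (a, b) {
        return position.get(String(a._id)) - position.get(String(b._id));
    });
}

var sorted = sortByIdOrder([ { _id: 8 }, { _id: 4 }, { _id: 2 } ], [ 4, 2, 8 ]);
// The "_id" values in "sorted" are 4, 2, 8
```

Inside the find callback this would be called as sortByIdOrder(docs, ids).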