mongodb mongoTemplate get distinct field with some criteria

As of Spring Data MongoDB 2.2.0, MongoTemplate provides a method to retrieve the distinct values of a field for documents matching some criteria:

Criteria criteria = new Criteria("country").is("IN");
Query query = new Query();
query.addCriteria(criteria);
return mongoTemplate.findDistinct(query,"city",Address.class,String.class);

This finds all the distinct cities in the address collection where country is IN.


For one thing, the .getCollection() method returns the basic driver collection object, like so:

DBCollection collection = mongoTemplate.getCollection("collectionName");

So the type of the query object might be different from what you are using, but there are also some other things. Namely, .distinct() only returns the "distinct" values of the key that you asked for, and does not return other fields of the document. So you could do:

// Criteria.where() is static, so calling it on an instance discards the result
Criteria criteria = Criteria.where("dataset").is("d1");
Query query = new Query();
query.addCriteria(criteria);
List list = mongoTemplate.getCollection("collectionName")
    .distinct("source", query.getQueryObject());

But that is only going to return "sample" as a single element in the list, for instance, without any of the other fields.

If you want the other "fields" from a distinct set, then use the .aggregate() method instead, with either the "first" occurrences of the other field values for the distinct key:

    DBCollection collection = mongoTemplate.getCollection("collectionName");

    // One result per distinct "source", keeping the first name/description seen
    List<DBObject> pipeline = Arrays.<DBObject>asList(
        new BasicDBObject("$match", new BasicDBObject("dataset", "d1")),
        new BasicDBObject("$group",
            new BasicDBObject("_id", "$source")
                .append("name", new BasicDBObject("$first", "$name"))
                .append("description", new BasicDBObject("$first", "$description"))
        )
    );

    AggregationOutput output = collection.aggregate(pipeline);

Or the actual "distinct" values of multiple fields, by making them all part of the grouping key:

    DBCollection collection = mongoTemplate.getCollection("collectionName");

    // One result per distinct (source, name, description) combination
    List<DBObject> pipeline = Arrays.<DBObject>asList(
        new BasicDBObject("$match", new BasicDBObject("dataset", "d1")),
        new BasicDBObject("$group",
            new BasicDBObject("_id",
                new BasicDBObject("source", "$source")
                    .append("name", "$name")
                    .append("description", "$description")
            )
        )
    );

    AggregationOutput output = collection.aggregate(pipeline);
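
To make the difference between those two pipelines concrete, here is a plain-Java sketch that only mimics their grouping semantics on an in-memory list. It does not touch MongoDB at all. The class name, the helper methods and the sample rows are made up for illustration, and list order stands in for whatever order the server would read documents in:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;

public class DistinctSketch {
    // Each row is a made-up document laid out as [source, name, description].

    // Mimics $group with _id "$source" plus $first on the other fields:
    // one row per distinct source, keeping the first row seen for it.
    static List<List<String>> firstPerSource(List<List<String>> docs) {
        Map<String, List<String>> bySource = new LinkedHashMap<>();
        for (List<String> doc : docs) {
            bySource.putIfAbsent(doc.get(0), doc);
        }
        return new ArrayList<>(bySource.values());
    }

    // Mimics $group with a compound _id of all three fields:
    // one row per distinct (source, name, description) combination.
    static List<List<String>> distinctCombinations(List<List<String>> docs) {
        return new ArrayList<>(new LinkedHashSet<>(docs));
    }

    public static void main(String[] args) {
        List<List<String>> docs = Arrays.asList(
            Arrays.asList("sample", "a", "first"),
            Arrays.asList("sample", "b", "second"),
            Arrays.asList("sample", "a", "first"));

        // prints [[sample, a, first]]
        System.out.println(firstPerSource(docs));
        // prints [[sample, a, first], [sample, b, second]]
        System.out.println(distinctCombinations(docs));
    }
}
```

So the $first variant collapses everything to one row per source, while the compound key keeps every distinct combination of the three fields.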

There is also a direct .aggregate() method on mongoTemplate instances, which comes with a number of helper methods to build pipelines. But this should point you in the right direction at least.
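
For reference, a rough sketch of the "first occurrence" pipeline written with those helpers might look like the following. It assumes Spring Data MongoDB is on the classpath. The SourceSummaryQuery and SourceSummary classes are hypothetical names introduced here, and "collectionName" and "d1" are just the placeholders from above, so treat it as a starting point rather than the definitive way to do it:

```java
import static org.springframework.data.mongodb.core.aggregation.Aggregation.group;
import static org.springframework.data.mongodb.core.aggregation.Aggregation.match;
import static org.springframework.data.mongodb.core.aggregation.Aggregation.newAggregation;

import java.util.List;

import org.springframework.data.mongodb.core.MongoTemplate;
import org.springframework.data.mongodb.core.aggregation.Aggregation;
import org.springframework.data.mongodb.core.aggregation.AggregationResults;
import org.springframework.data.mongodb.core.query.Criteria;

public class SourceSummaryQuery {

    // Hypothetical result type: "id" receives the grouping key (_id),
    // the other fields receive the $first values.
    public static class SourceSummary {
        public String id;
        public String name;
        public String description;
    }

    // Builds the $match + $group pipeline; no database access happens here.
    static Aggregation firstPerSource(String dataset) {
        return newAggregation(
            match(Criteria.where("dataset").is(dataset)),
            group("source")
                .first("name").as("name")
                .first("description").as("description"));
    }

    // Runs the pipeline against the named collection and maps each result.
    static List<SourceSummary> run(MongoTemplate mongoTemplate) {
        AggregationResults<SourceSummary> results = mongoTemplate.aggregate(
            firstPerSource("d1"), "collectionName", SourceSummary.class);
        return results.getMappedResults();
    }
}
```

Mapping to a small result class keeps the snippet independent of whether your driver version hands back DBObject or Document instances.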