Django-ORM: distinct is needed. Why?

Raw MySQL query looks like this:

SELECT user.id, group.id FROM user LEFT JOIN group ON user.group_id = group.id

The result will contain all possible combinations of users and groups and I guess some users belong to more than one group.


You are trying to fetch all users from all groups, but a user can present in multiple groups that's why distinct is required. if you want users ina specific group instead of doing an all try a filter query.


I assume that User.groups is a ForeignKey or some other relationship that associates each User with zero to many Group instances.

So the query which confuses you:

User.objects.filter(groups__in=Group.objects.all())

That query can be described as:

  • Access the Group model manager (Group.objects).
  • Make a QuerySet:
    • Return all Group instances (Group.objects.all()).
  • Access the User model manager (User.objects).
  • Make a Queryset:
    • Join to the Group model, on the User.groups foreign key.
    • Return every (User + Group) row which has an associated Group.

That is not “all users which are in a group”; instead, it is “All user–group pairs where the group exists”.

By querying on each of the multiple-value User.groups field, you are implying that the query must contain a join from User to Group rows.


Instead, you want:

  • Access the User model manager (User.objects).
  • Make a QuerySet:
    • Return all rows which have groups not empty.
User.objects.filter(groups__isnull=False)

Note that this – “All users which have a non-empty set of associated groups” – is the inverse of another example query you have (“All users which are in no group”).