Django-ORM: distinct is needed. Why?
Raw MySQL query looks like this:
SELECT user.id, group.id FROM user LEFT JOIN group ON user.group_id = group.id
The result will contain all possible combinations of users and groups and I guess some users belong to more than one group.
You are trying to fetch all users from all groups, but a user can present in multiple groups that's why distinct is required. if you want users ina specific group instead of doing an all try a filter query.
I assume that User.groups
is a ForeignKey
or some other relationship that associates each User
with zero to many Group
instances.
So the query which confuses you:
User.objects.filter(groups__in=Group.objects.all())
That query can be described as:
- Access the
Group
model manager (Group.objects
). - Make a
QuerySet
:- Return all
Group
instances (Group.objects.all()
).
- Return all
- Access the
User
model manager (User.objects
). - Make a
Queryset
:- Join to the
Group
model, on theUser.groups
foreign key. - Return every (
User
+Group
) row which has an associatedGroup
.
- Join to the
That is not “all users which are in a group”; instead, it is “All user–group pairs where the group exists”.
By querying on each of the multiple-value User.groups
field, you are implying that the query must contain a join from User
to Group
rows.
Instead, you want:
- Access the
User
model manager (User.objects
). - Make a
QuerySet
:- Return all rows which have
groups
not empty.
- Return all rows which have
User.objects.filter(groups__isnull=False)
Note that this – “All users which have a non-empty set of associated groups” – is the inverse of another example query you have (“All users which are in no group”).