SOQL query selectivity changes with number of IN clause records
Further analysis done. This is a child object with total cardinality of 3.7 million records which is being filtered based on parent field.
Below are the observations:
- The query cost is directly proportional to the number of records retrieved from database for that particular type of field (parent lookup field in my case).
IN
operator is virtuallyOR
conditions in terms of cost.ParentId IN ('id1','id2')
is same asParentId='id1' OR ParentId='id2'
and so the cost simply adds up.- As observed above each record retrieving cost is
0.00000427332165921
which is constant for that parent field filter. It is always the multiple of this number to the total records being retrieved from database either by=
orIN
operator.
Now coming to your use case:
From your screenshot,
Cost of each child record is 0.0000483956831
for given Parent Opportunity field AB2__Opportunity__c
.
However, if I have two Opportunities the cost raises to 0.00508.
This makes sense as the number of child records for those 2 opportunities is 105
whereas the first opportunity has only 5
child records.
Considering this, I think it is better to check how many child records is being returned when you are using 200 opportunity ids. (It should be much more than 3 child records per opportunity)
Firstly, the answer provided by salesforcesas is correct. It's the accumulation of cardinality records for each opportunity Id in the IN clause that is causing the issue.
I'd assumed that each Opportunity has 3 related child records that match the SOQL filter. And this was correct, but...
I hadn't provided the full story within the question. The actual SOQL query looks more like this:
select Id, Name, SomeField__c
from OppChild__c
where
Opportunity__c in :triggerOpportunityIds and ObjectType__c = 'Order'
It's that and ObjectType__c = 'Order'
that is the killer here. When I run the actual SOQL query there are indeed 3 records per Opportunity. What I wasn't taking into account are the records where ObjectType__c != 'Order'
and the lack of indexing for that field.
There can be hundreds or even thousands of records with an object type other than Order for each opportunity. Doh - Data Skew! Of course the cardinality count will include those records in the index on Opportunity ID alone. It even explains the odd increases in relative cost and cardinality I was seeing in the graphs I produced. There was a widely varying number of related records that depended on the Opportunity ID's being used.
To make matters worse, the ObjectType__c field is a formula to a related object field, so it isn't available for custom indexing.
I'll need to revisit this custom object so that the ObjectType__c can be included in the index.