Query CosmosDb - where array contains item(s) from array

Per my experience , expr in ARRAY_CONTAINS (arr_expr, expr [, bool_expr]) method is not supported list arguments.

According to your situation , I suggest you use UDF in Cosmos DB.

I created 3 sample documents as your description.

[
  {
    "id": "1",
    "zip": [
      {
        "code": "1111"
      },
      {
        "code": "2222"
      }
    ]
  },
  {
    "id": "2",
    "zip": [
      {
        "code": "2222"
      },
      {
        "code": "3333"
      }
    ]
  },
  {
    "id": "3",
    "zip": [
      {
        "code": "4444"
      },
      {
        "code": "1111"
      },
      {
        "code": "2222"
      }
    ]
  }
]

Please refer to the snippet of UDF code as below :

function test(zipcode){
    var arrayList = ["1111","2222"]
    var ret = false ;
    for(var i=0 ;i <zipcode.length;i++){
        if(arrayList.indexOf(zipcode[i].code)){
            ret= true;
        }else{
            ret = false;
            break;
        }
    }
    return ret;
}

You could select zip array (select c.zip from c) ,then loop the results and invoke the UDF above in your code with the zip[i] arguments.

Hope it helps you.


Just for summary:

Use the IN operator from Cosmos DB SQL APIs to query entry which is included in the list condition.

Like

SELECT * FROM c WHERE c.ZipCodes[0].Code IN ("6500", "6700")

Or

SELECT DISTINCT c FROM c JOIN zc IN c.ZipCodes WHERE zc.Code IN ("2720", "2610")

You can do something like this: For each item in ZipCodes, you get a zip and compare with the array of codes you are checking. This, IMHO, is much better than using UDF.

{
  query: '
         SELECT DISTINCT value r
         FROM root r
         JOIN zip IN r.zipCodes
         WHERE ARRAY_CONTAINS(@zipIds, zip, true)
         ',
  parameters: [{name: "@zipIds", value: zipIds}]
}

The last param of ARRAY_CONTAINS tells the function to accept partial matches.


Apart from the fact that using UDF looks as the easier option, i would not use UDFs in your query's filter, since it compromises the performance of your query. I faced the same problem in my work environment, where things are designed to use UDFs to help in the queries, but the reality is that most of the times we are doing queries by using single values, and using UDF will actually result on the query not using the index. So in that case if you need to validate multiple values in the array, depending on the volume of values you need to validate, you can always write something like ARRAY_CONTAINS(c, 1) or ARRAY_CONTAINS(c, 2) or ....

It doesn't look so elegant solution, but will ensure that it will use the index and will do the best performance in your query.