How do scoring profiles generate scores in Azure Search?

Thanks for providing the details. What were the base relevance scores of the two documents?

The boost factor in the scoring profile is multiplied by the base relevance score, which is computed from term frequencies. For example, suppose the base scores of the two documents (given in @search.score in the response payload) were 0.5 and 0.2, and the values in the weight column were 0.5465 and 0.5419 respectively. With the scoring profile configuration given above (starting value 0, ending value 1, linear interpolation, and a boost factor of 1000), the final score for each document is computed as follows:

document 1: base_search_score (0.5) * boost_factor (1000) * (weight (0.5465) - min (0)) / (max (1) - min (0)) = final_search_score (273.25)

document 2: base_search_score (0.2) * boost_factor (1000) * (weight (0.5419) - min (0)) / (max (1) - min (0)) = final_search_score (108.38)
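
The same arithmetic can be written as a small Python sketch. The function name boosted_score and its parameter names are made up for illustration; the code only mirrors the formula in this answer and is not the service's actual implementation.

```python
def boosted_score(base_score, boost, value, range_start, range_end):
    """Linear magnitude boost as described in this answer (illustrative only)."""
    # Normalise the field value to 0..1 within the boosting range.
    normalised = (value - range_start) / (range_end - range_start)
    return base_score * boost * normalised


# The two documents from the example: range 0..1, boost factor 1000.
print(round(boosted_score(0.5, 1000, 0.5465, 0, 1), 2))
print(round(boosted_score(0.2, 1000, 0.5419, 0, 1), 2))
```

Running it should reproduce the two final scores quoted above.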

Please let me know if the final scores you get do not agree with the function above. Thanks!

Nate


The answer provided by Nate is difficult to understand and misses some components. I have made an overview of the entire scoring process, and it's quite complex.

When a user executes a search, the query is sent to Azure Search. Azure Search uses the TF-IDF algorithm to determine a score from 0 to 1 based on the tokens produced by the analyzer. Keep in mind that language-specific analyzers can produce multiple tokens for one word. A score is produced for every searchable field and then multiplied by that field's weight in the scoring profile. Finally, all weighted scores are summed, which gives the initial weighted score.

A scoring profile might also contain scoring functions. A scoring function can be a magnitude, freshness, geo (distance), or tag-based function. Multiple functions can be defined within one scoring profile.

The functions are evaluated, and their scores are combined by taking the sum, average, minimum, maximum, or first matching result. The combined function score is then multiplied by the total weighted score, which gives the final score.
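
To make the aggregation step concrete, here is a minimal Python sketch of the rule as I described it. The helper names aggregate and final_score are my own, the input numbers are hypothetical, and treating "first matching" as "first entry in the list" is a simplifying assumption.

```python
def aggregate(function_scores, mode="sum"):
    """Combine the per-function scores using one of the aggregation modes."""
    if not function_scores:
        raise ValueError("need at least one function score")
    if mode == "sum":
        return sum(function_scores)
    if mode == "average":
        return sum(function_scores) / len(function_scores)
    if mode == "minimum":
        return min(function_scores)
    if mode == "maximum":
        return max(function_scores)
    if mode == "firstMatching":
        # Assumption: the list holds only functions that applied, in profile order.
        return function_scores[0]
    raise ValueError(f"unknown aggregation mode: {mode}")


def final_score(weighted_text_score, function_scores, mode="sum"):
    """Final score = aggregated function score * weighted text score."""
    return aggregate(function_scores, mode) * weighted_text_score


# Hypothetical numbers: two function scores and a weighted text score of 1.5.
print(final_score(1.5, [2.0, 3.0], "sum"))      # (2.0 + 3.0) * 1.5
print(final_score(1.5, [2.0, 3.0], "average"))  # 2.5 * 1.5
```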

As an example, here is an index with scoring profiles.

{  
  "name": "musicstoreindex",  
  "fields": [  
    { "name": "key", "type": "Edm.String", "key": true },  
    { "name": "albumTitle", "type": "Edm.String" },  
    { "name": "genre", "type": "Edm.String" },  
    { "name": "genreDescription", "type": "Edm.String", "filterable": false },  
    { "name": "artistName", "type": "Edm.String" },  
    { "name": "rating", "type": "Edm.Int32" },  
    { "name": "price", "type": "Edm.Double", "filterable": false },  
    { "name": "lastUpdated", "type": "Edm.DateTimeOffset" }  
  ],  
  "scoringProfiles": [  
    {  
      "name": "boostGenre",  
      "text": {  
        "weights": {  
          "albumTitle": 1.5,  
          "genre": 5,  
          "artistName": 2  
        }  
      }  
    },  
    {  
      "name": "newAndHighlyRated",  
      "functions": [  
        {  
          "type": "freshness",  
          "fieldName": "lastUpdated",  
          "boost": 10,  
          "interpolation": "linear",  
          "freshness": {  
            "boostingDuration": "P365D"  
          }  
        },  
        {
          "type": "magnitude",  
          "fieldName": "rating",  
          "boost": 8,  
          "interpolation": "linear",  
          "magnitude": {  
            "boostingRangeStart": 1,  
            "boostingRangeEnd": 5,  
            "constantBoostBeyondRange": false  
          }  
        }  
      ],
      "functionAggregation": 0
    }  
  ]
}

Let's say the entered query is meteora, the famous album by Linkin Park, and we have the following document in our index.

{
    "key": "123",
    "albumTitle": "Meteora",
    "genre": "Rock",
    "genreDescription": "Rock with a flick of hiphop",
    "artistName": "Linkin Park",
    "rating": 4,
    "price": 30,
    "lastUpdated": "2020-01-01" 
}

I'm not an expert on TF-IDF, but I can imagine that the following unweighted scores will be produced:

{
    "albumTitle": 1,
    "genre": 0,
    "genreDescription": 0,
    "artistName": 0
}

The scoring profile has a weight of 1.5 on the albumTitle field, so the total weighted score will be: 1 * 1.5 + 0 + 0 + 0 = 1.5
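
The weighted sum can be sketched in Python like this. The function name weighted_text_score is made up, the per-field scores are the imagined ones from above, and giving unlisted fields a weight of 1 is my assumption.

```python
def weighted_text_score(field_scores, weights):
    """Sum each field's score times its profile weight."""
    # Assumption: fields without an explicit weight count with weight 1.
    return sum(score * weights.get(field, 1.0)
               for field, score in field_scores.items())


# Imagined per-field scores for the query "meteora".
field_scores = {"albumTitle": 1, "genre": 0, "genreDescription": 0, "artistName": 0}
# Text weights taken from the scoring profile in the index definition.
weights = {"albumTitle": 1.5, "genre": 5, "artistName": 2}

print(weighted_text_score(field_scores, weights))
```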

After that, the scoring profile functions are evaluated. In this case there are two. The first one evaluates freshness with a range of 365 days (one year). Let's say the lastUpdated value lies 50 days in the past. The total range is 365 days, so you get a score of 1 if the last updated date is today and 0 if it is 365 days or more in the past. In our case it's 1 - 50 / 365 = 0.8630... The boost of the function is 10, so the score for the first function is 8.630.
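
A rough Python sketch of this freshness calculation, assuming plain linear interpolation and clamping to the 0 to 1 range. The function name freshness_score and its parameters are hypothetical, not part of any Azure SDK.

```python
def freshness_score(age_days, boosting_duration_days, boost):
    """Linear freshness: 1 for 'updated today', 0 at or beyond the boosting duration."""
    fraction = 1 - age_days / boosting_duration_days
    # Clamp so that very old (or future-dated) documents stay within 0..1.
    fraction = max(0.0, min(1.0, fraction))
    return fraction * boost


# 50 days old, boosting duration of 365 days, boost of 10.
print(round(freshness_score(50, 365, 10), 3))
```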

The second function is a magnitude function with a range from 1 to 5. The document has a 4-star rating. With linear interpolation, 1 star maps to 0 and 5 stars maps to 1, so 4 stars gives (4 - 1) / (5 - 1) = 0.75. The boost of the magnitude function is 8, so we multiply the value by 8: 0.75 * 8 = 6.

The functionAggregation is 0, which means we sum the results of all functions, giving a total function score of 6 + 8.630 = 14.63. The rule is then to multiply the total function score by the total weighted score of the fields, giving a grand total of 14.63 * 1.5 = 21.945.

Hope you enjoyed this example.