LINQ Where Ignore Accentuation and Case

To ignore case and accents (diacritics) you can first define an extension method like this:

    public static string RemoveDiacritics(this String s)
    {
        String normalizedString = s.Normalize(NormalizationForm.FormD);
        StringBuilder stringBuilder = new StringBuilder();

        for (int i = 0; i < normalizedString.Length; i++)
        {
            Char c = normalizedString[i];
            if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
                stringBuilder.Append(c);
        }

        return stringBuilder.ToString();
    }

(Modified from Ignoring accented letters in string comparison)

Now you can run your query:

string queryText = filter.ToUpper().RemoveDiacritics();

var result = from p in People
         where p.Name.ToUpper().RemoveDiacritics() == queryText
         select p;

This is fine if you are just iterating over a collection in C#, but if you are using LINQ to SQL it is preferable to avoid non-standard methods (including extension methods) in your LINQ query. This is because your code cannot be converted into valid SQL and hence run on SQL Server with all its lovely performance optimization.

Since there doesn't seem to be a standard way of ignoring accents within LINQ to SQL, in this case I would suggest changing the field type that you want to search to be case- and accent-insensitive (CI_AI).

With your example:

ALTER TABLE People ALTER COLUMN Name [varchar](100) COLLATE SQL_Latin1_General_CP1_CI_AI

Your query should now ignore accentuation and case.

Note that you will need to temporarily remove any unique constraints on the field before running the above query, e.g.

ALTER TABLE People DROP CONSTRAINT UQ_People_Name

Now your LINQ query would simply be:

var result = from p in People
         where p.Name == filter
         select p;

See related question here.


For accents, you can enumerate all of them (here for French language) if you can't update your db schema nor fetch the entire list in RAM:

var result = from p in People
             where p.Name.ToUpper()
                .Replace("à", "a")
                .Replace("â", "a")
                .Replace("ä", "a")
                .Replace("ç", "c")
                .Replace("é", "e")
                .Replace("è", "e")
                .Replace("ê", "e")
                .Replace("ë", "e")
                .Replace("î", "i")
                .Replace("ï", "i")
                .Replace("ô", "o")
                .Replace("ù", "u")
                .Replace("û", "u")
                .Replace("ü", "u").Contains(RemoveDiacritics(filter.ToUpper()))
             select p;

Change collate:

ALTER TABLE dbo.MyTable 
ALTER COLUMN CharCol varchar(10)**COLLATE Latin1_General_CI_AS** NOT NULL;