how to select rows that contains non-english characters in sql server 2005(it should filter only non-english chars, not special characters)
T-SQL's string-handling capability is pretty rudimentary.
If the "non-English" fields are distinguished by their use of Unicode UTF-16, you can try something like
SELECT * FROM MyTable WHERE MyField = Cast(MyField AS VARCHAR)
to pull only rows that are expressible in UTF-8.
The only way I know how to test whether a field is drawn from an arbitrary set of characters is with a user-defined function, like this:
CREATE FUNCTION IsAllowed (@input VARCHAR(MAX)) RETURNS BIT
-- Returns 1 if string is allowed, 0 otherwise.
-- Usages: SELECT dbo.IsAllowed('Hello'); -- returns 1
-- SELECT dbo.IsAllowed('Hello, world!'); -- returns 0
-- Note CHARINDEX is not case sensitive so @allowables doesn't need both.
-- VARCHAR(MAX) is different under SQL Server 2005 than 2008+
--- and use of defined VARCHAR size might be necessary.
AS
BEGIN
DECLARE @allowables char(26) = 'abcdefghijklmnopqrstuvwxyz';
DECLARE @allowed int = 0;
DECLARE @index int = 1;
WHILE @index <= LEN(@input)
BEGIN
IF CHARINDEX(SUBSTRING(@input,@index,1),@allowables)=0
BEGIN
SET @allowed = 0;
BREAK;
END
ELSE
BEGIN
SET @allowed = 1;
SET @index = @index+1;
END
END
RETURN @allowed
END
User-defined functions can be applied to columns in SELECT, like this:
SELECT * FROM MyTable WHERE dbo.IsAllowed(MyField) = 1
Note the schema name (dbo
in this case) is not optional with user-defined functions.
If a T-SQL user-defined function is inadequate, you can also use a CLR Function. Then you could apply a regexp or whatever to a column. Because they break portability and pose a security risk, many sysadmins don't allow CLR functions. (This includes Microsoft's SQL Azure product.)
If you have all the allowed special characters, following select statement should select all columns with only English and the allowed special characters:
select column_name from table_name where column_name like '%[^a-z, .-™]%';
You can add all the allowed special characters inside the square brackets.
Got a great answer for my question.
select ID, LATITUDE, LONGITUDE, REFERENCE, LOCATION, VALIDATE,
patindex('%[^ !-~()"]%' COLLATE Latin1_General_BIN,LOCATION) as [Position],
substring(LOCATION,patindex('%[^ !-~()"]%' COLLATE Latin1_General_BIN,LOCATION),1) as [InvalidCharacter],
ascii(substring(LOCATION,patindex('%[^ !-~()"]%' COLLATE Latin1_General_BIN,LOCATION),1)) as [ASCIICode]from dbo.RADAR_SEARCH where patindex('%[^ !-~()"]%' COLLATE Latin1_General_BIN,LOCATION) >0
EDIT1: Explanation for above answer
Above query filters only non-English chars in Location column in which special characters are not included.
NOTE: Tested only in MS-SQL.