J-F, to find non-compliant data with high ascii, i think you can just use this:
select * from @test-2 where value like '%[^''a-Z .,-]%'
i think you'd use a Tally table if you were stripping out the offending chars;
here's some test code and a Tally function i just slapped together:
select * from @test-2 where value like '%[^''a-Z .,-]%'
select * from @test-2 where value like '%[^' + CHAR(128) + '-' + CHAR(255) + ']%'
select dbo.StripHighAscii(value) from @test-2
with a tally table: this version strips out any non-expected characters also though, not just high ascii, but it's obvious what you can change:
ALTER FUNCTION StripHighAscii(@OriginalText VARCHAR(max))
RETURNS VARCHAR(8000)
BEGIN
DECLARE @CleanedText VARCHAR(max)
;WITH E01(N) AS (SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1), -- 10 or 10E01 rows
E02(N) AS (SELECT 1 FROM E01 a, E01 b), -- 100 or 10E02 rows
E04(N) AS (SELECT 1 FROM E02 a, E02 b), -- 10,000 or 10E04 rows
E08(N) AS (SELECT 1 FROM E04 a, E04 b), --100,000,000 or 10E08 rows
E16(N) AS (SELECT 1 FROM E08 a, E08 b), --10E16 or more rows than you'll EVER need
Tally(N) AS (SELECT ROW_NUMBER() OVER (ORDER BY N) FROM E16)
SELECT @CleanedText = ISNULL(@CleanedText,'') +
CASE
WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1)) IN (9,10,13) THEN SUBSTRING(@OriginalText,Tally.N,1) --tab,lf,cr
WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1)) BETWEEN 32 AND 127 THEN SUBSTRING(@OriginalText,Tally.N,1)
WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1)) <= 32 THEN ''
WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1)) >= 128 THEN ''
END
FROM Tally WHERE Tally.N <= LEN(@OriginalText)
RETURN @CleanedText
END
another saved StripHighAscii for reference:
create function [dbo].[StripHighAscii](@originaltext varchar(8000))
returns varchar(8000)
begin
declare @i int
set @i=127
while @i < 255
begin
SET @originaltext = replace(@originaltext,CHAR(@i),'')
SET @i=@i +1
end
return @originaltext
end
Lowell