We store html fragments in varchar fields. I learned that smartquotes and emdash need special handling to display properly, so instead I applied special handling to remove them. I did not know the position or the character code in thousands of rows of content. I wrote the following to identify rows to be fixed and also what and where a fix was needed:
(assumes markup table has "content" varchar column; Tally has int field N)
select top 1000
m.[PKId]
,[position]= n.[N]
,[character]= substring(m.[markup], n.[N],1)
,[ascii]= ascii( substring(m.[markup], n.[N],1) )
from
markup m
join
(select [N] from Tally where [N] <= 255 ) n
on
ascii( substring(m.[content], n.[N],1) ) in
(
select
[ascii]= [N]
from
Tally
where
[n] not between ascii('A') and ascii('Z')
and
[n] not between ascii('a') and ascii('z')
and
[n] not between ascii('0') and ascii('9')
and
(
'@.-_' not like '%' + char([n]) + '%'
or
char([n]) = '%'
)
)