OK, I just googled "Non UTF-8 Characters", and the only thing I can find is cases where a string got malformed; otherwise, it looks like UTF-8 covers the whole range of possible characters.
http://stackoverflow.com/questions/1379416/insert-utf8-data-into-a-ms-sql-server-2008
http://magp.ie/2011/01/06/remove-non-utf8-characters-from-string-with-php/
The two links above both describe fixes for malformed strings.
Can you be more specific about what it is you want to remove?
Do you really mean high-ASCII characters, i.e. anything with a code above 127, like some of these?
('ÀAlbèert ËEîinstêeiìn ÌInstìitúutëe - MPG')
Or do you mean escaping the special characters that XML requires to be written as entities, like < to &lt; or > to &gt;?
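To show why the answer matters, here is a rough Python sketch of the three different things "remove non UTF-8 characters" could mean. It is only an illustration: the function names are made up for this post, and Python stands in for whatever language or T-SQL you are actually using.

```python
import unicodedata
from xml.sax.saxutils import escape


def drop_malformed_utf8(raw: bytes) -> str:
    """Reading 1: the bytes are supposed to be UTF-8 but contain
    invalid sequences; decode and silently drop the bad bytes."""
    return raw.decode("utf-8", errors="ignore")


def strip_above_127(text: str) -> str:
    """Reading 2: remove everything outside plain 7-bit ASCII.
    NFKD first splits accented letters into base letter + accent,
    so the base letter survives and only the accent is dropped."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", errors="ignore").decode("ascii")


def escape_for_xml(text: str) -> str:
    """Reading 3: nothing is removed; &, < and > are replaced
    with the entities XML requires."""
    return escape(text)


if __name__ == "__main__":
    print(drop_malformed_utf8(b"abc\xffdef"))
    print(strip_above_127("Alb\u00e8rt \u00cbinst\u00eain"))
    print(escape_for_xml("a > b & c < d"))
```

These three do very different things to the same data, so which one you need depends on the answer to the questions above.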
Lowell