Selecting Non-Standard Characters

  • My application imports much data from files created by analysts in Excel. I want to be able to select those non-standard characters that are not within a band of ASCII numbers to weed out invalid numbers but am uncertain how that would be done. Items such a a copywright mark or other non-keyboard characters. Any assistance would be appreciated.

  • This also applies to identifying lower case alpha character like l, o that are supposed to be numeric. Selected depending on case. Since SQL Server is not case sensitive the command "LIKE N'%[a-z]%" will select all that have alpha characters.

  • Use ASCII to get the range of characters, e.g from A-Z and a-z. Here is a User-Defined-Function (UDF) that use to check whether the string is consists of valid characters.

    CREATE FUNCTION ISALPHA(@STR VARCHAR(100))

    RETURNS VARCHAR(100)

    BEGIN

    DECLARE

    @COUNTER AS INT,

    @RETBOOL AS TINYINT

    SET @COUNTER = 1

    WHILE @COUNTER <= DATALENGTH(@STR)

    BEGIN

    IF ASCII(SUBSTRING(@STR, @COUNTER, 1)) >= 65 AND ASCII(SUBSTRING(@STR, @COUNTER, 1)) <= 122

    BEGIN

    SET @RETBOOL = 1

    END

    ELSE

    BEGIN

    SET @RETBOOL = 0

    BREAK

    END

    SET @COUNTER = @COUNTER + 1

    END

    RETURN @RETBOOL

    END

    You can use the UDF like below:

    SELECT

    CASE

    WHEN DBO.ISALPHA(<FIELD NAME>) = 1 THEN <FIELD NAME>

    ELSE ''

    END

    FROM <TABLE NAME>



    Regards,
    kokyan

  • Also, to make the field case sensitive you can use the COLLATE clause on the field...

     
    
    IF EXISTS(SELECT * FROM sysobjects WHERE id = object_id('foo'))
    DROP TABLE foo
    CREATE TABLE foo
    (
    strWord nvarchar(300) COLLATE Latin1_General_CI_AI
    )
    INSERT INTO foo
    VALUES ('First')

    INSERT INTO foo
    VALUES ('first')

    INSERT INTO foo
    VALUES ('sEcond')

    INSERT INTO foo
    VALUES ('Second')

    SELECT DISTINCT strWord --COLLATE Latin1_General_BIN
    FROM foo
    SELECT DISTINCT strWord COLLATE Latin1_General_BIN
    FROM foo
    DROP TABLE foo

    Gary Johnson

    Microsoft Natural Language Group

    DBA, Sr. DB Engineer




    Gary Johnson
    Microsoft Natural Language Group
    DBA, Sr. DB Engineer

    This posting is provided "AS IS" with no warranties, and confers no rights. The opinions expressed in this post are my own and may not reflect that of my employer.

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply