How to validated a field for special characters in SSIS?

  • Hi,

    How can i check whether a particualr field from the OLEDB source contains any special characters in SSIS?

    eg:

    D123 - Valid

    143CO - Valid

    DU?234 - Invalid [special character '?' not allowed]

    Have to segregate the invalid data and push them to a log table. but how to check the values for special characters in SSIS?

    Thanks in advance

  • Not sure I follow you. Are you getting an error somewhere or just looking for a way to detect and redirect rows that contain these "special" characters?

  • I'm looking for a way to detect and redirect rows that contain special characters

  • soorya,

    This is what I came up with.

    in the data flow task add a derived boolean column to act as an flag indicating if a special char exists.

    If you have multiple column that you need to check you will need an indicator for each.

    next setup a script component and set the derived "flag" fields to read/write.

    In my example the field I am testing is "FldWithSpecChar" and the derived column I set up for the flag is "IsSpecChar" as a boolean type.

    Here is the script component:

    'set up char array and char item

    Dim myCharArray() As Char

    Dim myChar As Char

    'populate the char array

    If Not String.IsNullOrEmpty(Row.FldWithSpecChar) Then

    myCharArray = Row.FldWithSpecChar.ToCharArray

    'iterate the array

    For Each myChar In myCharArray

    If Char.IsPunctuation(myChar) Or Char.IsSymbol(myChar) Then

    'if special char then set the read/write field that was created in the derived column

    Row.IsSpecChar = True

    'if is special char exit loop and end processing of row

    Exit For

    End If

    Next

    End If

    End Sub

    Next use a conditional split set up an output named HasSpecialChar or something like that.

    Here is the formula I used for that output:

    IsSpecChar == TRUE

    Then add the Logging destination connect the data flow and when prompted select the HasSpecialChar output or whatever you named it.

    Then connect the components default output to wherever you need.

    The code will trap the following characters (true):

    1 False

    2 False

    3 False

    a False

    b False

    c False

    A False

    B False

    C False

    ! True

    @ True

    # True

    $ True

    % True

    ^ True

    & True

    * True

    ( True

    ) True

    _ True

    + True

    | True

    - True

    = True

    ` True

    ~ True

    } True

    { True

    [ True

    ] True

    \ True

    ; True

    ' True

    " True

    : True

    , True

    . True

    / True

    ? True

    > True

    < True

    I ran 64,000 records through the script and conditional split took a few seconds.

    Null or empty strings will not be checked in the above script.

    Good Luck!

  • Also here are some SQL test scripts I used thought I would include in case you want to test.

    use Testing

    go

    -- create / reset receiving tables

    if exists (select name from sys.tables where name = 'GoodData')

    begin

    truncate table dbo.GoodData

    end

    else

    begin

    CREATE TABLE [dbo].[GoodData](

    [GoodDataKey] [int] NOT NULL,

    [GoodData] [varchar](50) NULL

    ) ON [PRIMARY]

    end

    go

    if exists (select name from sys.tables where name = 'BadData')

    begin

    truncate table dbo.BadData

    end

    else

    begin

    CREATE TABLE [dbo].[BadData](

    [BadDataKey] [int] NOT NULL,

    [BadData] [varchar](50) NULL

    ) ON [PRIMARY]

    end

    go

    --Source Table

    use testing

    -- create and populate test table

    if exists (select name from sys.tables where name = 'TESTSpecChar')

    begin

    truncate table dbo.TESTSpecChar

    end

    else

    begin

    CREATE TABLE [dbo].[TESTSpecChar](

    [myKey] [int] IDENTITY(1,1) NOT NULL,

    [FldWithSpecChar] [varchar](50) NULL

    ) ON [PRIMARY]

    end

    go

    declare @specchar varchar(50),

    @i int,

    @i2 int

    set @specchar = '`~!@#$%^&*()_+-=[]\{}|;'':",./<>?'

    set @i = 1

    set @i2 = 1

    -- to test a lot of data uncomment the outer while

    --while @i2 < 1000

    --begin

    set @i = 1

    -- check null data

    insert into dbo.TESTSpecChar select Null

    while @i < len(@specchar)+1

    begin

    insert into dbo.TESTSpecChar select 'ABC123abc' + substring(@specchar,@i,1) + 'ABC123abc'

    insert into dbo.TESTSpecChar select 'ABC123abc'

    set @i = @i + 1

    end

    --set @i2 = @i2 + 1

    --end

  • You don't need explicit loops to do this... just the correct type of WHERE clause. This will find all rows in SomeTable that have special characters in SomeCol...

    SELECT *

    FROM dbo.SomeTable

    WHERE SomeCol NOT LIKE '%[^0-9A-Z]%'

    If you want it to be case sensitive, you should probably use a binary collation.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff,

    I agree but if you didn't use SQL what method would you use in SSIS?

  • Heh... my problem is that I don't use things like SSIS or DTS because I do it all in T-SQL. My appologies.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Problem solved. The posts were really helpful. Thanks to all those who replied

  • Soorya (9/22/2008)


    Problem solved. The posts were really helpful. Thanks to all those who replied

    Heh... good for you, but two way street here... please tell us how you solved the problem.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Sorry about the delay...

    I used a script component which got the particular field which has to be checked for the existence of special characters as input and also added another new boolean field to say whetehr the other field has special characters or not. In the script, i check each character for any special characters and set the boolean field if exists. after the script component i segregated the records with invalid characters using a conditional split.

  • Thanks... any chance of you posting that script?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Its just a simple VB script...

    Dim myCharArray() As Char

    Dim myChar As Char

    'populate the char array

    If Not String.IsNullOrEmpty(Row.Barcode) Then

    myCharArray = Row.Barcode.ToCharArray

    'iterate the array

    For Each myChar In myCharArray

    If Char.IsPunctuation(myChar) Or Char.IsSymbol(myChar) Then

    'if special char then set the read/write field that was created in the derived column

    Row.HasSpecialChar = True

    'if is special char exit loop and end processing of row

    Exit For

    End If

    Next

    End If

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply