SSIS and Multiple values in columns problem

  • I have a staging table which load data into another table but Iā€˜m having problems adding the business logic in SSIS dervied column because three of the columns can have up to 3 values in them which would need to split into different rows in the results tables. See example below ā€“

    CREATE TABLE [dbo].[stage](

    [participant_key] [int] NULL,

    [participant_key2] [int] NULL,

    [participant_1] [varchar](40) NULL,

    [participant_rate] [decimal](6, 3) NULL,

    [participant_rate_high] [varchar](50) NULL,

    [participant_2] [varchar](40) NULL,

    [participant_2_rate] [decimal](6, 3) NULL,

    [participant_2_high] [varchar](50) NULL,

    [participant_3] [varchar](40) NULL,

    [participant_3_rate] [decimal](6, 3) NULL,

    [participant_3_high] [varchar](50) NULL,

    [source] [varchar](5) NOT NULL,

    [date] [varchar](50) NULL

    ) ON [PRIMARY]

    GO

    SET ANSI_PADDING OFF

    GO

    INSERT [dbo].[stage] ([participant_key], [participant_key2], [participant_1], [participant_rate], [participant_rate_high], [participant_2], [participant_2_rate], [participant_2_high], [participant_3], [participant_3_rate], [participant_3_high], [source], [date]) VALUES (182, 7044, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, N'DatA1', N'20140625')

    INSERT [dbo].[stage] ([participant_key], [participant_key2], [participant_1], [participant_rate], [participant_rate_high], [participant_2], [participant_2_rate], [participant_2_high], [participant_3], [participant_3_rate], [participant_3_high], [source], [date]) VALUES (182, 7044, N'razioni', CAST(0.950 AS Decimal(6, 3)), NULL, N'Other', CAST(1.240 AS Decimal(6, 3)), NULL, N'vivi', CAST(1.249 AS Decimal(6, 3)), NULL, N'DatA1', N'20140625')

    For example if Participant 1 has data and the other participant are NULL the row would be -

    participant_position1

    participant_nameparticipant 1

    participant_rateParticipant rate 1

    participant_premiumParticipant_rate_high1

    If you have Participant 1 and Participant 2 also with data and Participant 3 is NULL then there would be two rows instead of one.

    participant_position1, 2

    participant_nameParticipant 1

    Participant 2

    participant_rateParticipant rate 1

    Participant rate 2

    participant_premiumParticipant_rate_high1

    Participant_rate_high2

    see wanted results belows -

    CREATE TABLE [dbo].[Wanted_result](

    [participant_key] [int] NOT NULL,

    [participant_key2] [int] NULL,

    [participant_position] [smallint] NOT NULL, /****** the participant_position is the participant nuber been loaded in that row ******/

    [participant_name] [varchar](40) NOT NULL,

    [participant_rate] [decimal](6, 3) NULL,

    [participant_rate_high] [decimal](19, 4) NULL,

    [source] [varchar](5) NOT NULL,

    [date] [varchar](50) NULL

    ) ON [PRIMARY]

    GO

    SET ANSI_PADDING OFF

    GO

    INSERT [dbo].[Wanted_result] ([participant_key], [participant_key2], [participant_position], [participant_name], [participant_rate], [participant_rate_high], [source], [date]) VALUES (182, 7044, 1, N'razioni', CAST(0.950 AS Decimal(6, 3)), NULL, N'DatA1', N'20140625')

    INSERT [dbo].[Wanted_result] ([participant_key], [participant_key2], [participant_position], [participant_name], [participant_rate], [participant_rate_high], [source], [date]) VALUES (182, 7044, 2, N'Other', CAST(1.240 AS Decimal(6, 3)), NULL, N'DatA1', N'20140625')

    INSERT [dbo].[Wanted_result] ([participant_key], [participant_key2], [participant_position], [participant_name], [participant_rate], [participant_rate_high], [source], [date]) VALUES (182, 7044, 3, N'vivi', CAST(1.249 AS Decimal(6, 3)), NULL, N'DatA1', N'20140625')

    The participant_position column in the wanted result table is the participant number been loaded in that row, so if Participant 2 was loaded then the value would be 2 for example.

    Thanks for any advice!

  • I really wouldn't do that bit in SSIS.

    Do it in T-SQL instead as a UNION. Something like this:

    select participant_key

    ,participant_key2

    ,participantPosition = 1

    ,participant_1

    ,participant_rate

    ,participant_rate_high

    ,source

    ,date

    from dbo.stage

    union all

    select participant_key

    ,participant_key2

    ,participantPosition = 2

    ,participant_2

    ,participant_2_rate

    ,participant_2_high

    ,source

    ,date

    from dbo.stage

    where participant_2 is not null

    union all

    select participant_key

    ,participant_key2

    ,participantPosition = 3

    ,participant_3

    ,participant_3_rate

    ,participant_3_high

    ,source

    ,date

    from dbo.stage

    where participant_3 is not null

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • I agree with Phil. You're trying to normalize data. I'd recommend, for this particular instance, to use SQL. Either with the UNION or with Unpivot.

    If you need to do this during the transformation instead of after drop to a stage, you're looking at using a synchronous script transform component in the data flow. If you've never used one they are confusing at first. Play in a test area until you understand the output controls.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • Thank you it works perfectly!!

  • Evil Kraig F (7/3/2014)


    I agree with Phil. You're trying to normalize data. I'd recommend, for this particular instance, to use SQL. Either with the UNION or with Unpivot.

    If you need to do this during the transformation instead of after drop to a stage, you're looking at using a synchronous script transform component in the data flow. If you've never used one they are confusing at first. Play in a test area until you understand the output controls.

    All correct, except for the 'synchronous' bit, which should read 'asynchronous' (true where number of rows out <> number of rows in).

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • Crap, you're right Phil. Thank you for the correction. Apologies, OP.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply