SSIS issue

  • Hello Everyone,

    I am facing a small problem while creating the SSIS package.

    I have a requirement -

    The user needs to upload a file in one of the tables in database. It has 3 parts -

    - new rows should be added in table

    - existing rows should be modified (if required)

    - if rows are not present in file but are present in table then those need to be deleted from table.

    First two requirements have been capruted by creating the task for adding new rows and updating existing rows in the table using the file

    However, i am not understanding how to cater for 3rd requirement. I thought of using rowCount but how can I get the data from two different sources for comparison and then delete the unwanted rows from table .

    Any guidence and sugesstion will be appreciated....

    Thanks

    Mithun

  • I would import the file intro an empty staging table. It would then be a 3 step operation

    1. Update existing rows with an inner join between destination & staging

    2. Insert new rows with an outer join between destination and staging

    3. delete un-needed rows with an outer join the other way round from step 2

    Create Table #destination (myKey int Primary key, myVal1 int, myVal2 int)

    go

    Create Table #staging(myKey int Primary key, myVal1 int, myVal2 int)

    go

    insert into #destination

    select 1,2,2

    union all

    select 2,1,1

    union all

    select 3,1,1

    union all

    select 4,1,1

    union all

    select 5,1,1

    union all

    select 6,1,1

    insert into #staging

    select 1,2,3

    union all

    select 2,1,2

    union all

    select 5,1,1

    union all

    select 6,1,1

    union all

    select 7,5,1

    union all

    select 8,5,1

    select * from #destination

    ----------------------------------------------------------

    --solution

    --updates

    update d

    set d.myVal1 = s.myVal1, d.myVal2 = s.myVal2

    -- select d.*, s.*

    from

    #destination d

    inner join #staging s

    on d.myKey = s.myKey

    --inserts

    insert #destination

    select s.* from #staging s

    left outer join #destination d

    on s.myKey = d.myKey

    where d.myKey is null

    --deletes

    delete d

    -- select d.*

    from

    #staging s

    right outer join #destination d

    on s.myKey = d.myKey

    where s.mykey is null

    select * from #destination

    drop table #destination

    drop table #staging

    run the inserts and run the (commneted out) selects to see what is going on.

    HTH

    Dave J


    http://glossopian.co.uk/
    "I don't know what I don't know."

  • thanks for ur replay,,

    but i m not sure how to implement the logic given by u in ssis packages, i mean what transformations do have to use and all... if you can guide me thru it would be great help.... as i m new to ssis

    thanks for ur guidence

    Mithun

  • I have to admit I do not know SSIS at all! But I do know DTS, and I'm sure the equivalent steps I'm about to outline are there.

    A. Turn the example code into a stored procedure. Add error checking etc. as you do so. Make sure the stored proc truncates the staging table on successful completion.

    B. In an SSIS package, step 1 is to import the file into the staging table

    C. Step 2 in the SSIS package is to call the Stored proc. This was called 'Execute SQL Task' or similar in DTS.

    D. You should be good.

    The key to it is to understand what the code you are running in the second SSIS step (the new SP) does.

    HTH

    Dave J

    Edit: Edited last sentence for clarity


    http://glossopian.co.uk/
    "I don't know what I don't know."

  • I agree with Dave J. He has the right approcah. The transformation task are not required. Step back from SSIS and think of what you want to do in sql.The task in SSIS is the execute sql task.

    1. load the file into a staging table.

    2. You are now looking for records that are in your table but not in the staging table (created from file).

    As I dont know mmuch about your data so this is just an example.

    id_column is my primatry key its an integer. I will create a temp table with the primary keys of the missing records then I will delete them.

    create table #temp_id_column(id_column as int)

    insert into #temp_id_column

    select id_column from your_table

    except

    select id_column from staging_table

    delete from your_table where id_column = (select id_column from #temp_id_column)

    put it all in as one execute sql statement. If you split it then you will need to ensure the connection property "RetainSameConnection" is set to true.

    Hope that helps.

    Dont understand what the difference is between what you are doing and wioing the old data and loading the new in?

    HTH I genrally waffle and go at a tangent so please ask if you need it explaining a bit more clearly.

    Ells

  • mithun.gite (6/11/2009)


    Hello Everyone,

    I am facing a small problem while creating the SSIS package.

    I have a requirement -

    The user needs to upload a file in one of the tables in database. It has 3 parts -

    - new rows should be added in table

    - existing rows should be modified (if required)

    - if rows are not present in file but are present in table then those need to be deleted from table.

    First two requirements have been capruted by creating the task for adding new rows and updating existing rows in the table using the file

    However, i am not understanding how to cater for 3rd requirement. I thought of using rowCount but how can I get the data from two different sources for comparison and then delete the unwanted rows from table .

    Any guidence and sugesstion will be appreciated....

    Thanks

    Mithun

    Hi,

    Let me give a shot !

    - new rows should be added in table

    you can do this using a lookup task in SSIS

    - existing rows should be modified (if required)

    - if rows are not present in file but are present in table then those need to be deleted from table.

    I do not understand this ! If the file is loaded every time, then why would you delete the old data in the table that is not in the file? Rather it should add the new data to the table and not delete the previous data ?

    Provide some more info...

    Here is what I am doing in my environment:

    I have a flat file generated by business objects every night. I have created an SSIS package that picks up the file from the FTP server, loads the data into a staging table, compares the data of staging to the base table and if there is any new row, it writes the data to the base table. Finally, it clears the staging table data.

    Hope this helps !!

    Thanks,

    \\K

    ______________________________________________________________________________________________________________________________________________________________________________________
    HTH !
    Kin
    MCTS : 2005, 2008
    Active SQL Server Community Contributor 🙂

  • please follow these bellow steps to Achieve your requirement:

    Take a flatfile source for flatfile and OLEDB source for target table

    Take a Merge join and make a full outer join between source and target

    Take a Derived Column and add a flag column to Identify the row behavior

    Based on the Merge join results write conditional expressions

    if value existed in flatfile but null existed in target then mark that row with "I" flag

    if value existed in flatfile,target table then check for any column value is changed if changed mark that row as "U"

    if null existed in flat file but value existed in target then mark that row with "D" flag

    Take a Conditional split and Route the data into three groups I,U,D

    Take OLEDB command pass U group rows for update the rows in the target

    Take OLEDB command pass D group rows for Delete the rows from target table

    Take OLEDB Destination and pass I group rows to insert into target table.

    Thanks

    Lakshman

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply