Loading CSV file into Multiple tables

  • Hi,
    I have to load the data from .csv file into few sql server tables which have referential integrity also.
    What's the best way to load data without any referential integrity or other issues.
    Is it through SSMS import task easy or scripting way?

    Table 1 ==> PK col + some columns 
    Table 2 ==> PK+ some columns + FK to TABLE1
    Table 3 ==> PK+ some columns
    Table 4 ==> PK + some columns + FK to TABLE2
    Table 5 ==> PK + some columns + FK to TABLE2 and TABLE3
    Table 6 ==> Kind of MAster Table which contains all other tables PK columns as FK + some other columns
    Note: Some of other columns i have default value also like User (SUSER) and DAteCol is GETDATE()
    Your input greatly appreciated!

    Thanks

  • script out all the constraints,
    drop them all,
    import data,
    re-add the constraints by running script from step 1

  • Thanks for your reply.
    I think Import wizard only allow to load into single table, mot multiple table, right?
    I already loaded into Temp Staging table but i was also looking Insert script example to load into multiple tables from the staging table.

  • If you don't want to drop and recreate constraints, load the data into a staging table, then load the data into the main tables in the proper order to prevent PK/FK issues.  That also will let you check for any issues before trying to recreate a constraint and it failing.

  • poratips - Wednesday, March 13, 2019 8:24 PM

    Hi,
    I have to load the data from .csv file into few sql server tables which have referential integrity also.
    What's the best way to load data without any referential integrity or other issues.
    Is it through SSMS import task easy or scripting way?

    Table 1 ==> PK col + some columns 
    Table 2 ==> PK+ some columns + FK to TABLE1
    Table 3 ==> PK+ some columns
    Table 4 ==> PK + some columns + FK to TABLE2
    Table 5 ==> PK + some columns + FK to TABLE2 and TABLE3
    Table 6 ==> Kind of MAster Table which contains all other tables PK columns as FK + some other columns
    Note: Some of other columns i have default value also like User (SUSER) and DAteCol is GETDATE()
    Your input greatly appreciated!

    Thanks

    WHY on this good Green Earth would you try to avoid referential integrity?  It's there to protect the table from bad data!

    The best way to do this is to load the data into a staging table that has no DRI and figure out what goes to which table from there.

    I'll also state that even if you're loading to just one table, you should still use a staging table so that you can do validations, etc, and maybe even mark each imported row with the reason as to what's going to fail instead of just letting in fail.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden - Thursday, March 14, 2019 9:07 AM

    poratips - Wednesday, March 13, 2019 8:24 PM

    Hi,
    I have to load the data from .csv file into few sql server tables which have referential integrity also.
    What's the best way to load data without any referential integrity or other issues.
    Is it through SSMS import task easy or scripting way?

    Table 1 ==> PK col + some columns 
    Table 2 ==> PK+ some columns + FK to TABLE1
    Table 3 ==> PK+ some columns
    Table 4 ==> PK + some columns + FK to TABLE2
    Table 5 ==> PK + some columns + FK to TABLE2 and TABLE3
    Table 6 ==> Kind of MAster Table which contains all other tables PK columns as FK + some other columns
    Note: Some of other columns i have default value also like User (SUSER) and DAteCol is GETDATE()
    Your input greatly appreciated!

    Thanks

    WHY on this good Green Earth would you try to avoid referential integrity?  It's there to protect the table from bad data!

    The best way to do this is to load the data into a staging table that has no DRI and figure out what goes to which table from there.

    I'll also state that even if you're loading to just one table, you should still use a staging table so that you can do validations, etc, and maybe even mark each imported row with the reason as to what's going to fail instead of just letting in fail.

    This is important enough to repeat, and although (importing to a staging table) it might seem like extra work or an extra step, experience will soon tell you it is not.
    You could spend hours attempting to fix DRI errors if you were to simply throw the data into the tables after switching off the constraints.
    Start smart, end on a good note.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • I have  created Staging table but when loading from Staging table to my regular tables and inserting FK values from parent table, FK field not loading all the PK Field values only last value picking up and repeating for all 800 records with same record.
    I am stuck right now. 
    DECLARE @DCID INT, -- Parent Key Column with Identity in DC Table
       @FCID INT , -- Parent Key column with Identity in FC Table
    @DNCID INT -- Parent Key Column with Identity in DNC Table
    -- @PPIDINT
    -- @PSID INT

    BEGIN TRAN

    INSERT INTO [dbo].[DC] ( [DCName])

    SELECT [DC_Cat] 
    FROM [dbo].[DM]

    SET @DCID =
    SCOPE_IDENTITY() -- @@IDENTITY
       SET @FCID =
    SCOPE_IDENTITY() -- @@IDENTITY
    SET @DNCID =
    SCOPE_IDENTITY()
    SET @PPID = SCOPE_IDENTITY()
    SET @PSID = SCOPE_IDENTITY()

    INSERT INTO [dbo].[FC] ([FCName], [GWName], [Ins])
    SELECT FC_Name, G_Name4_W , Ins
    FROM [dbo].[DM]

    INSERT INTO [dbo].[DNC] (DCID, FCID, DNameCat, DWeb) -- DCID is Referencing to Dc table and FCID Ref to FC Table
    SELECT @DCID, @FCID, DNameCat, DWeb
    FROM [dbo].[DM]

    COMM TRAN

    -- when Laoding into only tables DC and FC, its runs fine as no dependent parent Table for both
    But when Loading all three tables together DC, FC and DNC then loads but both DCID and FCID not incrementing and just pickup only last values
    Ex.
    DC Table:
    --------
    1
    2
    3

    FC Table
    -------
    1
    2
    3

    DNC table
    ---------
    1 3 3
    2 3 3
    3 3 3

    It shoud pick up 1 and 2 from DC and FC table but not.

  • This is really the bases of ETL processing.
    Following is a standard pattern to load data from staging into one or more destination tables

    1 - Load into the tables that do not have any FK - these are parent tables.
      the resulting operation could potentially be one of the following depending on business process
      Update - exists on both tables but contents are different
      Insert - does not exist on destination table
      Delete - it is on the staging table but not on destination table
      The above obviously implies that there can be a lookup between staging table and destination table

    2 - load into child tables
      again it could be one of the above possibly outcomes
      process for each child table is always similar
      2.1 - join to parent tables to retrieve the Parent ID values
        Determine the operation to do

    order of operations should always be update, insert then delete
    or alternatively use a Merge statement to do all 3 in one go

    Sample example
    -- insert new records onto FCAT
    insert into FCat
      (FCatName
     , GWName
     , INS
      )
    select st1.F_Cat
      , st1.DWName
      , st1.INS

    from StagingTbl st1
    left outer join FCat fc1 -- join to retrieve the FK ID value for DCatID
    on fc1.FCatName = st1.F_Cat -- or is it d_cat
    where fc1.FCatID is null -- only select records not on destination table

    -- example using output clause
    if object_id('tempdb..#fcat_ids') is not null
    drop table #fcat_ids
    create table #fcat_ids
    (FCatID int not null
    , FCatName nvarchar(60) not null
    )

    -- insert new records onto FCAT
    insert into FCat
      (FCatName
     , GWName
     , INS
      )
      output inserted.FCatID
       , inserted.FCatName
      into #fcat_ids

    select st1.D_Name_Cat
      , st1.DWName
      , st1.INS

    from StagingTbl st1
    left outer join FCat fc1 -- join to retrieve the FK ID value for DCatID
    on fc1.FCatName = st1.F_Cat
    where fc1.FCatID is null -- only select records not on destination table
    /*
    At this point table #fcat_ids contains the columns required for child tables to link from staging table and retrieve the corresponding ID value
    */

    /*
    perform remaining operations for parent tables
    */
    insert into DNameCat
      (DCatID
     , FCatID
     , DWName
      )
    select dc1.DCatID
      , fc1.FCatID
      , st1.DWName
    from StagingTbl st1
    inner join DCat dc1 -- join to retrieve the FK ID value for DCatID
    on dc1.DCatName = st1.D_Name_Cat -- or is it d_cat

    inner join FCat fc1 -- join to retrieve the FK ID value for FCatID
    on fc1.FCatName = st1.F_Cat -- or is it d_cat

  • Thanks SSChampion detailed steps, i will try to implement it,

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply