Inserting missing records using multiple fields as the key

  • Hi,

    I have a complex problem that I'm pretty sure should have a very simple solution. If I could just find it.

    I am trying to import data from a flat file aka Raw_Build_List:

    Contract Description varchar(30)

    Section Description varchar(25)

    Item Description varchar(30)

    Quantity integer

    I am trying to write a script that will pull it into a relational database to use with other data. The script will have to run every night. The Raw data may contain records processed before.

    I have 4 tables:

    Contracts (ID, Contract_Description, etc),

    Sections (ID, Section, etc),

    Items (ID, Description, Category, etc) and

    Build_list (ContractID, SectionID, ItemID, Qty, StartDate, etc)

    To import the data I do multiple steps.

    1. Add 3 extra fields to Raw_Build_List: ContractID, SectionID and ItemID.

    2. I run a conditional insert into each table (Contracts, Sections and Items) using a group by eg.

    INSERT INTO Contracts (Contract_Description)

    SELECT ContractDesctiption

    GROUP BY ContractDesctiption

    WHERE ContractDescription NOT IN (SELECT Contract_Description FROM Contracts)

    3. Then using the IDs generated by the Identity Fields I update the new fields in the Raw_Build_List table. eg.

    UPDATE Raw_Build_List

    SET ContractID = ID

    FROM Contracts LEFT OUTER JOIN Raw_Build_List

    ON Contracts.Contract_Description = Raw_Build_List.ContractDesctiption

    4. Now in the Raw_Build_List I have all the key fields against all the correct descriptions and I want to put all those keys into the Build_list table and not create duplicates.

    Is there a non-RBAR way of doing this?

    The one way I thought may work is to string all the IDs together and compare them to the string of all the IDs in the other table and it does work but as expected is really slow. RIGHT('00000000'+LTRIM(RTRIM(ContractID))+RIGHT('00000000'+LTRIM(RTRIM(SectionID))+RIGHT('00000000'+LTRIM(RTRIM(ItemID))

    NOTE: I had to pad it as Contract=1, Section=24 and Item=45 (joined = 12445) will match up with Contract=12, Section=44 and Item 5 (joined = 12445).

    There has to be a better way of doing this...:blink:

  • Does this help?

    -- set up some sample data

    DROP TABLE #Contracts

    CREATE TABLE #Contracts (ID INT IDENTITY, Contract_Description VARCHAR(50), StartDate DATETIME)

    DROP TABLE #Sections

    CREATE TABLE #Sections (ID INT IDENTITY, Section VARCHAR(50))

    DROP TABLE #Items

    CREATE TABLE #Items (ID INT IDENTITY, [Description] VARCHAR(50), Qty INT)

    DROP TABLE #Build_list

    CREATE TABLE #Build_list (

    ContractID INT,

    SectionID INT,

    ItemID INT,

    Contract_Description varchar(30),

    Section_Description varchar(25),

    Item_Description varchar(30),

    Qty INT,

    StartDate DATETIME)

    INSERT INTO #Build_list (Contract_Description, Section_Description, Item_Description, Qty, StartDate)

    SELECT 'Contract 1', 'Section 1', 'Item 1', 10, GETDATE()-22 UNION ALL

    SELECT 'Contract 2', 'Section 2', 'Item 2', 20, GETDATE()-21 UNION ALL

    SELECT 'Contract 3', 'Section 3', 'Item 3', 30, GETDATE()-20 UNION ALL

    SELECT 'Contract 3', 'Section 3', 'Item 3', 40, GETDATE()-19 UNION ALL

    SELECT 'Contract 4', 'Section 4', 'Item 4', 50, GETDATE()-18 UNION ALL

    SELECT 'Contract 5', 'Section 5', 'Item 5', 60, GETDATE()-17

    -- create a table to receive OUTPUT rows

    DROP TABLE #InsertedContracts

    CREATE TABLE #InsertedContracts (ID INT, Contract_Description VARCHAR(50), StartDate DATETIME)

    -- insert distinct new values from source table into contract table,

    -- output new ID's etc into temp table

    -- Contract_Description, StartDate are keys for markback

    INSERT INTO #Contracts (Contract_Description, StartDate)

    OUTPUT inserted.ID, inserted.Contract_Description, inserted.StartDate INTO #InsertedContracts

    SELECT

    Contract_Description,

    StartDate

    FROM (

    SELECT

    Contract_Description,

    StartDate,

    rn = ROW_NUMBER() OVER(PARTITION BY Contract_Description ORDER BY StartDate)

    FROM #Build_list b

    WHERE NOT EXISTS(SELECT 1 FROM #Contracts c WHERE c.Contract_Description = b.Contract_Description)

    ) s

    WHERE rn = 1

    -- markback source table

    UPDATE b SET ContractID = c.ID

    FROM #Build_list b

    INNER JOIN #InsertedContracts c

    ON c.Contract_Description = b.Contract_Description AND c.StartDate = b.StartDate

    -- check results

    SELECT * FROM #Build_list

    SELECT * FROM #Contracts

    /* a future release of SQL Server will allow you to do this in one step using composable dml:

    UPDATE b SET ContractID = c.ID

    FROM #Build_list b

    INNER JOIN (

    INSERT INTO #Contracts (Contract_Description, StartDate)

    OUTPUT inserted.ID, inserted.Contract_Description, inserted.StartDate INTO #InsertedContracts

    SELECT

    Contract_Description,

    StartDate

    FROM (

    SELECT

    Contract_Description,

    StartDate,

    rn = ROW_NUMBER() OVER(PARTITION BY Contract_Description ORDER BY StartDate)

    FROM #Build_list b

    WHERE NOT EXISTS(SELECT 1 FROM #Contracts c WHERE c.Contract_Description = b.Contract_Description)

    ) s

    WHERE rn = 1

    ) c (ID, Contract_Description, StartDate)

    ON c.Contract_Description = b.Contract_Description AND c.StartDate = b.StartDate

    */

    โ€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.โ€ - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Not really let me put a full copy of the example:

    --Okay Slightly different sample data

    DROP TABLE Contracts

    CREATE TABLE Contracts (ID INT IDENTITY, Contract_Description VARCHAR(50), StartDate DATETIME)

    DROP TABLE Sections

    CREATE TABLE Sections (ID INT IDENTITY, Section VARCHAR(50))

    DROP TABLE Items

    CREATE TABLE Items (ID INT IDENTITY, [Description] VARCHAR(50), Category VARCHAR(15))

    DROP TABLE Build_list

    CREATE TABLE Build_list (

    ContractID INT,

    SectionID INT,

    ItemID INT,

    Qty INT,

    FlagCollect BIT)

    DROP TABLE Raw_Build_list

    CREATE TABLE Raw_Build_list (

    Contract_Description varchar(30),

    Section_Description varchar(25),

    Item_Description varchar(30),

    Qty INT)

    INSERT INTO Raw_Build_list (Contract_Description, Section_Description, Item_Description, Qty)

    SELECT 'Build Car 1', 'Trunk', 'Trunk Latch', 2

    UNION ALL SELECT 'Build Car 1', 'Trunk', 'Cover', 1

    UNION ALL SELECT 'Build Car 1', 'Drivers Side', 'Front Door', 1

    UNION ALL SELECT 'Build Car 1', 'Drivers Side', 'Indicator', 2

    UNION ALL SELECT 'Build Car 1', 'Drivers Side', 'Handle', 2

    UNION ALL SELECT 'Build Car 1', 'Passenger Side', 'Front Door', 1

    UNION ALL SELECT 'Build Car 2', 'Trunk', 'Latch', 2

    UNION ALL SELECT 'Build Car 2', 'Trunk', 'Cover', 1

    UNION ALL SELECT 'Build Car 2', 'Drivers Side', 'Front Door', 1

    UNION ALL SELECT 'Build Car 2', 'Drivers Side', 'Back Slide Door', 1

    UNION ALL SELECT 'Build Car 2', 'Drivers Side', 'Indicator', 2

    UNION ALL SELECT 'Build Car 2', 'Drivers Side', 'Handle', 2

    UNION ALL SELECT 'Build Car 2', 'Passenger Side', 'Front Door',1

    INSERT INTO Contracts (Contract_Description, StartDate)

    SELECT 'Build Car 1', GETDATE()- 12

    INSERT INTO Sections (Section)

    SELECT 'Trunk' UNION ALL

    SELECT 'Drivers Side' UNION ALL

    SELECT 'Passenger Side'

    INSERT INTO Items (Description, Category)

    SELECT 'Front Door', 'BODY WORK' UNION ALL

    SELECT 'Trunk Latch', 'SMALLS' UNION ALL

    SELECT 'Cover', 'BODY WORK' UNION ALL

    SELECT 'Indicator', 'LIGHTS' UNION ALL

    SELECT 'Handle', 'SMALLS'

    INSERT INTO Build_list (ContractID, SectionID, ItemID, Qty)

    SELECT 1, 1, 2, 2

    UNION ALL SELECT 1, 1, 3, 1

    UNION ALL SELECT 1, 2, 1, 1

    UNION ALL SELECT 1, 2, 4, 2

    UNION ALL SELECT 1, 2, 5, 2

    --SELECT * FROM Contracts

    --SELECT * FROM Sections

    --SELECT * FROM Items

    --SELECT * FROM Build_list

    --SELECT * FROM Raw_Build_list

    --HERE IS WHERE THE WORK STARTS

    ALTER TABLE Raw_Build_List ADD ContractID bigint

    ALTER TABLE Raw_Build_List ADD SectionID bigint

    ALTER TABLE Raw_Build_List ADD ItemID bigint

    --INSERT ALL THE CONTRACTS

    INSERT INTO Contracts (Contract_Description)

    SELECT Contract_Description

    FROM Raw_Build_list

    GROUP BY Contract_Description

    HAVING (NOT (Contract_Description IN

    (SELECT Contract_Description

    FROM Contracts)))

    --UPDATE ID FIELD FOR CONTRACTS

    UPDATE Raw_Build_list

    SET ContractID = Contracts.ID

    FROM Contracts LEFT OUTER JOIN

    Raw_Build_list ON Contracts.Contract_Description = Raw_Build_list.Contract_Description

    --INSERT ALL THE SECTIONS

    INSERT INTO Sections (Section)

    SELECT Section_Description

    FROM Raw_Build_list

    GROUP BY Section_Description

    HAVING (NOT (Section_Description IN

    (SELECT Section

    FROM Sections)))

    --UPDATE ID FIELD FOR SECTIONS

    UPDATE Raw_Build_list

    SET SectionID = Sections.ID

    FROM Sections LEFT OUTER JOIN

    Raw_Build_list ON Sections.Section = Raw_Build_list.Section_Description

    --INSERT ALL THE ITEMS

    INSERT INTO Items (Description)

    SELECT Item_Description

    FROM Raw_Build_list

    GROUP BY Item_Description

    HAVING (NOT (Item_Description IN

    (SELECT Description

    FROM Items)))

    --UPDATE ID FIELD FOR ITEMS

    UPDATE Raw_Build_list

    SET ItemID = Items.ID

    FROM Items LEFT OUTER JOIN

    Raw_Build_list ON Items.Description = Raw_Build_list.Item_Description

    --NOW THIS WORKS BUT IS VERY, VERY SLOW __THERE HAS TO BE A BETTER WAY TO DO THIS

    INSERT INTO Build_list

    (ContractID, SectionID, ItemID, Qty)

    SELECT ContractID, SectionID, ItemID, Qty

    FROM Raw_Build_list

    WHERE ((RIGHT('000000' + LTRIM(RTRIM(ContractID)), 6) + RIGHT('000000' + LTRIM(RTRIM(SectionID)), 6) + RIGHT('000000' + LTRIM(RTRIM(ItemID)), 6)) NOT IN

    (SELECT RIGHT('000000' + LTRIM(RTRIM(ContractID)), 6) + RIGHT('000000' + LTRIM(RTRIM(SectionID)), 6) + RIGHT('000000' + LTRIM(RTRIM(ItemID)), 6)

    AS Joinid

    FROM Build_list))

    --The results of these two queries should be identical

    SELECT Contract_Description, Section_Description, Item_Description, Qty

    FROM Raw_Build_list

    SELECT Contracts.Contract_Description, Sections.Section, Items.Description, Build_list.Qty

    FROM Build_list LEFT OUTER JOIN

    Items ON Build_list.ItemID = Items.ID LEFT OUTER JOIN

    Sections ON Build_list.SectionID = Sections.ID LEFT OUTER JOIN

    Contracts ON Build_list.ContractID = Contracts.ID

  • Try changing

    INSERT INTO Build_list

    (ContractID, SectionID, ItemID, Qty)

    SELECT ContractID, SectionID, ItemID, Qty

    FROM Raw_Build_list

    WHERE ((RIGHT('000000' + LTRIM(RTRIM(ContractID)), 6) + RIGHT('000000' + LTRIM(RTRIM(SectionID)), 6) + RIGHT('000000' + LTRIM(RTRIM(ItemID)), 6)) NOT IN

    (SELECT RIGHT('000000' + LTRIM(RTRIM(ContractID)), 6) + RIGHT('000000' + LTRIM(RTRIM(SectionID)), 6) + RIGHT('000000' + LTRIM(RTRIM(ItemID)), 6)

    AS Joinid

    FROM Build_list))

    to

    INSERT INTO Build_list

    (ContractID, SectionID, ItemID, Qty)

    SELECT ContractID, SectionID, ItemID, Qty

    FROM Raw_Build_list

    EXCEPT

    SELECT ContractID, SectionID, ItemID, Qty

    FROM Build_list

    ____________________________________________________

    Deja View - The strange feeling that somewhere, sometime you've optimised this query before

    How to get the best help on a forum

    http://www.sqlservercentral.com/articles/Best+Practices/61537
  • Thanks Mark that is exactly what I'm looking for!

    Simple and fast!

    ๐Ÿ˜€

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply