Slow running proc on SQL 2016 with execution plan

  • The stored proc is running very slow on 2016 as compared to 2019. Its doing some common validation from a temp table and inserting data in primary tables and mapping tables. The mapping tables have clustered indexes on primary keys {Table1Id, Table2Id, Table3Id}. The proc needs to process around 4mill records. It finishes execution in 2 hrs on SQL2019 but takes probably 6-8hrs on 2016 with the same data file. Execution plan is attached. Please help!

    Execution plan can be found here - https://www.brentozar.com/pastetheplan/?id=BybjNm5a8

    • This topic was modified 3 years, 10 months ago by  marcus.
    • This topic was modified 3 years, 10 months ago by  marcus.
    Attachments:
    You must be logged in to view attached files.
  • You're populating these tables

    dbo.Locations

    dbo.Individuals

    dbo.Households

    dbo.ListTypes

    dbo.HouseholdMailings

    from this table: dbo.HouseholdTemp

    You're working on one row at a time in a loop - this is why it's so slow. If you can rewrite it to do one pass of dbo.HouseholdTemp for each INSERT then you might get it down to two minutes from 2 - 8 hours.

     

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Thanks for the response ChrisM@Work. This query performs way better in SQL2019. Why its not possible to get similar or even close to that response in SQL2016.

    I need to get the identities from Locations and Individuals before inserting in Households and then only insert in ListType if the row does not exists and then finally insert in HouseholdMapping. There are some biz validations going in place so had to process each row. If there are better options please can you show me a code snippet of how that can be accomplished ?

     

     

    • This reply was modified 3 years, 10 months ago by  marcus.
    • This reply was modified 3 years, 10 months ago by  marcus.
  • Sure. I think the logic of this is about right - it's hard to tell because there are so many IF blocks in the code:

    INSERT INTO dbo.Locations (
    LocationGroupId,Address1,Address2,City,State,Zip10,ZipCode,DeliveryPointCode,FIPSCounty,CountyName,
    AddressType,PrimaryNumber,PreDirectional,StreetName,StreetSuffix,PostDirectional,UnitDesignator,
    UnitNumber,PMBNumber,CarrierRoute,PenetrationCode,MetroName,CreatedDate,ModifiedDate)
    -- OUTPUT here to capture LocationID if you wish
    SELECT
    LocationGroupId,Address1,Address2,City,State,Zip10,ZipCode,DeliveryPointCode,FIPSCounty,CountyName,
    AddressType,PrimaryNumber,PreDirectional,StreetName,StreetSuffix,PostDirectional,UnitDesignator,
    UnitNumber,PMBNumber,CarrierRoute,PenetrationCode,MetroName,GETDATE(), GETDATE()
    FROM dbo.HouseholdTemp ht
    WHERE ht.ProcessedDate IS NULL
    AND (
    (locationgroupid IS NULL OR locationgroupid = '')
    OR
    (NOT EXISTS (SELECT 1 FROM dbo.Locations l WHERE l.LocationGroupId = htlocationgroupid))
    )
    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Thanks for response. So if I understand it correct from the above ....

    one block of insert for locations and another for individuals

    INSERT into dbo.Locations ...SELECT ....FROM dbo.HouseholdTemp WHERE <conditions> -- Output dbo.Locations.Id

    INSERT into dbo.Individuals...SELECT ....FROM dbo.HouseholdTemp WHERE <conditions> -- Output dbo.Individuals.Id

    But then how can I map the Ids of Location to the Individuals following this bulk insert into the dbo.households and dbo.householdmailings table ?

     

  • marcus wrote:

    Thanks for response. So if I understand it correct from the above ....

    one block of insert for locations and another for individuals

    INSERT into dbo.Locations ...SELECT ....FROM dbo.HouseholdTemp WHERE <conditions> -- Output dbo.Locations.Id

    INSERT into dbo.Individuals...SELECT ....FROM dbo.HouseholdTemp WHERE <conditions> -- Output dbo.Individuals.Id

    But then how can I map the Ids of Location to the Individuals following this bulk insert into the dbo.households and dbo.householdmailings table ?

    Use the OUTPUT clause to output the new LocationID and the [existing row ID] from the source table HouseholdTemp into a #temp table. Same with the Individuals table. Then join these back to the source table on [existing row ID] for your inserts into households and householdmailings.

     

     

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • In your block above can I output the [existing row ID] from the source table (HouseholdTemp) without inserting it in the Locations or Individuals table ?

  • ChrisM@Work : took your suggestion and came up with a much better query plan. Execution was too fast within few minutes. I still got to validate the data 🙂 But can you eyeball the query for feedback

    SET NOCOUNT ON
    DECLARE @startptr INT= 0;
    DECLARE @batchsize INT = 100000;
    DECLARE @DataIngestionLogId INT= 0;

    SET @startptr= @startptr + @batchsize
    IF OBJECT_ID('dbo.LocationMap', 'U') IS NOT NULL DROP TABLE dbo.LocationMap;
    IF OBJECT_ID('dbo.IndividualMap', 'U') IS NOT NULL DROP TABLE dbo.IndividualMap;
    CREATE TABLE dbo.LocationMap (Id int not null identity(1,1), HouseholdTempId int NOT NULL, LocationId int NOT NULL)
    CREATE TABLE dbo.IndividualMap (Id int not null identity(1,1), HouseholdTempId int NOT NULL, IndividualId int NOT NULL)

    WHILE EXISTS
    (
    SELECT TOP 1 1
    FROM dbo.HouseholdTemp ht
    WHERE ht.ProcessedDate IS NULL
    --AND @startptr <= 1000000
    )
    BEGIN



    -- Locations
    INSERT INTO dbo.Locations
    (LocationGroupId,Address1, Address2, City, State, Zip10, ZipCode, DeliveryPointCode, FIPSCounty, CountyName, AddressType, PrimaryNumber, PreDirectional, StreetName, StreetSuffix, PostDirectional, UnitDesignator, UnitNumber, PMBNumber, CarrierRoute, PenetrationCode, MetroName, CreatedDate, ModifiedDate, HouseholdTempId)
    OUTPUT INSERTED.HouseholdTempId, INSERTED.Id INTO dbo.LocationMap(HouseholdTempId,LocationId)
    SELECT TOP @batchsize
    ht.LocationGroupId, ht.Address1, ht.Address2, ht.City, ht.State, ht.Zip10, ht.ZipCode, ht.DeliveryPointCode, ht.FIPSCounty, ht.CountyName, ht.AddressType, ht.PrimaryNumber, ht.PreDirectional, ht.StreetName, ht.StreetSuffix, ht.PostDirectional, ht.UnitDesignator, ht.UnitNumber, ht.PMBNumber, ht.CarrierRoute, ht.PenetrationCode, ht.MetroName, getdate(), getdate(), ht.Id
    FROM dbo.HouseholdTemp ht WHERE ht.ProcessedDate IS NULL --AND ht.Id <= @startptr

    -- Individuals
    INSERT INTO dbo.Individuals
    (IndividualGroupId, FirstName, LastName, DateOfBirth, EmailAddress, Gender, MaritalStatus, CreatedDate, ModifiedDate, HouseholdTempId)
    OUTPUT INSERTED.HouseholdTempId, INSERTED.Id INTO IndividualMap(HouseholdTempId,IndividualId)
    SELECT TOP @batchsize
    ht.IndividualGroupId, ht.FirstName, ht.LastName, CAST(ht.DateOfBirth AS DATE), ht.EmailAddress, ht.Gender, ht.MaritalStatus, GETDATE(), GETDATE(), ht.Id
    FROM dbo.HouseholdTemp ht WHERE ht.ProcessedDate IS NULL --AND ht.Id <= @startptr

    -- Households
    INSERT INTO dbo.Households (LocationId, IndividualId, CreatedDate, ModifiedDate)
    SELECT lm.LocationId, im.IndividualId, getdate(), getdate()
    FROM dbo.LocationMap lm
    INNER JOINdbo.IndividualMap im ON lm.HouseholdTempId= im.HouseholdTempId

    -- HouseholdMailings
    INSERT INTO dbo.HouseholdMailings (MailingId, IndividualId, LocationId, ListTypeId, FinderNumber, ModelRank, CreatedDate, ModifiedDate)
    SELECT m.Id AS MailingId, im.IndividualId, lm.LocationId, lt.Id AS ListTypeId, ht.FinderNumber, ht.ModelRank, getdate(), getdate()
    FROM dbo.LocationMap lm
    INNER JOIN dbo.HouseholdTemp ht ON lm.HouseholdTempId= ht.Id
    INNER JOIN dbo.IndividualMap im ON ht.Id = im.HouseholdTempId
    INNER JOIN dbo.Jobs j ON LTRIM(RTRIM(ht.JobNumber)) = LTRIM(RTRIM(j.Number))
    INNER JOIN dbo.MergedPanels mp ON TRIM(RTRIM(ht.MergedPanelCode)) = LTRIM(RTRIM(mp.Code))
    INNER JOIN dbo.Mailings m ON mp.Id = m.MergedPanelId AND j.Id = m.JobId
    INNER JOIN dbo.ListTypes lt ON LTRIM(RTRIM(ht.ListType)) = LTRIM(RTRIM(lt.Type)) AND LTRIM(RTRIM(ht.ListCode)) = LTRIM(RTRIM(lt.Code))

    UPDATE dbo.HouseholdTemp SET ProcessedDate = getdate() WHERE ProcessedDate IS NULL AND ID <= @startptr

    TRUNCATE TABLE LocationMap
    TRUNCATE TABLE IndividualMap

    SET @startptr = @startptr + @batchsize;

    END

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply