Help needed in Performance Tuning

  • Hi,

I am trying to build a scalable proc which does the computation logic. Currently I have 50k users, but I am thinking ahead and targeting this logic for 500k users.

    Sample table schema and test data

create table Comp_Detail(
    IDcompDetail int identity(1,1) primary key,
    IDcompAFR int,
    CompID int default 1050,
    UserId bigint,
    TransferAmount money,
    processStatus bit default 0,
    AmtTransferDate datetime);

--===== Create and populate the detail table on-the-fly
;WITH cteRowSource1 AS
(
SELECT TOP 500000
       N = ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
  FROM master.sys.all_columns ac1
 CROSS JOIN master.sys.all_columns ac2
)
INSERT INTO Comp_Detail(IDcompAFR, CompID, UserId, TransferAmount)
SELECT 1000
      ,1050
      ,UserId = ISNULL(N,0)
      ,TransferAmount = N*10
  FROM cteRowSource1;
-- select * from Comp_Detail

--===== Create and populate the balance table on-the-fly
create table User_bank(
    IDUserBank int identity(1,1) primary key,
    UserId bigint,
    Amount_Pend money,
    Amount_Available money,
    LastModDt datetime);

;WITH cteRowSource AS
(
SELECT TOP 500000
       N = ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
  FROM master.sys.all_columns ac1
 CROSS JOIN master.sys.all_columns ac2
)
INSERT INTO User_bank(UserId, Amount_Pend, Amount_Available)
SELECT UserId = ISNULL(N,0)
      ,Amount_Pend = N*10
      ,Amount_Available = N*2
  FROM cteRowSource;
-- select * from User_bank

update Comp_Detail set IDcompAFR = 1001 where IDcompDetail > 10000 and IDcompDetail < 20000;
update Comp_Detail set IDcompAFR = 1002 where IDcompDetail > 20000 and IDcompDetail < 30000;
update Comp_Detail set IDcompAFR = 1003 where IDcompDetail > 30000 and IDcompDetail < 40000;
update Comp_Detail set IDcompAFR = 1004 where IDcompDetail > 40000 and IDcompDetail < 50000;
update Comp_Detail set IDcompAFR = 1005 where IDcompDetail > 50000 and IDcompDetail < 60000;

I built my logic as below.

Declare @CompID int = 1050;
BEGIN
    -- Check if any data is available to be processed
    IF EXISTS (
        SELECT TOP 1 IDcompAFR
          FROM Comp_Detail
         WHERE CompID = @CompID
           AND coalesce(processStatus, 0) = 0
        )
    BEGIN
        BEGIN TRY
            -- Set it so if the first UPDATE fails, we won't even start the second UPDATE. This really says
            -- "If we're in a transaction and something fails, stop processing the transaction and do a rollback if we can".
            SET XACT_ABORT ON;

            -- Table variable to hold the actual data; used to get the IDcompAFR once the balance is updated
            DECLARE @ActualData TABLE (
                 UserId BIGINT
                ,IDcompAFR BIGINT
                ,ProcessingAmount MONEY
                );

            -- Table variable to capture the affected UserIds
            DECLARE @AffecedRecords TABLE (UserId BIGINT);

            BEGIN TRANSACTION;

            -- Get the whole data set to be processed.
            INSERT INTO @ActualData (UserId, IDcompAFR, ProcessingAmount)
            SELECT UserId
                  ,IDcompAFR
                  ,ProcessingAmount = COALESCE(TransferAmount, 0)
              FROM Comp_Detail
             WHERE CompID = @CompID
               AND coalesce(processStatus, 0) = 0;

            -- Aggregate the ProcessingAmount by UserId
            WITH AggregateData AS (
                SELECT UserId
                      ,ProcessingAmount = SUM(COALESCE(ProcessingAmount, 0))
                  FROM @ActualData
                 GROUP BY UserId
                )
            -- Do the amount update and capture the UserIds that are affected.
            UPDATE UB
               SET UB.Amount_Available = COALESCE(UB.Amount_Available, 0) + AD.ProcessingAmount
                  ,UB.Amount_Pend = COALESCE(UB.Amount_Pend, 0) - AD.ProcessingAmount
                  ,LastModDt = getdate()
            OUTPUT deleted.UserId INTO @AffecedRecords(UserId)
              FROM User_bank UB
             INNER JOIN AggregateData AD ON UB.UserId = AD.UserId;

            --===== Using the captured UserIds, get the IDcompAFR from the @ActualData table variable
            --      and then set processStatus = 1, meaning the OFR was processed for the trip.
            UPDATE Comp_Detail
               SET processStatus = 1
                  ,AmtTransferDate = getdate()
             WHERE IDcompAFR IN (
                    SELECT DISTINCT AD.IDcompAFR
                      FROM @ActualData AD
                     INNER JOIN @AffecedRecords AR ON (AD.UserId = AR.UserId)
                    )
               AND processStatus = 0;

            COMMIT TRANSACTION;
        END TRY
        BEGIN CATCH
            DECLARE @ErrorMessage NVARCHAR(4000);
            DECLARE @ErrorSeverity INT;
            DECLARE @ErrorState INT;

            SELECT @ErrorMessage = ERROR_MESSAGE()
                  ,@ErrorSeverity = ERROR_SEVERITY()
                  ,@ErrorState = ERROR_STATE();

            ROLLBACK TRANSACTION;

            RAISERROR (@ErrorMessage, @ErrorSeverity, @ErrorState);
        END CATCH;
    END
END
GO

The query logic takes 20+ minutes and keeps on running. I'm not sure what mistake I made. Any suggestions to improve the speed of this query?

  • First, thank you for taking the time to build the test data. That will really help.

It would be helpful if, instead of leaving us to figure out what the overall goal of your code is, you gave us an overview of the requirements you're trying to solve.

You're using table variables to handle a couple of half-million-row tables. That, in itself, will be a performance problem because SQL Server will not create statistics on such things, nor can you directly create any useful indexes on them. Only the indexes automatically created to support unique constraints can be realized.
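Just as a sketch (reusing the column list from your @ActualData table variable; the index choice here is only an illustration, not a recommendation for your exact workload), a temp table does get statistics and can carry an explicit index:

CREATE TABLE #ActualData (
     UserId BIGINT
    ,IDcompAFR BIGINT
    ,ProcessingAmount MONEY
    );

-- An explicit index like this isn't possible on a table variable
-- (short of a PRIMARY KEY or UNIQUE constraint):
CREATE CLUSTERED INDEX IX_ActualData_UserId ON #ActualData (UserId);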

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Thank you Jeff,

From your comments, on which column should I create an index?

Also, the requirement is: I have a process which runs once a month; it takes records from the Comp_Detail table, calculates the amount to be applied to the User_bank table for each user, and sets processStatus = 1 for the amounts that were applied.

If a table variable isn't a fit for this logic, then what should I do? It would be great if you could help me change the logic.

I even tried creating these indexes:

CREATE NONCLUSTERED INDEX CI_compID ON [dbo].[Comp_Detail] ([CompID]);
CREATE NONCLUSTERED INDEX CI_processStatus ON [dbo].[Comp_Detail] ([processStatus]);

For 100k records it took 6 seconds, 200k took 13 seconds, 300k took 29 seconds, 400k took 42 seconds, and 500k took 54 seconds.

Is there any other way I could still improve the speed?

  • Something is going on here. Using your test data and your code as you posted it, the code runs in 23 seconds. Is it this test code that you posted that's taking more than 20 minutes?

    --Jeff Moden



  • Hi Jeff,

Thanks for your reply.

I don't know what happened. Initially it took 23 minutes, but now 500k takes 54 seconds. Is there any way to improve this, please?

  • Hi Jeff,

Quick update: I removed the table variables and used temp tables instead. It took 31 seconds. Do you have any suggestions to speed it up further?

My latest try:

Declare @CompID int = 1050;
BEGIN
    -- Check if any data is available to be processed
    IF EXISTS (
        SELECT TOP 1 IDcompAFR
          FROM Comp_Detail
         WHERE CompID = @CompID
           AND coalesce(processStatus, 0) = 0
        )
    BEGIN
        BEGIN TRY
            -- Set it so if the first UPDATE fails, we won't even start the second UPDATE. This really says
            -- "If we're in a transaction and something fails, stop processing the transaction and do a rollback if we can".
            SET XACT_ABORT ON;

            -- Create tables to remember the rows we process.
            IF OBJECT_ID('tempdb..#ActualData') IS NOT NULL
                DROP TABLE #ActualData;
            IF OBJECT_ID('tempdb..#AffecedRecords') IS NOT NULL
                DROP TABLE #AffecedRecords;

            -- Temp table to hold the actual data; used to get the IDcompAFR once the balance is updated
            CREATE TABLE #ActualData (
                 UserId BIGINT
                ,IDcompAFR BIGINT
                ,ProcessingAmount MONEY
                );

            -- Temp table to capture the affected UserIds
            CREATE TABLE #AffecedRecords (UserId BIGINT);

            BEGIN TRANSACTION;

            -- Get the whole data set to be processed.
            INSERT INTO #ActualData (UserId, IDcompAFR, ProcessingAmount)
            SELECT UserId
                  ,IDcompAFR
                  ,ProcessingAmount = COALESCE(TransferAmount, 0)
              FROM Comp_Detail
             WHERE CompID = @CompID
               AND coalesce(processStatus, 0) = 0;

            -- Aggregate the ProcessingAmount by UserId
            WITH AggregateData AS (
                SELECT UserId
                      ,ProcessingAmount = SUM(COALESCE(ProcessingAmount, 0))
                  FROM #ActualData
                 GROUP BY UserId
                )
            -- Do the balance update and capture the UserIds that are affected.
            UPDATE UB
               SET UB.Amount_Available = COALESCE(UB.Amount_Available, 0) + AD.ProcessingAmount
                  ,UB.Amount_Pend = COALESCE(UB.Amount_Pend, 0) - AD.ProcessingAmount
                  ,LastModDt = getdate()
            OUTPUT deleted.UserId INTO #AffecedRecords(UserId)
              FROM User_bank UB
             INNER JOIN AggregateData AD ON UB.UserId = AD.UserId;

            --===== Using the captured UserIds, get the IDcompAFR from the #ActualData temp table
            --      and then set processStatus = 1, meaning the OFR was processed for the trip.
            UPDATE Comp_Detail
               SET processStatus = 1
                  ,AmtTransferDate = getdate()
             WHERE IDcompAFR IN (
                    SELECT DISTINCT AD.IDcompAFR
                      FROM #ActualData AD
                     INNER JOIN #AffecedRecords AR ON (AD.UserId = AR.UserId)
                    )
               AND processStatus = 0;

            COMMIT TRANSACTION;
        END TRY
        BEGIN CATCH
            DECLARE @ErrorMessage NVARCHAR(4000);
            DECLARE @ErrorSeverity INT;
            DECLARE @ErrorState INT;

            SELECT @ErrorMessage = ERROR_MESSAGE()
                  ,@ErrorSeverity = ERROR_SEVERITY()
                  ,@ErrorState = ERROR_STATE();

            -- Roll back before touching the temp tables: with SET XACT_ABORT ON the failed
            -- transaction is doomed, and the DROPs would themselves fail inside it.
            IF @@TRANCOUNT > 0
                ROLLBACK TRANSACTION;

            IF OBJECT_ID('tempdb..#ActualData') IS NOT NULL
                DROP TABLE #ActualData;
            IF OBJECT_ID('tempdb..#AffecedRecords') IS NOT NULL
                DROP TABLE #AffecedRecords;

            RAISERROR (@ErrorMessage, @ErrorSeverity, @ErrorState);
        END CATCH;
    END
END
GO
-- select * from Comp_Detail
-- select * from User_bank

  • I think this does basically the same thing but runs in about 6 seconds on my SQL 2012 box.

Declare @CompID int = 1050
       ,@ProcessingDate DATETIME = GETDATE();
BEGIN
    -- Check if any data is available to be processed
    IF EXISTS (
        SELECT 1
          FROM Comp_Detail
         WHERE CompID = @CompID
           AND coalesce(processStatus, 0) = 0
        )
    BEGIN
        BEGIN TRY
            BEGIN TRANSACTION;

            -- Aggregate the ProcessingAmount by UserId
            WITH AggregateData AS (
                SELECT UserId
                      ,ProcessingAmount = SUM(COALESCE(TransferAmount, 0))
                  FROM Comp_Detail
                 WHERE CompID = @CompID
                   AND coalesce(processStatus, 0) = 0
                 GROUP BY UserId
                )
            -- Do the amount update
            UPDATE UB
               SET UB.Amount_Available = COALESCE(UB.Amount_Available, 0) + AD.ProcessingAmount
                  ,UB.Amount_Pend = COALESCE(UB.Amount_Pend, 0) - AD.ProcessingAmount
                  ,LastModDt = @ProcessingDate
              FROM User_bank UB
             INNER JOIN AggregateData AD ON UB.UserId = AD.UserId;

            UPDATE a
               SET processStatus = 1
                  ,AmtTransferDate = @ProcessingDate
              FROM Comp_Detail a
              JOIN User_bank b ON a.Userid = b.Userid
             WHERE a.CompID = @CompID
               AND coalesce(a.processStatus, 0) = 0
               AND b.LastModDt = @ProcessingDate;

            COMMIT TRANSACTION;
        END TRY
        BEGIN CATCH
            DECLARE @ErrorMessage NVARCHAR(4000);
            DECLARE @ErrorSeverity INT;
            DECLARE @ErrorState INT;

            SELECT @ErrorMessage = ERROR_MESSAGE()
                  ,@ErrorSeverity = ERROR_SEVERITY()
                  ,@ErrorState = ERROR_STATE();

            ROLLBACK TRANSACTION;

            RAISERROR (@ErrorMessage, @ErrorSeverity, @ErrorState);
        END CATCH;
    END
END

    A couple of notes:

- My logic is based on what I think is going on. Your original version wouldn't finish in the time I was willing to wait, so I couldn't check the final results.

- This bit in the WHERE clause (both of my queries)

coalesce(a.processStatus, 0) = 0

is not SARGable, so that could be improved on by making the processStatus column NOT NULL (see the sketch after these notes).

- Indexing might also help a bit, but since 6 seconds is already quite a bit faster than the 24 I'm hearing above, maybe that will take care of it for you. Assuming my logic is correct, that is.
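If you do make the column NOT NULL, a minimal sketch (assuming the existing NULLs in processStatus really do mean "not processed") would be:

-- Backfill the NULLs, then tighten the column (assumes NULL means "not processed")
UPDATE dbo.Comp_Detail SET processStatus = 0 WHERE processStatus IS NULL;
ALTER TABLE dbo.Comp_Detail ALTER COLUMN processStatus bit NOT NULL;

After that, the predicate becomes a plain processStatus = 0, which an index on that column can seek on.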


My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.

  • According to the Execution Plan, the following INDEX would help the second query in my batch.

CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>]
ON [dbo].[User_bank] ([UserId])
INCLUDE ([IDUserBank],[Amount_Pend],[Amount_Available]);

    Edit: But in fact with that INDEX the query ran in 15 seconds.



  • Hi Dwain,

Thanks a lot for the reply and for shortening the logic. I forgot to mention that the processStatus column is NOT NULL, so I removed the coalesce.

But when I executed the script, it took 23 seconds. This might be because I am running the script remotely. I will try this logic on my server box and post the result.

Finally, a quick update:

My assumption helped me a little.

I ran dwain's suggestion on my production box and it ran in 10 seconds.

I tried my logic as well and it took 11 seconds.

The reason I got 28 seconds earlier is that I was running the query over a VPN connection. Thanks, dwain, for your help.

Any more suggestions are also welcome. I would like to leave this thread open, as I am hungry to reduce the execution time.

  • KGJ-Dev (1/25/2015)


Finally, a quick update:

My assumption helped me a little.

I ran dwain's suggestion on my production box and it ran in 10 seconds.

I tried my logic as well and it took 11 seconds.

The reason I got 28 seconds earlier is that I was running the query over a VPN connection. Thanks, dwain, for your help.

Any more suggestions are also welcome. I would like to leave this thread open, as I am hungry to reduce the execution time.

    Glad to hear it helped you out, whatever the reason.

    So are you saying that the work I removed was unnecessary and that my logic was correct?

BTW, the thing I did with storing off a GETDATE() for reuse is something I do quite often to ensure that, if multiple tables are to be impacted by a single transaction, they all get the same date/time applied. I like consistency like that.
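In skeleton form it looks like this (the WHERE filters here are hypothetical, just to make the point against your test tables):

DECLARE @Now datetime = GETDATE();  -- captured once, reused everywhere

UPDATE dbo.User_bank   SET LastModDt       = @Now WHERE Amount_Pend < 0;     -- hypothetical filter
UPDATE dbo.Comp_Detail SET AmtTransferDate = @Now WHERE processStatus = 1;   -- hypothetical filter
-- Both tables now carry the identical timestamp for this run.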



  • Hi Dwain,

Your workout was really helpful, and of course I will be switching to your sample, as there is no need for temp tables to hold the data. I was being extra cautious in trying to get the IDcompAFR from the temp table. By the way, your solution is clean and short.

I will test it with multiple scenarios and start using your logic.

Thank you so much for your time on this.

  • dwain.c (1/25/2015)

    coalesce(a.processStatus, 0) = 0

    is not SARGable, so that could be improved on by making the processStatus column NOT NULL.

    It's much better to code it as:

    (a.processStatus is null or a.processStatus = 0)

    When an index is available, SQL can still do a seek for the code above.

    In short, the rule is:

    NEVER use ISNULL/COALESCE in a WHERE or JOIN clause; you can always code around it.
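The same trick works in a JOIN. For example (the table and column names here are made up purely for illustration):

-- Instead of this (kills index seeks):
--     ON ISNULL(a.SomeCol, -1) = ISNULL(b.SomeCol, -1)
-- code around it like this:
SELECT a.KeyCol
  FROM dbo.TableA a
 INNER JOIN dbo.TableB b
    ON (a.SomeCol = b.SomeCol
        OR (a.SomeCol IS NULL AND b.SomeCol IS NULL));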

SQL DBA, SQL Server MVP (07, 08, 09). A socialist is someone who will give you the shirt off *someone else's* back.

  • ScottPletcher (1/26/2015)


    dwain.c (1/25/2015)

    coalesce(a.processStatus, 0) = 0

    is not SARGable, so that could be improved on by making the processStatus column NOT NULL.

    It's much better to code it as:

    (a.processStatus is null or a.processStatus = 0)

    When an index is available, SQL can still do a seek for the code above.

    In short, the rule is:

    NEVER use ISNULL/COALESCE in a WHERE or JOIN clause; you can always code around it.

I mentioned it thinking that ultimately he'd need to resort to some kind of indexing to speed this up. I probably should have mentioned that as well.



  • Hi Scott,

Thanks for your suggestion. I already removed the coalesce because the processStatus column is NOT NULL, as I mentioned in a previous post.
