• richkdcpud (2/11/2016)


    Evironment:

    SQL2012

    Virtual machine running VMWare and running on its own host

    Windows Server 2012 R2

    96GB Ram

    A developer wrote the following Stored Procedure to insert data into a table. He is seeing longer than expected execution times, on average about 5 seconds. The data is being read from an Excel file and the SP is being run for each record, about 2500 times. Do you think this is an unreasonable amount of time for the execution of the SP?

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    ALTER PROCEDURE [dbo].[Save_Hourly]

    @timestamp_utc as datetime,

    @timestamp_local as datetime,

    @hour_ending_local as int,

    @calculated_at as datetime,

    @account_id as int,

    @value as decimal(18,0)

    AS

    DECLARE @hourly_id integer

    DECLARE @last_calculated_at datetime

    DECLARE @last_value decimal(18,0)

    DECLARE @getdate-2 datetime

    SET @getdate-2 = GETDATE()

    -- check to see if a record already exists for the current utc timestamp

    SELECT @hourly_id = id, @last_value = value, @last_calculated_at = calculated_at FROM hourlies WHERE timestamp_utc = @timestamp_utc AND account_id = @account_id

    IF (@hourly_id IS NULL)

    BEGIN

    INSERT INTO [hourlies]

    (

    [timestamp_local],

    [account_id],

    [value],

    [created_at],

    [updated_at],

    [timestamp_utc],

    [hour_ending_local],

    [calculated_at]

    )

    VALUES

    (

    @timestamp_local,

    @account_id,

    @value,

    @getdate-2,

    @getdate-2,

    @timestamp_utc,

    @hour_ending_local,

    @calculated_at

    )

    SELECT SCOPE_IDENTITY()

    END

    ELSE IF ((@last_calculated_at IS NULL OR @last_calculated_at <= @calculated_at) AND @last_value != @value)

    BEGIN

    UPDATE [hourlies] SET

    [value] = @value,

    [updated_at] = @getdate-2,

    [hour_ending_local] = @hour_ending_local,

    [calculated_at] = @calculated_at

    WHERE [id] = @hourly_id

    SELECT @hourly_id

    END

    The performance part is almost certainly easy to resolve: you are either being blocked or more likely you are scanning a table to get the initial SELECT (and possibly a second time for any UPDATEs. There could also be triggers or un-indexed foreign keys I suppose, and you VM could be totally underpowered.

    BUT, the real problems here are worse than the perf issues. Firstly, it is just silly to take a spreadsheet and do UPSERTs to load the data one row at a time! Just bulk it in, 2 set based operations and the entire load is done in a few seconds flat, even with unindexed table scans. Worse is that you are open to the usual concurrency issues with the UPSERT mechanism chosen. In between the SELECT some other process could do something to delete the row found, or another process could insert a row that wasn't there a millisecond before. This is a classic pitfall in data processing. Hopefully you NEVER have concurrent activity when this load is happening.

    As a minor point I will add that SET NOCOUNT ON should be the first line of essentially every sproc in existence. And there is absolutely no error checking or transaction control to be seen here.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service