Cursor Insert into table with IDENTITY crisis

  • I have been coming at this problem from several different angles, off and on, for a few weeks now.  I am now fast approaching a project deadline, so I am swallowing my pride and asking for help.  Here is the setup:

    1. Query data from two different tables (ssf_widget_history, ssf_widget_checkpoint) that share a PK and FK between them, via linked server (A).  (No problem)
    2. Take that data from server A (source) and insert it into the (almost) matching tables on server B (destination).  
    3. The insert requires that you generate a new unique identity (checkpoint_id) in the table 'ssf_widget_checkpoint'.  As you are inserting the data from server A into that new identity row, you must also insert the data captured from server A into 'ssf_widget_history', while maintaining the relationship between the legacy data, as you insert it into both your destination tables on server B.

    My biggest issue now is that I'm getting duplicate key errors.  I suspect that I am reusing the same identity value (checkpoint_id) for every row.
    I hope I'm making sense here.  My brain is fried.  I don't do complex coding very often, so I typically forget a lot of what I used to know.  🙂
    Any help would be appreciated.  I am not opposed to breaking this out into individual stored procedures, etc.  As I mentioned, I've tried many approaches.

    USE realtime
    GO

    SET NOCOUNT ON

    BEGIN
    -- create temp table for cassette history 
        IF OBJECT_ID (N'tempdb..#ssf_widget_history_acme', N'U') IS NOT NULL
            DROP TABLE #ssf_widget_history_acme;

    CREATE TABLE #ssf_widget_history_acme
    (
        checkpoint_id       BIGINT,
        aa_server_nr        INT,
        widget_id           VARCHAR(20),
        item_count          INT,
        item_diverted       INT,
        item_value          NUMERIC(19,0),
        type_id             INT,
        item_begin_count    INT,
        replenishment_op    VARCHAR(15),
        item_deposited      INT,
        term_id             CHAR(8),
        date_time           DATETIME,
        checkpoint_type     INT
    )
    -- populate acme cassette history table from USP

    DECLARE @term_id    CHAR(8)
    SET @term_id = 'CT000001' -- will need a cursor to run through a acme list/table

    INSERT INTO #ssf_widget_history_acme
        SELECT
         h.checkpoint_id,
         '0',
         h.widget_id,
         h.item_count,
         h.item_diverted,
         h.item_value,
         h.type_id,
         h.item_begin_count,
         NULL,
         '0',
         c.term_id,
         c.date_time,
         c.checkpoint_type
        FROM ITXETRT01.postilion.dbo.ssf_widget_history h,
             ITXETRT01.postilion.dbo.ssf_widget_checkpoint c
        WHERE h.checkpoint_id = c.checkpoint_id
        AND c.term_id = @term_id
        AND (h.widget_id <> 'RET' AND h.type_id = '1')
        ORDER BY h.checkpoint_id

    --select * from #ssf_widget_history_acme

    --===============================================================================================================================
    -- Step 1 Delete existing media cassette history
    DELETE ssf_widget_history
    WHERE checkpoint_id IN
    (SELECT checkpoint_id FROM ssf_widget_checkpoint(NOLOCK)
    WHERE term_id = @term_id)

    -- Step 2 Delete existing media checkpoints
    DELETE ssf_widget_checkpoint
    WHERE term_id = @term_id

    -- variables for media checkpoint insert
    DECLARE @checkpoint_id        BIGINT
    DECLARE @date_time            DATETIME
    DECLARE @checkpoint_type    INT
    DECLARE    @aa_server_nr   INT
    --  @term_id already declared above

    -- variables for cassette history insert
    DECLARE @widget_id            VARCHAR(20)
    DECLARE @item_count            INT
    DECLARE @item_diverted        INT
    DECLARE @item_deposited        INT
    DECLARE @item_value            NUMERIC(19,0)
    DECLARE @type_id            INT
    DECLARE @item_begin_count    INT
    DECLARE @replenishment_op    VARCHAR(15)

    -- start cursor script
    DECLARE @rowcount_insert_total INT
    SET @rowcount_insert_total = 0

    -- declare cursor for inserts
    DECLARE #acme_history_insert
    CURSOR FOR
        SELECT
        checkpoint_id,
        aa_server_nr,
        widget_id,
        item_count,
        item_diverted,
        item_value,
        type_id,
        item_begin_count,
        replenishment_op,
        item_deposited,
        term_id,
        date_time,
        checkpoint_type
        FROM #ssf_widget_history_acme
        WHERE term_id = @term_id

    OPEN #acme_history_insert
        FETCH NEXT FROM #acme_history_insert
        INTO
        @checkpoint_id,
        @aa_server_nr,
        @widget_id,
        @item_count,
        @item_diverted,
        @item_value,
        @type_id,
        @item_begin_count,
        @replenishment_op,
        @item_deposited,
        @term_id,
        @date_time,
        @checkpoint_type

    -- media checkpoint insert
    BEGIN TRANSACTION cts_cfg

        SELECT
        @checkpoint_id = MAX(checkpoint_id)
        FROM ssf_widget_checkpoint WITH (NOLOCK)

        IF @checkpoint_id IS NULL
        BEGIN
            SET @checkpoint_id = 0
        END

        SET @checkpoint_id = @checkpoint_id+1
        IF @checkpoint_id = 9223372036854775807
        BEGIN
            SET @checkpoint_id = 1
        END

        BEGIN TRY
            INSERT INTO
                ssf_widget_checkpoint WITH (ROWLOCK)
                (
                checkpoint_id,
                aa_server_nr,
                term_id,
                date_time,
                checkpoint_type
                )
            VALUES
             (
                @checkpoint_id,
                @aa_server_nr,
                @term_id,
                @date_time,
                @checkpoint_type
                )
        END TRY
        BEGIN CATCH
        EXECUTE ctseft.ctseft_sp_tm_print_error
        ROLLBACK TRANSACTION cts_cfg
        END CATCH

    -- Cassette history insert statement here
    --start insert statement here
        
            BEGIN TRY
            INSERT INTO
                ssf_widget_history WITH (ROWLOCK)
                (
                checkpoint_id,
                aa_server_nr,
                widget_id,
                item_count,    
                item_diverted,
                item_deposited,
                item_value,
                type_id,
                item_begin_count,
                replenishment_op
                )
            SELECT
                @checkpoint_id,
                @aa_server_nr,
                @widget_id,
                @item_count,
                @item_diverted,
                @item_deposited,
                @item_value,
                @type_id,
                @item_begin_count,
                @replenishment_op
            FROM #ssf_widget_history_acme
            WHERE term_id = @term_id
            END TRY
                BEGIN CATCH
                    EXECUTE ctseft.ctseft_sp_tm_print_error
                END CATCH
                ROLLBACK TRANSACTION cts_cfg
                    CLOSE #acme_history_insert
                    PRINT '#acme_history_insert CLOSED'
                    DEALLOCATE #acme_history_insert
                    PRINT '#acme_history_insert DEALLOCATED'
    --        COMMIT TRANSACTION cts_cfg    

            PRINT '@term_id checkpoint inserted =' + @term_id + ', ' + CONVERT(VARCHAR, @date_time) + ', ' + CONVERT(VARCHAR, @checkpoint_type) + '='
            PRINT 'ROWCOUNT TOTAL = ' + CONVERT(VARCHAR, @rowcount_insert_total)
            WAITFOR DELAY '00:00:00:001'
            
            
            FETCH NEXT FROM #acme_history_insert
            INTO
            @checkpoint_id,
            @aa_server_nr,
            @widget_id,
            @item_count,
            @item_diverted,
            @item_value,
            @type_id,
            @item_begin_count,
            @replenishment_op,
            @item_deposited,
            @term_id,
            @date_time,
            @checkpoint_type
    END    
            CLOSE #acme_history_insert
            PRINT '#acme_history_insert CLOSED'
            DEALLOCATE #acme_history_insert
            PRINT '#acme_history_insert DEALLOCATED'
    GO

    Here is the Error output:
    CTSEFT Term Mgmt: Error 2627, Severity 14, State 1, Procedure -, Line 169
    Violation of PRIMARY KEY constraint 'p_ssfmedcasshist'. Cannot insert duplicate key in object 'dbo.ssf_widget_history'. The duplicate key value is (2047254, 0, 1).
    #acme_history_insert CLOSED
    #acme_history_insert DEALLOCATED
    ROWCOUNT TOTAL = 0

  • 1) WITH (NOLOCK) allows for dirty reads. So you could have other processes running that are creating conflicts. Concurrency is a tough thing to deal with.

    2) I don't see a need for a cursor at all here either. You can use ROW_NUMBER to get a sequence to add to the MAX on insert.

    3) Why are you naming a cursor with a #?

    4) If you DO use a cursor, it should be declared FAST_FORWARD for efficiency. Still WAY slower than a set-based solution though.

    5) You can also do the history insert in the same step as the other insert using the OUTPUT clause.
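
    To make (2) and (5) concrete, here is a rough, untested sketch of a set-based version, using the temp table and column names from your post.  It builds an old-id-to-new-id mapping with ROW_NUMBER rather than OUTPUT, since the new ids are computed up front; the lock hints are my assumption about how to keep the MAX read safe until the COMMIT:

    DECLARE @base_id BIGINT

    BEGIN TRANSACTION

        -- hold the lock from the MAX read until the inserts commit,
        -- so no other session can hand out the same values in between
        SELECT @base_id = ISNULL(MAX(checkpoint_id), 0)
        FROM ssf_widget_checkpoint WITH (UPDLOCK, HOLDLOCK)

        -- one new checkpoint_id per legacy checkpoint_id
        SELECT  checkpoint_id AS old_checkpoint_id,
                @base_id + ROW_NUMBER() OVER (ORDER BY checkpoint_id) AS new_checkpoint_id,
                aa_server_nr, term_id, date_time, checkpoint_type
        INTO    #checkpoint_map
        FROM    (SELECT DISTINCT checkpoint_id, aa_server_nr, term_id, date_time, checkpoint_type
                 FROM #ssf_widget_history_acme) AS src

        INSERT INTO ssf_widget_checkpoint
            (checkpoint_id, aa_server_nr, term_id, date_time, checkpoint_type)
        SELECT new_checkpoint_id, aa_server_nr, term_id, date_time, checkpoint_type
        FROM #checkpoint_map

        -- the history rows keep their relationship to the checkpoints via the mapping
        INSERT INTO ssf_widget_history
            (checkpoint_id, aa_server_nr, widget_id, item_count, item_diverted,
             item_deposited, item_value, type_id, item_begin_count, replenishment_op)
        SELECT  m.new_checkpoint_id, h.aa_server_nr, h.widget_id, h.item_count, h.item_diverted,
                h.item_deposited, h.item_value, h.type_id, h.item_begin_count, h.replenishment_op
        FROM #ssf_widget_history_acme h
        JOIN #checkpoint_map m ON m.old_checkpoint_id = h.checkpoint_id

    COMMIT TRANSACTION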

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Thank you for the response.  I do appreciate it!  Just to give my pointless replies. 🙂

    1) Dirty reads are fine in this instance.  The data being read will not change (no new input), while the table is being updated/inserted into constantly.  Risking locks would cause performance issues for the application using the database.

    2) That's actually a good idea!

    3) I know, it's a bad habit.  It can get confusing with temp tables.  The guy I learned from did it this way and would re-write my code, so I just started doing it his way.  He's recently gone, so....

    4) Again, learned bad habit.  🙂

    5) Now this!  This is money!  I will try this.  Thanks again.

    This is a one time data migration for a one time retrieval by a very demanding tyrant of a customer.  So my failsafe is to just create a dump table and have their queries redirected to that legacy data on the new server.  Once that's done, it will vanish like a ghost..  >:-)

  • TriggerMan - Saturday, March 4, 2017 9:35 PM

    Thank you for the response.  I do appreciate it!  Just to give my pointless replies. 🙂

    1) Dirty reads are fine in this instance.  The data being read will not change (no new input), while the table is being updated/inserted into constantly.  Risking locks would cause performance issues for the application using the database.

    But you are missing the point. Not HOLDING A LOCK on that table from the time you get the MAX until you do your subsequent INSERT means any other process (those constant UPDATEs/INSERTs) can create a duplicate key.

    BTW, having to get a MAX to increment to do a counter is a HORRIBLE construct.
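
    For what it's worth, if the MAX + 1 pattern has to stay for now, the minimum change is to make that read take and hold a lock until the transaction commits - something along these lines (the hint choice is my assumption, untested):

    -- inside BEGIN TRANSACTION ... COMMIT, in place of the NOLOCK read
    SELECT @checkpoint_id = ISNULL(MAX(checkpoint_id), 0) + 1
    FROM ssf_widget_checkpoint WITH (UPDLOCK, HOLDLOCK)
    -- UPDLOCK/HOLDLOCK are held until COMMIT, so a concurrent session running
    -- the same code blocks instead of grabbing the same value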

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • "Without HOLDING A LOCK on that table when you get the MAX (until you do your subsequent INSERT) means any other process (those constant UPDATE/INSERTs) can create a duplicate key."
    That makes sense...forest for the trees.  That's why I posted here.

    "BTW, having to get a MAX to increment to do a counter is a HORRIBLE construct." 
    I assumed the developers of the product to be smarter than myself, so I followed their lead.  I understand it doesn't make it right.  So what is the perfect way to grab the MAX value?

  • TriggerMan - Monday, March 6, 2017 7:40 AM

    "Without HOLDING A LOCK on that table when you get the MAX (until you do your subsequent INSERT) means any other process (those constant UPDATE/INSERTs) can create a duplicate key."
    That makes sense...forest for the trees.  That's why I posted here.

    "BTW, having to get a MAX to increment to do a counter is a HORRIBLE construct." 
    I assumed the developers of the product to be smarter than myself, so I followed their lead.  I understand it doesn't make it right.  So what is the perfect way to grab the MAX value?

    Or should I ask, what's the perfect way to increment without using the MAX value?

  • Do you have any control over the structure of the ssf_widget_history table?  If so, why not just put an identity property on the checkpoint_id column, and let the table do the incrementing for you?
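
    If you do have that control, it would mean rebuilding the table, since you can't ALTER an existing column to add the identity property.  Purely a hypothetical sketch (shown on ssf_widget_checkpoint, where the new id actually originates, trimmed to the columns in this thread, and assuming you can take an outage and swap the tables afterwards):

    CREATE TABLE dbo.ssf_widget_checkpoint_new
    (
        checkpoint_id    BIGINT IDENTITY(1,1) NOT NULL PRIMARY KEY,
        aa_server_nr     INT,
        term_id          CHAR(8),
        date_time        DATETIME,
        checkpoint_type  INT
    );

    -- carry the existing ids across; the identity seed then continues from the highest value inserted
    SET IDENTITY_INSERT dbo.ssf_widget_checkpoint_new ON;

    INSERT INTO dbo.ssf_widget_checkpoint_new
        (checkpoint_id, aa_server_nr, term_id, date_time, checkpoint_type)
    SELECT checkpoint_id, aa_server_nr, term_id, date_time, checkpoint_type
    FROM dbo.ssf_widget_checkpoint;

    SET IDENTITY_INSERT dbo.ssf_widget_checkpoint_new OFF;

    -- then drop/rename to swap the new table in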

    John

  • John Mitchell-245523 - Monday, March 6, 2017 7:55 AM

    Do you have any control over the structure of the ssf_widget_history table?  If so, why not just put an identity property on the checkpoint_id column, and let the table do the incrementing for you?

    John

    That's one way, although it will require application code changes to anything that does an INSERT into the table. A SEQUENCE would work too (and would require code changes too).
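
    For reference, a SEQUENCE version might look roughly like this (SQL Server 2012 or later; the seed and the inserted values are purely illustrative):

    CREATE SEQUENCE dbo.seq_checkpoint_id
        AS BIGINT
        START WITH 1        -- in practice, seed this from the current MAX(checkpoint_id) + 1
        INCREMENT BY 1;
    GO

    -- every caller gets a distinct value, so there is no MAX + 1 race to protect
    INSERT INTO dbo.ssf_widget_checkpoint
        (checkpoint_id, aa_server_nr, term_id, date_time, checkpoint_type)
    VALUES
        (NEXT VALUE FOR dbo.seq_checkpoint_id, 0, 'CT000001', GETDATE(), 1);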

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • TriggerMan - Friday, March 3, 2017 3:00 PM

    >> 3. The insert requires that you generate a new unique identity (checkpoint_id) in the table 'ssf_widget_checkpoint'. As you are inserting the data from server A into that new identity row [sic] , you must also insert the data captured from server A into 'ssf_widget_history', while maintaining the relationship between the legacy data, as you insert it into both your destination tables on server B. <<

    Based on only what you've given us, my first impulse is to tell you to look at the CREATE SEQUENCE statement. This is not like the IDENTITY table property (no, it's not a column and never can be). This will give you a sequence of unique identifiers that could actually have some meaning in the data model, as opposed to being a physical property of storage in the table.

    I assume that you actually know your #ssf_widget_history_acme is not a table at all and can never be a table, because everything is nullable and it has no primary key.

    The minute we see things like "will need a cursor to run through a acme list/table" we do get sick. Oh yeah. Do you also post recipes for fried babies on vegan websites? 🙂

    Personally, I get sick (Walmart cannibalism sick) over crap like "type_id", because ISO 11179 tells us that "_type" and "_id" are attribute properties and must be the postfix of an attribute name. We don't know what the type is, and we don't know what entity is being identified; it's just a pile of adjectives without a noun, and it really does look that stupid to a data modeler.

    Other posters have hit you on some other design flaws you have in this model. Basically, you are using SQL to mimic how you would have done this in 1950s COBOL file systems, with procedural code.

    Please post DDL and follow ANSI/ISO standards when asking for help. 

  • Not to pick a fight, but what if his company doesn't follow ISO 11179 naming conventions?  Or the third-party software he purchased doesn't follow them?  Heck, the ISO 11179 standards don't even mention SQL Server.  ISO 11179 mentions databases, but most of what it says about them concerns metadata.  A quick check through his code shows that @type_id is an int and he assigned it the value of type_id.  Pretty easy to tell that type_id is an int.
    I do agree readability makes it nicer to have it named something like "intTypeID", but we don't always get the benefit of that with inherited systems.

    And isn't #ssf_widget_history_acme a table even without a primary key?  I've seen a lot of tables without primary keys or indexes - they are called heaps.  It is still a table.

    Microsoft doesn't have any naming convention standards and as far as I can tell, each company I've worked for defines their own standards.  

    That being said, I'll get back on topic.  I would avoid using nolock if at all possible.  You can end up with duplicate results or missing results using it.  Also, is the WITH (ROWLOCK) required?
    Last thing - the second FETCH from the cursor feels mostly pointless.  You do not have it in a loop (that I see... but I could be blind), so only the first row from the cursor is ever going to be used, no?

    My guess about the error is that you are not committing your transaction, so your changes are not actually being written permanently to the database.  You will also get better performance if you don't use implicit datatype conversion.  I am referring to the "h.type_id = '1'" part.  Since we know type_id is an INT, you could replace that with "h.type_id = 1".

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there are negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • Since this is a SQL 2008 Forum, I assume that you are using SQL 2008.  In this case a SEQUENCE object will not work.

    You could use a watermark table and simulate a sequence object.

    create table dbo.Watermark (
      WhatForID   int         not null primary key clustered
    , WhatFor     varchar(50) not null
    , NextNumber  int         not null
    );
    GO

    insert into dbo.Watermark ( WhatForID, WhatFor, NextNumber )
    values ( 1, 'Employee Number', 327 )
      , ( 2, 'Invoice Number', 41685 );
    GO

    create procedure dbo.GetNextNumber
      @WhatForID int
    , @Increment int = 1
    , @NextNum int output
    as
    begin
    set nocount on;

    -- Use QuirkyUpdate to get the next number
    UPDATE dbo.Watermark
    SET @NextNum = NextNumber = NextNumber + @Increment
    FROM dbo.Watermark
    where WhatForID = @WhatForID;

    end;
    GO

    Usage for single number increments

    declare @NextEmployeeNumber int;
    declare @NextInvoiceNumber int;

    exec dbo.GetNextNumber @WhatForID=1, @Increment=1, @NextNum=@NextEmployeeNumber output;
    exec dbo.GetNextNumber @WhatForID=2, @Increment=1, @NextNum=@NextInvoiceNumber output;

    select @NextEmployeeNumber AS NextEmployeeNumber
      , @NextInvoiceNumber AS NextInvoiceNumber

    Usage for range increments

    declare @MaxEmployeeNumber int;
    declare @NumberCount int = 5;

    exec dbo.GetNextNumber @WhatForID=1, @Increment=@NumberCount, @NextNum=@MaxEmployeeNumber output;

    select @MaxEmployeeNumber -@NumberCount +1 AS StartEmployeeNumber
      , @MaxEmployeeNumber AS EndEmployeeNumber

  • bmg002 nailed it!  99% of the naming conventions and code are from the software vendor.  Did you ever hear the phrase "changing horses midstream"?  If I ended up going to the vendor for advice (save the comments), the code would be more familiar to them and we wouldn't waste time rewriting, etc.  I'm pretty sure I made it clear as well that I'm not that experienced.  Why?  I'm not a DBA or a developer.

    bmg002 - the ROWLOCK and NOLOCKs came from the vendor.  I assumed they know best, since they understand the big picture of how the tables are used by other processes.  I actually went a completely different direction, creating a separate stored procedure (with no cursor) to do the lookup/insert/update as needed on a case-by-case basis.

  • Just an update....

    SELECT
    @checkpoint_id = MAX(checkpoint_id)
    FROM ssf_widget_checkpoint WITH (NOLOCK)

    So this is the primary issue.  Ironically, we started seeing the vendor's "checkpoint insert job" intermittently fail with constraint errors on the checkpoint_id column.  We believe it is due to the dirty read from the NOLOCK.  This is not the code I provided, but, as I mentioned, the code mine was based on.  So we'll be opening a trouble ticket with them for that, so their next patch doesn't overwrite our fix.  🙂

    And to clarify a few things...

    - Yes, SQL 2008 R2

    - #ssf_widget_history_acme is a temporary table that holds data pulled from two different tables. The 'checkpoint_id' column is the primary key shared between those two other tables and is unique.

    - @checkpoint_id must come from ssf_widget_checkpoint, as the next available value in that column. This means I cannot just make up whatever number I want; the software has to be aware of it as a marker/placeholder. It's not actually an IDENTITY column - I only meant to imply that it has similar characteristics. There are other calls attempting to grab the next available checkpoint_id, which is why the vendor's job is failing now because of the NOLOCK (we believe).

    Thank you all again for your feedback.  It has been very helpful.
