Row-By-Row Processing Without a Cursor

  • Hey everyone, I haven't posted in a while, so I thought I'd chime in on a good one like this. I've used a lot of the suggestions in the past with varying results. It's really hard to design business rules with triggers when there are so many inputs and outputs. That becomes more apparent to me every day in the health industry.

    I got an idea reading through this discussion... I want to see what y'all think about it:

    CREATE TRIGGER tr1 ON Books AFTER INSERT AS
    -- Grab a single GUID up front; putting NEWID() directly in the SET clause
    -- would generate a different value for every row in the batch
    DECLARE @batchUID uniqueidentifier --batchUID being a uniqueidentifier field
    SET @batchUID = NEWID()
    UPDATE Books
    SET batchUID = @batchUID
    FROM Books
    INNER JOIN Inserted
    ON Inserted.bookCode = Books.bookCode
    GO

    CREATE TRIGGER tr2 ON Books AFTER INSERT AS
    DECLARE @batchUID uniqueidentifier
    SET @batchUID = (SELECT TOP 1 @batchUID FROM Inserted)
    EXEC usp_Process @batchUID=@batchUID
    GO

    Of course this only works if (a) you can alter the table schema and (b) you can alter the stored procedures to handle optional parameters. The only other exceptions I can see are if the SP uses a non-batch API or if you need to generate some type of customized ID field.

  • From just looking at it, it seems a bit tricky. Are you planning on recursive triggers? Also, I'm not sure what happens in usp_Process, but it looks like you are only getting the first batchUID that is inserted and calling the proc with that (what if more than one row is inserted?). I may be overlooking something because I only gave it a quick look, but those are my thoughts. Does it work like you expect?

  • Correct... no recursive triggers (which is the default for SQL Server). The first trigger assigns the same GUID to every record in the insert, so the author of usp_Process has to code against batchUID instead of bookCode.

  • What if there is more than one record inserted?

    --This code stamps all the records inserted into Books with the same uniqueidentifier
    CREATE TRIGGER tr1 ON Books AFTER INSERT AS
    -- As above, a single GUID is captured first so every row in the insert
    -- receives the same batchUID
    DECLARE @batchUID uniqueidentifier --batchUID being a uniqueidentifier field
    SET @batchUID = NEWID()
    UPDATE Books
    SET batchUID = @batchUID
    FROM Books
    INNER JOIN Inserted
    ON Inserted.bookCode = Books.bookCode
    GO

  • What about trigger tr2?

  • I just realized I made a boo-boo in tr2... not a big one:

    SET @batchUID =
    (
        SELECT TOP 1 batchUID
        FROM Books
        INNER JOIN Inserted
        ON Inserted.bookCode = Books.bookCode
    )

    Anyhow... since each inserted record gets the same batchUID in tr1, any query outside the trigger structure can batch-process the records using batchUID instead of bookCode. You may want to create a nonclustered index on batchUID. The SP might look something like this:

    CREATE PROCEDURE usp_BookSetDefaults
        @bookCode int = null,
        @batchUID uniqueidentifier = null
    AS
    UPDATE Books
    SET value1 = BookPrimer.defaultValue
    FROM Books
    INNER JOIN BookPrimer
    ON BookPrimer.bookTypeID = Books.bookTypeID
    WHERE Books.bookCode = ISNULL(@bookCode, Books.bookCode)
    AND
    (
        @batchUID IS NULL
        OR Books.batchUID = @batchUID
    )
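
    For the nonclustered index mentioned above, something like this would do (the index name is just illustrative):

    CREATE NONCLUSTERED INDEX IX_Books_batchUID ON Books (batchUID)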

  • I gotcha. I didn't realize you were going to have an outside process perform some work on the dataset. BTW... are you using SQL 2000 or 2005? If you are using 2005, something neat you could try is to create a Service Broker app: pass something into a message, such as your NEWID(), and have it run some functionality in the background (asynchronously) while your trigger takes on other inserts, etc. Does that make sense?
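
    For anyone curious, a minimal Service Broker sketch on SQL 2005 might look like the following. This is only a sketch: all object names are hypothetical, it uses the DEFAULT contract and message type for brevity, it assumes Service Broker is enabled on the database, and @batchUID is assumed to be in scope inside the trigger.

    -- One-time setup (hypothetical names)
    CREATE QUEUE BatchQueue;
    CREATE SERVICE BatchService ON QUEUE BatchQueue ([DEFAULT]);
    GO

    -- Inside the trigger: queue the batch GUID instead of calling usp_Process directly
    DECLARE @dialog uniqueidentifier;
    BEGIN DIALOG CONVERSATION @dialog
        FROM SERVICE BatchService
        TO SERVICE 'BatchService'
        WITH ENCRYPTION = OFF;
    SEND ON CONVERSATION @dialog (CAST(@batchUID AS varbinary(16)));
    GO

    -- Later, a background reader (e.g. an activation procedure) drains the queue
    DECLARE @msg varbinary(max);
    RECEIVE TOP (1) @msg = message_body FROM BatchQueue;
    IF @msg IS NOT NULL
        EXEC usp_Process @batchUID = CAST(@msg AS uniqueidentifier);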

  • I am wary of anything that involves string manipulation because strings are immutable.

    If you do something like @YourString = @YourString + 'Some text', what really happens is that @YourString is destroyed and then recreated to hold @YourString + 'Some text'.

    That is why there is a StringBuilder class in .NET.

    Instead of constantly destroying and recreating strings, it appends into a mutable internal buffer and only builds the final string once at the end.

    I have had situations where cursors were quickest and easiest to code, but ultimately not the most efficient way of doing things.

    For small sets of data the hit from a cursor is negligible, but once you get into the millions of records a cursor is a horror.

  • SQL Server 2000. Triggers are not (as far as I know) threads waiting for an event, so I do not believe the asynchronous aspect is a problem. The overall theme here is business rule chaining: A-->B-->[C-->D],E... meaning that there is a precedence architecture.

    Once again, the cursor, create-string, and loop-through-temp-table solutions all have their merits. I am tempted to try out the create-string solution; at the least it is very clever. If anyone decides to use a cursor, the following options will cause you a lot fewer headaches: LOCAL FORWARD_ONLY STATIC READ_ONLY.
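
    For illustration, a rough sketch of that create-string approach on SQL 2000 might look like this (a sketch only: it assumes it runs inside the trigger, where Inserted is visible, and that the concatenated commands fit in varchar(8000)):

    DECLARE @sql varchar(8000)
    SET @sql = ''

    -- Build one EXEC per inserted row using the
    -- SELECT @var = @var + ... concatenation trick
    SELECT @sql = @sql + 'EXEC usp_Process @bookCode='
                + CAST(bookCode AS varchar(10)) + '; '
    FROM Inserted

    -- Run all the generated calls as one batch
    EXEC (@sql)

    And a cursor declared with the options above would read: DECLARE c CURSOR LOCAL FORWARD_ONLY STATIC READ_ONLY FOR SELECT bookCode FROM Inserted.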

    On a side note, as far as the chaining aspect goes, be sure to design your row-locking mechanisms carefully. When first experimenting with the NOLOCK, ROWLOCK, and TABLOCK options, I updated a table in one trigger; in a second trigger I tried updating another table, reading the first with a NOLOCK. The second query never waited for the transaction in the first trigger to complete.

    Back to the batch topic: I found that spending the time to rethink a linear process into a batch process can make it an order of magnitude more efficient. If 100 records are inserted and the trigger causes 3 triggers to fire, each of which causes 3 more to fire, a batch process fires 1 + 3 + (3 * 3) = 13 triggers, as opposed to 100 * (1 + 3 + (3 * 3)) = 1300 when processing row by row. When the queries inside the triggers reference thousands of records, the efficiency starts degrading pretty quickly. I hope you get the idea.

  • Yes, but you won't be able to insert another record until that trigger has finished firing, so that is a consideration. I suggested using a Service Broker program so that you can add your NEWID() to a table and later have a program tend to those records. These operations fire in the background, while still allowing you to enter more records into your table, fire the trigger, and add a new message to an SB queue. Does that make sense?

    In my chaining solutions, the 2nd query will only run after the 1st has finished, so row locks are not an issue for me because it is not within one large transaction. Works like a charm.


  • Solve a cursor problem with the next most misused feature of SQL Server (triggers)????

    If you really, really, absolutely have to do something a row at a time, use a loop with SET ROWCOUNT 1 and a SELECT ... WHERE key > @lastkey, as sketched below. In my experience it has always executed faster than a cursor.

    The only thing that may count against this approach is ghosting, but that can be solved with temp tables.
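
    A minimal sketch of that pattern, assuming a Books table whose key bookCode is a positive, increasing int (names are illustrative):

    DECLARE @lastKey int, @bookCode int
    SET @lastKey = 0

    SET ROWCOUNT 1              -- every following SELECT returns at most one row
    WHILE 1 = 1
    BEGIN
        SELECT @bookCode = bookCode
        FROM Books
        WHERE bookCode > @lastKey
        ORDER BY bookCode

        IF @@ROWCOUNT = 0 BREAK -- no rows left

        -- ...process the single row identified by @bookCode here...

        SET @lastKey = @bookCode
    END
    SET ROWCOUNT 0              -- restore normal row counts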


  • If you wrap your insert inside a transaction, then you would need an SB. The SB would add a lot of complexity in my mind. But it has been said before, and I'm sure this is not the last time: there's more than one way to skin a cat.

    As far as row locks go...it's a conversation for another day.

  • It really depends on scope whether you use a cursor, a loop, or whatever.

    I chose a chain of triggers because I have several tables that are each accessed in several different ways. The behavior of actions on a table should always be the same, so why replicate code for each access point? In a real-world environment, where business rules change sometimes daily and sometimes even hourly (execs can be very indecisive at times), the one place for editing makes you look really good.

  • No doubt cursors have their place. However, in my opinion, a lot of users use cursors because that is the only way they know to loop through records one by one. I have also seen developers write code where an SP or function is called from inside a cursor, which in turn opens another cursor because that code also requires row-by-row processing.

    From my experience, a good amount of the daily work presently done with cursors can be written using a TABLE variable. A table variable will not only do the job cursors are being used for, it can also save time, since small table variables usually involve little or no disk I/O (strictly speaking they are allocated in tempdb, not purely in memory). Plus one can go forward and backward in the result set and at the same time use it like any other table for performing joins.

    TABLE variables are scoped to the batch, hence memory management may not be an issue either. At least I have not seen any memory issues so far.

    I would like to know if any of you have had a bad experience from using TABLE variables in place of cursors.
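
    As a concrete illustration, here is a minimal sketch of replacing a cursor with a table variable (table and column names are illustrative):

    -- Stage the keys to process, numbered by an IDENTITY column
    DECLARE @work TABLE (rowID int IDENTITY(1,1) PRIMARY KEY, bookCode int)
    INSERT INTO @work (bookCode)
    SELECT bookCode FROM Books

    -- Walk the rows by rowID instead of opening a cursor
    DECLARE @i int, @max int, @bookCode int
    SELECT @i = 1, @max = MAX(rowID) FROM @work
    WHILE @i <= @max
    BEGIN
        SELECT @bookCode = bookCode FROM @work WHERE rowID = @i
        -- ...per-row work here; @i can also move backward if needed...
        SET @i = @i + 1
    END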


    Thanks

    Anik

