Working with Queues in SQL Server

  • Ahmad Alkilani

    SSC Enthusiast

    Points: 127

    Comments posted to this topic are about the item Working with Queues in SQL Server

  • Adrian Hains

    Ten Centuries

    Points: 1322

    FYI - the OUTPUT clause is available in sql2k5 as well.

    -a

  • Paul White

    SSC Guru

    Points: 150442

    Setup:

    -- For the demo

    USE tempdb;

    GO

    IF OBJECT_ID(N'dbo.WorkQueue', N'U')

    IS NOT NULL

    DROP TABLE dbo.WorkQueue;

    IF OBJECT_ID(N'dbo.GetWorkItem', N'P')

    IS NOT NULL

    DROP PROCEDURE dbo.GetWorkItem;

    IF OBJECT_ID(N'dbo.SetWorkItemComplete', N'P')

    IS NOT NULL

    DROP PROCEDURE dbo.SetWorkItemComplete;

    -- Create table

    CREATE TABLE dbo.WorkQueue

    (

    work_queue_id INTEGER IDENTITY(1,1) NOT NULL

    PRIMARY KEY,

    name VARCHAR(255) NOT NULL,

    status_id TINYINT NOT NULL

    CHECK (status_id IN (0, 1, 2)),

    status_desc AS

    ISNULL

    (

    CASE status_id

    WHEN 0 THEN 'Unprocessed'

    WHEN 1 THEN 'In progress'

    WHEN 2 THEN 'Processed'

    ELSE NULL

    END,

    '')

    );

    -- Add data to table

    INSERT dbo.WorkQueue

    (name, status_id)

    SELECT 'A', 0 UNION ALL

    SELECT 'B', 0 UNION ALL

    SELECT 'C', 0 UNION ALL

    SELECT 'D', 0 UNION ALL

    SELECT 'E', 0 UNION ALL

    SELECT 'F', 0 UNION ALL

    SELECT 'G', 0 UNION ALL

    SELECT 'H', 0 UNION ALL

    SELECT 'I', 0 UNION ALL

    SELECT 'J', 0 UNION ALL

    SELECT 'K', 0;

    GO

    -- Index to help find unprocessed items

    CREATE NONCLUSTERED INDEX nc1

    ON dbo.WorkQueue (status_id);

    Procedures:

    CREATE PROCEDURE dbo.GetWorkItem

    @WorkQueueID INTEGER OUTPUT

    AS

    BEGIN

    SET NOCOUNT ON;

    -- Allocate a work item and set it to in-progress

    -- TODO: Add lock timeout and error handler

    UPDATE Q

    SET status_id = 1,

    @WorkQueueID = work_queue_id

    FROM (

    SELECT TOP (1)

    Q2.work_queue_id,

    Q2.status_id

    FROM dbo.WorkQueue Q2 WITH (UPDLOCK, READPAST, ROWLOCK)

    WHERE Q2.status_id = 0

    ORDER BY

    Q2.work_queue_id ASC

    ) Q;

    RETURN CASE WHEN @WorkQueueID IS NULL THEN 999 ELSE 0 END

    END;

    GO

    CREATE PROCEDURE dbo.SetWorkItemComplete

    @WorkQueueID INTEGER

    AS

    BEGIN

    SET NOCOUNT ON;

    -- Set the work item's status to complete

    -- TODO: Add lock timeout and error handler

    UPDATE Q

    SET status_id = 2

    FROM WorkQueue Q

    WHERE Q.work_queue_id = @WorkQueueID

    AND Q.status_id = 1;

    RETURN CASE WHEN @@ROWCOUNT = 0 THEN 999 ELSE 0 END;

    END;

    GO

    Test:

    DECLARE @WorkItemID INTEGER,

    @ReturnCode INTEGER;

    -- Get a work item

    EXECUTE @ReturnCode =

    dbo.GetWorkItem @WorkItemID OUTPUT;

    -- Show the record

    SELECT rc = @ReturnCode,

    *

    FROM dbo.WorkQueue

    WHERE work_queue_id = @WorkItemID;

    -- Simulate processing delay

    WAITFOR DELAY '00:00:03';

    -- Set the work item to complete

    EXECUTE @ReturnCode =

    dbo.SetWorkItemComplete @WorkItemID;

    -- Show the record

    SELECT rc = @ReturnCode,

    *

    FROM dbo.WorkQueue

    WHERE work_queue_id = @WorkItemID;

    GO

    DROP TABLE dbo.WorkQueue;

    DROP PROCEDURE dbo.GetWorkItem;

    DROP PROCEDURE dbo.SetWorkItemComplete;

    Results:

    rc work_queue_id name status_id status_desc

    0 1 A 1 In progress

    ...

    0 1 A 2 Processed

    Allocation routine query plan: (compute scalars removed)

  • HansVE

    Mr or Mrs. 500

    Points: 524

    Much better article at http://rusanu.com/2010/03/26/using-tables-as-queues/

  • Bill Nicolich

    SSCommitted

    Points: 1865

    Hey folks. I recently built a system where users share a pool of work items and I too use a table-based queue. So I feel kind of familiar with some of the issues.

    It seems to me that the proposed solution handles the initial concurrency problem well. There's no chance that two people will get the same open item in the queue. So it solves what I call the reservation problem. You want users to be able to reserve or lock an item.

    However, another problem is getting disconnected. It's the "orphaned work item" issue. The calling app, if a web application, for instance - may drop the connection. Then a row may stay marked as processed/reserved and stay that way. In order to handle these situations, you either have to ensure that it doesn't happen up-front (sometimes not possible) - or have a way to identify rows not being worked by a user.

    I've used the table-based method of reserving rows for a work queue with a web application front-end. There are occasions when a session end or the user gets disconnected. I handle the getting disconnected problem by programming the application "session end" event. In the event, I "un-mark" the row in question using the application. That works most of the time - but there are occasions when that event isn't called - and I then have to un-mark those rows another way.

    Now, when that happens, there's the problem of identifying rows that have been marked as reserved but are not being handled by a current user. Due to this problem, I chose to do something slightly different than adding a flag column and using the query hints.

    I created an additional history table for item reservations. I add a row when a work item is reserved by a user. It includes a session id, item id, expiration flag, start date and end date. The bi-temporal approach allows me to automatically "unlock" items within a specified period of time - such as a day.

    It's actually nice to keep a history table for item reservations - and in some cases, it can help satisfy "chain of custody" requirements. It is also a way to handle the orphaned work item issue.

    I welcome your thoughts and critiques. Thanks.

    Bill Nicolich: www.SQLFave.com.
    Daily tweet of what's new and interesting: AppendNow

  • noeld

    SSC Guru

    Points: 96590

    One more comment: The use of READPAST should be carefully considered for two reasons:

    1. If you need to preserve the order you must exclude it

    2. If you are using READ_COMMITED_SNAPSHOT ON at the database level and The transaction isolation level of the session is READ COMMITTED.


    * Noel

  • Paul White

    SSC Guru

    Points: 150442

    Bill Nicolich (4/5/2010)


    ...when that happens, there's the problem of identifying rows that have been marked as reserved but are not being handled by a current user.

    A history table works well. For simple implementations, adding columns to the queue table to store the process/user identifier that owns the resource, and the time it acquired it, can work well (enough) too.

  • Bill Nicolich

    SSCommitted

    Points: 1865

    Paul White NZ (4/5/2010)


    Bill Nicolich (4/5/2010)


    ...when that happens, there's the problem of identifying rows that have been marked as reserved but are not being handled by a current user.

    A history table works well. For simple implementations, adding columns to the queue table to store the process/user identifier that owns the resource, and the time it acquired it, can work well (enough) too.

    When you match up a work item with a user, there are some considerations. What constitutes a user? In my case, a user was a unique session id from the web server. It wasn't a web farm situation. A user could have IE open with one session and FireFox open with another session - and these in my case should be handled separately. So if I just had just a "user id" matching things up in the table, I'd run into problems. So, that's why I added both a session id and a user id in the table.

    Bill Nicolich: www.SQLFave.com.
    Daily tweet of what's new and interesting: AppendNow

  • Eric Strickland

    Valued Member

    Points: 65

    For the Demo,

    If doing this in 2008 and you just use the update "with (READPAST), If you do not move the Delay to before the commit Both sessions will pick up the same record.

    Eric.

  • peleg

    SSCrazy

    Points: 2611

    when i first looked at the title articles, i was sure it talkes about queues, that work in FIFO - first in -> first out.

    but when i saw the row

    SELECT TOP(1)

    @work_queue_id = work_queue_id, @name = name

    FROM

    work_queue

    WHERE processed_flag = 0

    it dosen't ensure me that i will get the first available row from the table!

    maybe it will be better to work with the Queue that the sql provides you, rather then using a table, and getting into situation where you find your self with looked rows and so on like Bill said.

  • Ahmad Alkilani

    SSC Enthusiast

    Points: 127

    The solution simply illustrates in general what you can do to get around the problem. You can most certainly add ordering, user audits and resolutions to orphaned items. Then again, if you have a necessity to solve these issues in your application you might want to consider going with SQL Server Service Broker 🙂

  • SQLRNNR

    SSC Guru

    Points: 281210

    Nice Article, and better discussion.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • Steve Jones - SSC Editor

    SSC Guru

    Points: 715889

    We used to do something similar for sending email here on the site. we had multiple machines that sent, and needed to read a database table to get their list of "things to send".

    Our method of handling this was to update the next xx rows with a client machine name. This effectively removed the rows from the Q. Once the update was complete, the client could query for rows that matched it's name, but had not been processed.

    Upon the completion of processing, the client updated each row to show it as processed and then started over with another update of new rows in the Q

  • Paul White

    SSC Guru

    Points: 150442

    Steve Jones - Editor (4/5/2010)


    Our method of handling this was to update the next xx rows with a client machine name. This effectively removed the rows from the Q. Once the update was complete, the client could query for rows that matched it's name, but had not been processed.

    This is exactly the pattern used at two places I have worked, and the basis for the code posted earlier in the thread. Of course all that was before the days of Service Broker, but with careful coding and consideration for the concurrency and locking issues involved, it can be made to work very well.

  • tony.turner

    Right there with Babe

    Points: 783

    In my view the code fragment from Ahmad is missing a number of considerations that should be taken into account for a practical SQL table-based queueing systems:

    * Polling SQL table based queues, as opposed to Service Broker, scales poorly. If you have multiple queues, or you need to keep the queue latency down, then Service Broker (or probably MSMQ) is by far your better answer. Assume 20 queues, and you want to keep the average enqueue time to process time below 50 milliseconds, you may be issuing 200 individual SQL requests a second, incurring a substantial constant overhead

    * In a practical environment you should probably only COMMIT the change to the queue entry once you have fully processed the entry. Your SQL 2008 code specifically precludes doing this. If you COMMIT before processing the queue entry you need some way of tracking orphaned entries, which would include updating the queue entry with process identification, date/time, etc before COMMIT and a separately executing process to detect of orphaned entries and requeue them. This also raises the whole question of how long you can afford an orphaned entry to be orphaned for before it is detected, potential duplicate processing reversed, and finally processed

    * Probably the most vexed issue in queue processing is poison messages; messages that, for whatever reason, your code cannot handle. You cannot realistically keep the status of the queue entry in the queue entry itself because you may ROLLBACK rather than COMMIT and process the entry repeatedlly. In my view the simplest answer is two connections; one is transacted and handles the queue, one is untransacted and keeps track of the status of the queue entry (number of retries, etc). In retrieving a queue entry you should retrieve both the queue entry and its associated status entry, and only retrieve entries which have not been re-processed excessively

    * The READPAST locking hint may skip messages not only if the row is locked (busy being INSERTed) but also if the index is busy. The nett effect is that you may retrieve no messages even though there are messages in the queue, or you may process messages out-of-sequence. An associated issue is that performance may degrade and your apparent queue become less stable as the number of handled queue messages increases. A better approach may be to DELETE the queue entry on processing completion rather than mark it as processed, and keep a separate history log

    So, in summary. Simple, yes. Realistic, not close

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply