Working with Queues in SQL Server

Question

Post reply

Working with Queues in SQL Server

Ahmad Alkilani

SSC Enthusiast

Points: 127
More actions
April 3, 2010 at 12:43 pm

#386639

Comments posted to this topic are about the item Working with Queues in SQL Server

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply

Adrian Hains Ten Centuries Points: 1322 More actions · Answer 1

FYI - the OUTPUT clause is available in sql2k5 as well.

-a

Paul White SSC Guru Points: 150467 More actions · Answer 2

Setup:

-- For the demo

USE tempdb;

GO

IF OBJECT_ID(N'dbo.WorkQueue', N'U')

IS NOT NULL

DROP TABLE dbo.WorkQueue;

IF OBJECT_ID(N'dbo.GetWorkItem', N'P')

IS NOT NULL

DROP PROCEDURE dbo.GetWorkItem;

IF OBJECT_ID(N'dbo.SetWorkItemComplete', N'P')

IS NOT NULL

DROP PROCEDURE dbo.SetWorkItemComplete;

-- Create table

CREATE TABLE dbo.WorkQueue

(

work_queue_id INTEGER IDENTITY(1,1) NOT NULL

PRIMARY KEY,

name VARCHAR(255) NOT NULL,

status_id TINYINT NOT NULL

CHECK (status_id IN (0, 1, 2)),

status_desc AS

ISNULL

(

CASE status_id

WHEN 0 THEN 'Unprocessed'

WHEN 1 THEN 'In progress'

WHEN 2 THEN 'Processed'

ELSE NULL

END,

'')

);

-- Add data to table

INSERT dbo.WorkQueue

(name, status_id)

SELECT 'A', 0 UNION ALL

SELECT 'B', 0 UNION ALL

SELECT 'C', 0 UNION ALL

SELECT 'D', 0 UNION ALL

SELECT 'E', 0 UNION ALL

SELECT 'F', 0 UNION ALL

SELECT 'G', 0 UNION ALL

SELECT 'H', 0 UNION ALL

SELECT 'I', 0 UNION ALL

SELECT 'J', 0 UNION ALL

SELECT 'K', 0;

GO

-- Index to help find unprocessed items

CREATE NONCLUSTERED INDEX nc1

ON dbo.WorkQueue (status_id);

Procedures:

CREATE PROCEDURE dbo.GetWorkItem

@WorkQueueID INTEGER OUTPUT

AS

BEGIN

SET NOCOUNT ON;

-- Allocate a work item and set it to in-progress

-- TODO: Add lock timeout and error handler

UPDATE Q

SET status_id = 1,

@WorkQueueID = work_queue_id

FROM (

SELECT TOP (1)

Q2.work_queue_id,

Q2.status_id

FROM dbo.WorkQueue Q2 WITH (UPDLOCK, READPAST, ROWLOCK)

WHERE Q2.status_id = 0

ORDER BY

Q2.work_queue_id ASC

) Q;

RETURN CASE WHEN @WorkQueueID IS NULL THEN 999 ELSE 0 END

END;

GO

CREATE PROCEDURE dbo.SetWorkItemComplete

@WorkQueueID INTEGER

AS

BEGIN

SET NOCOUNT ON;

-- Set the work item's status to complete

-- TODO: Add lock timeout and error handler

UPDATE Q

SET status_id = 2

FROM WorkQueue Q

WHERE Q.work_queue_id = @WorkQueueID

AND Q.status_id = 1;

RETURN CASE WHEN @@ROWCOUNT = 0 THEN 999 ELSE 0 END;

END;

GO

Test:

DECLARE @WorkItemID INTEGER,

@ReturnCode INTEGER;

-- Get a work item

EXECUTE @ReturnCode =

dbo.GetWorkItem @WorkItemID OUTPUT;

-- Show the record

SELECT rc = @ReturnCode,

*

FROM dbo.WorkQueue

WHERE work_queue_id = @WorkItemID;

-- Simulate processing delay

WAITFOR DELAY '00:00:03';

-- Set the work item to complete

EXECUTE @ReturnCode =

dbo.SetWorkItemComplete @WorkItemID;

-- Show the record

SELECT rc = @ReturnCode,

*

FROM dbo.WorkQueue

WHERE work_queue_id = @WorkItemID;

GO

DROP TABLE dbo.WorkQueue;

DROP PROCEDURE dbo.GetWorkItem;

DROP PROCEDURE dbo.SetWorkItemComplete;

Results:

rc work_queue_id name status_id status_desc

0 1 A 1 In progress

...

0 1 A 2 Processed

Allocation routine query plan: (compute scalars removed)

Paul White
All articles available on SQL.kiwi
@SQL_Kiwi

HansVE Mr or Mrs. 500 Points: 524 More actions · Answer 3

Much better article at http://rusanu.com/2010/03/26/using-tables-as-queues/

Bill Nicolich SSCommitted Points: 1865 More actions · Answer 4

Hey folks. I recently built a system where users share a pool of work items and I too use a table-based queue. So I feel kind of familiar with some of the issues.

It seems to me that the proposed solution handles the initial concurrency problem well. There's no chance that two people will get the same open item in the queue. So it solves what I call the reservation problem. You want users to be able to reserve or lock an item.

However, another problem is getting disconnected. It's the "orphaned work item" issue. The calling app, if a web application, for instance - may drop the connection. Then a row may stay marked as processed/reserved and stay that way. In order to handle these situations, you either have to ensure that it doesn't happen up-front (sometimes not possible) - or have a way to identify rows not being worked by a user.

I've used the table-based method of reserving rows for a work queue with a web application front-end. There are occasions when a session end or the user gets disconnected. I handle the getting disconnected problem by programming the application "session end" event. In the event, I "un-mark" the row in question using the application. That works most of the time - but there are occasions when that event isn't called - and I then have to un-mark those rows another way.

Now, when that happens, there's the problem of identifying rows that have been marked as reserved but are not being handled by a current user. Due to this problem, I chose to do something slightly different than adding a flag column and using the query hints.

I created an additional history table for item reservations. I add a row when a work item is reserved by a user. It includes a session id, item id, expiration flag, start date and end date. The bi-temporal approach allows me to automatically "unlock" items within a specified period of time - such as a day.

It's actually nice to keep a history table for item reservations - and in some cases, it can help satisfy "chain of custody" requirements. It is also a way to handle the orphaned work item issue.

I welcome your thoughts and critiques. Thanks.

Bill Nicolich: www.SQLFave.com.
Daily tweet of what's new and interesting: AppendNow

noeld SSC Guru Points: 96590 More actions · Answer 5

One more comment: The use of READPAST should be carefully considered for two reasons:

1. If you need to preserve the order you must exclude it

2. If you are using READ_COMMITED_SNAPSHOT ON at the database level and The transaction isolation level of the session is READ COMMITTED.

* Noel

Paul White SSC Guru Points: 150467 More actions · Answer 6

Bill Nicolich (4/5/2010)
...when that happens, there's the problem of identifying rows that have been marked as reserved but are not being handled by a current user.

A history table works well. For simple implementations, adding columns to the queue table to store the process/user identifier that owns the resource, and the time it acquired it, can work well (enough) too.

Paul White
All articles available on SQL.kiwi
@SQL_Kiwi

Bill Nicolich SSCommitted Points: 1865 More actions · Answer 7

Paul White NZ (4/5/2010)
Bill Nicolich (4/5/2010)
...when that happens, there's the problem of identifying rows that have been marked as reserved but are not being handled by a current user.
A history table works well. For simple implementations, adding columns to the queue table to store the process/user identifier that owns the resource, and the time it acquired it, can work well (enough) too.

When you match up a work item with a user, there are some considerations. What constitutes a user? In my case, a user was a unique session id from the web server. It wasn't a web farm situation. A user could have IE open with one session and FireFox open with another session - and these in my case should be handled separately. So if I just had just a "user id" matching things up in the table, I'd run into problems. So, that's why I added both a session id and a user id in the table.

Bill Nicolich: www.SQLFave.com.
Daily tweet of what's new and interesting: AppendNow

Eric Strickland Valued Member Points: 65 More actions · Answer 8

For the Demo,

If doing this in 2008 and you just use the update "with (READPAST), If you do not move the Delay to before the commit Both sessions will pick up the same record.

Eric.

peleg SSCrazy Points: 2611 More actions · Answer 9

when i first looked at the title articles, i was sure it talkes about queues, that work in FIFO - first in -> first out.

but when i saw the row

SELECT TOP(1)

@work_queue_id = work_queue_id, @name = name

FROM

work_queue

WHERE processed_flag = 0

it dosen't ensure me that i will get the first available row from the table!

maybe it will be better to work with the Queue that the sql provides you, rather then using a table, and getting into situation where you find your self with looked rows and so on like Bill said.

Ahmad Alkilani SSC Enthusiast Points: 127 More actions · Answer 10

The solution simply illustrates in general what you can do to get around the problem. You can most certainly add ordering, user audits and resolutions to orphaned items. Then again, if you have a necessity to solve these issues in your application you might want to consider going with SQL Server Service Broker 🙂

SQLRNNR SSC Guru Points: 281336 More actions · Answer 11

Nice Article, and better discussion.

Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events

Steve Jones - SSC Editor SSC Guru Points: 742868 More actions · Answer 12

We used to do something similar for sending email here on the site. we had multiple machines that sent, and needed to read a database table to get their list of "things to send".

Our method of handling this was to update the next xx rows with a client machine name. This effectively removed the rows from the Q. Once the update was complete, the client could query for rows that matched it's name, but had not been processed.

Upon the completion of processing, the client updated each row to show it as processed and then started over with another update of new rows in the Q

Paul White SSC Guru Points: 150467 More actions · Answer 13

Steve Jones - Editor (4/5/2010)
Our method of handling this was to update the next xx rows with a client machine name. This effectively removed the rows from the Q. Once the update was complete, the client could query for rows that matched it's name, but had not been processed.

This is exactly the pattern used at two places I have worked, and the basis for the code posted earlier in the thread. Of course all that was before the days of Service Broker, but with careful coding and consideration for the concurrency and locking issues involved, it can be made to work very well.

Paul White
All articles available on SQL.kiwi
@SQL_Kiwi

tony.turner Right there with Babe Points: 783 More actions · Answer 14

In my view the code fragment from Ahmad is missing a number of considerations that should be taken into account for a practical SQL table-based queueing systems:

* Polling SQL table based queues, as opposed to Service Broker, scales poorly. If you have multiple queues, or you need to keep the queue latency down, then Service Broker (or probably MSMQ) is by far your better answer. Assume 20 queues, and you want to keep the average enqueue time to process time below 50 milliseconds, you may be issuing 200 individual SQL requests a second, incurring a substantial constant overhead

* In a practical environment you should probably only COMMIT the change to the queue entry once you have fully processed the entry. Your SQL 2008 code specifically precludes doing this. If you COMMIT before processing the queue entry you need some way of tracking orphaned entries, which would include updating the queue entry with process identification, date/time, etc before COMMIT and a separately executing process to detect of orphaned entries and requeue them. This also raises the whole question of how long you can afford an orphaned entry to be orphaned for before it is detected, potential duplicate processing reversed, and finally processed

* Probably the most vexed issue in queue processing is poison messages; messages that, for whatever reason, your code cannot handle. You cannot realistically keep the status of the queue entry in the queue entry itself because you may ROLLBACK rather than COMMIT and process the entry repeatedlly. In my view the simplest answer is two connections; one is transacted and handles the queue, one is untransacted and keeps track of the status of the queue entry (number of retries, etc). In retrieving a queue entry you should retrieve both the queue entry and its associated status entry, and only retrieve entries which have not been re-processed excessively

* The READPAST locking hint may skip messages not only if the row is locked (busy being INSERTed) but also if the index is busy. The nett effect is that you may retrieve no messages even though there are messages in the queue, or you may process messages out-of-sequence. An associated issue is that performance may degrade and your apparent queue become less stable as the number of handled queue messages increases. A better approach may be to DELETE the queue entry on processing completion rather than mark it as processed, and keep a separate history log

So, in summary. Simple, yes. Realistic, not close