Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 

Dealing With Changing Data

By Christoffer Hedgate,

A fairly common problem or need you encounter is managing changes of data stored in a table. Sometimes you simply need to be able to see that data has changed, other times you need to know when it changed, how many times it's changed etc. This article will discuss a couple of ways of handling issues like this.

Requirements and functionality

First of all you must establish what the requirements are. Which data type to use and how to use it depends on the requirements and how you want to implement the necessary functionality. The simplest, and probably most common requirement is to be able to check if the data has changed since you last checked (i.e. when the data was read into an application). With this functionality you are able to avoid having different users overwriting each other's data changes. Script 1 below show's an example of how things can go wrong if this is not handled.
-- Script 1
USE tempdb
GO

SET NOCOUNT ON

IF EXISTS (SELECT 1 FROM sysobjects WHERE [name] = 'Messages')
BEGIN
 DROP TABLE Messages
END

CREATE TABLE Messages
(
 MessageId int NOT NULL
 , Title varchar(50) NOT NULL
 , Body varchar(255) NOT NULL
)
GO

INSERT INTO Messages (MessageId, Title, Body) VALUES (1, 'A message', 'A message text')

-- User A reads the data
DECLARE @bodyA varchar(255)
SELECT @bodyA = Body FROM Messages WHERE MessageId = 1

-- User B also reads the data
DECLARE @bodyB varchar(255)
SELECT @bodyB = Body FROM Messages WHERE MessageId = 1

-- After changing the text User A now updates with his new data
SET @bodyA = 'Another text, written by User A'
UPDATE Messages SET Body = @bodyA

-- After this User B saves his changed version of the data,
-- thereby overwriting the change that User A did earlier
SET @bodyB = 'A third text, written by User B'
UPDATE Messages SET Body = @bodyB

SELECT * FROM Messages

SET NOCOUNT OFF

Another way of handling this requirement is of course to use pessimistic concurrency. This means that data is locked as soon as it is read and no other user is able to read that data until the locking user releases it. The SQL Server way of doing this is of course to use Update Locks instead of Shared Locks when reading data. Normally though you don't want to lock data like this until you are actually going to update it so that other users don't have to wait for you to release the data, especially if it is not even sure that you will change the data. The alternative is of course called optimistic concurrency, which means that all users can read the same data. To avoid overwriting each other's changes you instead check before updating that the data in the database haven't been changed by someone else since you read it. There are several ways to do this and the simplest way is also the most automatic one, but we'll wait with that for last. Instead we'll start with some other variants.

Last changed

A common solution is to use a column of some date type. This column will hold the date and time when the data on a row was last changed (i.e. either INSERTed or UPDATEd). The actual updating of this datetime info is either handled 'manually' by the application that updates the database, or automatically by triggers in the database. Script 2 shows an example of how to hande it with triggers which is normally the way I would handle it, mainly because as a DBA I want to make sure that the data in the database is always correct and up to date. If the application is left to handle this there is always a possibility that some other application (present or future) will not handle it and then you will not have correct data.
-- Script 2: Demonstration of trigger for handling last updated column
USE tempdb
GO

IF EXISTS (SELECT 1 FROM sysobjects WHERE [name] = 'Messages')
BEGIN
 DROP TABLE Messages
END

CREATE TABLE Messages 
(
 MessageId int NOT NULL CONSTRAINT pkMessages PRIMARY KEY CLUSTERED
 , Title varchar(50) NOT NULL
 , Body varchar(255) NOT NULL
 , ChangeDate datetime NOT NULL DEFAULT GETDATE()
)
GO

CREATE TRIGGER TrU_Messages
ON Messages
FOR UPDATE
AS
BEGIN
 UPDATE Messages SET ChangeDate = GETDATE()
 WHERE MessageID IN (SELECT MessageID FROM INSERTED)
END
GO

SET NOCOUNT ON

INSERT INTO Messages (MessageId, Title, Body) VALUES (1, 'A message', 'A message text')

SELECT * FROM Messages

-- Wait 5 seconds to see the difference in ChangeDate
WAITFOR DELAY '00:00:05'

UPDATE Messages SET Body = 'Another text' WHERE MessageId = 1

SELECT * FROM Messages

SET NOCOUNT OFF

Timestamp

With the trigger in Script 2 we are now able to check when a row was last changed. With this information we can now implement functionality to make sure that User B is not able to overwrite the data that User A has already changed. Again, the functionality can either be placed in the application where the changes are done, or we can do it in the database with either a trigger or a procedure. Since I like to restrict users from having direct access to tables and instead giving them procedures to use Script 3 shows an example of a procedure that updates the data, if it has not changed since it was read by the user.
-- Script 3
USE tempdb
GO

-- Make sure that the table Messages and trigger TrU_Messages
-- created in Script 2 does still exist, otherwise recreate them

IF EXISTS (SELECT 1 FROM sysobjects WHERE [name] = 'MessagesUpd')
BEGIN
 DROP PROCEDURE MessagesUpd
END
GO

CREATE PROCEDURE MessagesUpd
 @messageId int
 , @body varchar(255)
 , @readDate datetime
AS
BEGIN
 SET NOCOUNT ON

 UPDATE Messages SET Body = @body
 WHERE MessageId = @messageId
 AND ChangeDate = @readDate

 IF (@@rowcount = 0)
 BEGIN
  PRINT 'The data had already been changed since read date.'
 END

 SET NOCOUNT OFF
END
GO

DECLARE @body varchar(255), @readDate datetime, @otherDate datetime
DECLARE @messageId int
SET @messageId = 1

SELECT @body = Body, @readDate = ChangeDate 
FROM Messages 
WHERE MessageId = @messageId

SET @otherDate = DATEADD(d, -1, @readDate)
SET @body = 'A changed text'

-- This will not work because the date is not the same as the one stored
-- in table
EXEC MessagesUpd @messageId, @body, @otherDate

SELECT * FROM Messages

-- This works
EXEC MessagesUpd @messageId, @body, @readDate

SELECT * FROM Messages

This technique is often called timestamp. By always retrieving the current timestamp that the data has when SELECTing it, it is possible to check whether the data has changed prior to updating it by comparing the timestamps, thereby applying optimistic concurrency. If the row has a different timestamp than it had when it was retrieved the user will have to SELECT the new version of the data (including the new timestamp that is now the current one), decide whether or not he still wants to do the change and if so try and update again.

A simpler stamp

If the requirements don't state that you actually need to see the specific date and time when a row was changed but just need to see that it has been changed it might be better to use an integer as a kind of simple stamp. When the data is first INSERTed into the database it gets a default value (0 or 1 or whatever) and then the trigger adds 1 to this each time the row is changed. In order to update a row the user must supply the value that the stamp had when the data was read, and then the procedure checks that the row does not have a different stamp before updating it. The effect will be the same as using a datetime stamp, but instead of being able to see when the row was changed it is instead possible to see how many times it has been changed. It is only a matter of what the requirements are. Also, note that an integer takes less space than a datetime and it is impossible to have a 'collision' (two updates at almost the same time could perhaps get the same timestamp), but you can of course run out of values with the integer. I doubt that the same row will ever be changed more than 2,147,483,647 times though.

Rowversion

Very often though you don't need to know when the data was changed nor how many times it's been changed. The only requirement is to make sure that the users don't overwrite eachother's data changes. The simplest way to handle this is to use the functionality that SQL Server has for this requirement, namely the rowversion data type. Unfortunately this data type was earlier called timestamp in SQL Server, which is unfortunate since it is not at all the same thing as the timestamp that is described in the ANSI-SQL standard. The reason it was called timestamp is because it existed in SQL Server before timestamp existed in the standard. For backwards compatibility reasons it is still possible to use the name timestamp for it, but just as most other things left only for backwards compatibility in SQL Server it is recommended to use the new name rowversion because the definition of timestamp might be changed in a future version of SQL Server to adhere to the ANSI definition.
 
A rowversion in SQL Server is actually a binary(8) (unless it allows NULLs, in which case it is a varbinary(8)), which is automatically changed when a row is INSERTed or UPDATEd. A table can only have one rowversion column and it is not possible to update it manually, it is always handled by SQL Server automatically.
 
Just like when using a datetime or integer stamp you use it by retrieving the current value when SELECTing data and then comparing it against the one stored in the database when you want to update the row. With a column of data type rowversion in a table it is also possible to use SQL Server's built-in functionality for optimistic concurrency. This means that you can SELECT data to a recordset in a client application and handle it locally using for example a data grid. When you're finished changing the data in the grid you connect the recordset to the database again and use Update or UpdateBatch (to update several rows at once). SQL server will now automatically check if the row(s) have been changed between the time when they where read and the update by comparing the values in the rowversion column. It is actually possible to use automatic optimistic concurrency control without a rowversion column but in that case SQL Server needs to compare all the current data of all columns to the old values they had when the recordset was originally opened. In that case the recordset must always store both the original data and the new data that has been changed locally.

Summary

Managing changes to data in SQL Server is as you can see not very complex, but it can be very important. In the multi-user web applications of today it is not normally possible to lock data while working with it so instead you must use optimistic concurrency. If it is not handled users risk losing work and time, and as you know, time is money.
Total article views: 9422 | Views in the last 30 days: 11
 
Related Articles
FORUM

Update and Select

Update and Select

FORUM

Select Update Query Help

Select Update Query Help

FORUM

Update Script to select

Update Script to select

FORUM

select case Insert, Update

Have problem using select case with update and delete

FORUM

Need to change a message string

debugging a procedure and I need to change the error log message

Tags
miscellaneous    
performance tuning    
programming    
sql server 7    
strategies    
t-sql    
 
Contribute

Join the most active online SQL Server Community

SQL knowledge, delivered daily, free:

Email address:  

You make SSC a better place

As a member of SQLServerCentral, you get free access to loads of fresh content: thousands of articles and SQL scripts, a library of free eBooks, a weekly database news roundup, a great Q & A platform… And it’s our huge, buzzing community of SQL Server Professionals that makes it such a success.

Join us!

Steve Jones
Editor, SQLServerCentral.com

Already a member? Jump in:

Email address:   Password:   Remember me: Forgotten your password?
Steve Jones