Fastes way to increment an id column

Question

Post reply

Fastes way to increment an id column

Florian Reischl

SSC-Dedicated

Points: 37301
More actions
February 25, 2009 at 1:21 am

#384010

Comments posted to this topic are about the item Fastes way to increment an id column

Viewing 15 posts - 1 through 15 (of 26 total)

You must be logged in to reply to this topic. Login to reply

Carlo Romagnano SSC-Insane Points: 22596 More actions · Answer 1

Carlo Romagnano

SSC-Insane

Points: 22596

April 2, 2009 at 1:19 am

#970408

???????????????

Florian Reischl SSC-Dedicated Points: 37301 More actions · Answer 2

Hi Carlo

You have been faster than me... I wanted to explain directly after it becomes published.

Sure since SQL Server 2005 this is not needed any more because of the ROW_NUBER() function!

The reason for the script is handling of unique identities within SQL Server 7.0/2000 in bulk operations. It is no big clue but I often get the question from some of my developers how to handle a incremental id without a loop. (In SQL Server 2000)

For SQL Server 2005 this makes no sense! I will update the description to explain.

Hope that explains...

Greets

Flo

Carlo Romagnano SSC-Insane Points: 22596 More actions · Answer 3

-- what about IDENTITY??

SET NOCOUNT ON

-- Just a simple test table

IF (OBJECT_ID('tempdb..#my_tab') IS NOT NULL)

DROP TABLE #my_tab

CREATE TABLE #my_tab (id INT NOT NULL IDENTITY(1000,1), any_date DATETIME)

-- Create 100000 test values

PRINT 'Test data'

PRINT '-> Creating'

DECLARE @i INT

SET @i = 0

WHILE (@i < 100000)

BEGIN

INSERT INTO #my_tab (any_date) VALUES (DATEADD(MINUTE, @i * -1, GETDATE()))

SET @i = @i + 1

END

PRINT '<- done'

Florian Reischl SSC-Dedicated Points: 37301 More actions · Answer 4

Hi Carlo

E.g. in our business case we are not able to use IDENTITY due to problems with replication. If you can/could use IDENTITY columns in SSE 2000 you don't need it. But if you use replications there some cases which make problems with them because of the seed.

Greets

Flo

Carlo Romagnano SSC-Insane Points: 22596 More actions · Answer 5

-- try to use this for replication:

set identity_insert my_table on

Florian Reischl SSC-Dedicated Points: 37301 More actions · Answer 6

Hi Carlo

Thanks for feedback! There is another reason for me that I cannot use IDENTITIES, our C# OR-Mapper cannot handle them (not my design...).

I'm no DBA, I'm a developer so I cannot explain all the problems with replication. My DBA told me and Noeld confirmed it here:

http://www.sqlservercentral.com/Forums/Topic669595-338-1.aspx

I will update the description of the script to ensure that everybody understands the reason of it!

Greets

Flo

Jeff Moden SSC Guru Points: 1003851 More actions · Answer 7

I agree with Flo... if, for some reason, you cannot use IDENTITY and you're stuck with SQL Server 2000 or less which prevents the use of ROW_NUMBER and the like, then the "quirky" update Flo used is absolutely the fastest method available. That method is good for a lot of things like running totals, ranking, and some incredibly odd groupings.

There are, however, some fairly strict rules to using it if you actually want to control what it does. Those rules and some of the methods are covered by an article (currently being rewritten) that will reappear at the following URL sometime in the next month or so.

http://www.sqlservercentral.com/articles/Advanced+Querying/61716/

If I could also make a wee bit of a suggestion... if you want to build a test table for something, there's simply no reason to go through the complexity of building a While Loop nor the wait while it works... not even in SQL Server 2000. To wit, the While Loop in the article takes a little over 7 seconds to build the test table. Using a little set-based programming and a little knowledge of the functions available in SQL Server, it can be done with much less complexity and it executes in about 250 milli-seconds.

[font="Courier New"]--===== Conditionally drop the test table for test reruns

IF OBJECT_ID('tempdb..#my_tab') IS NOT NULL

DROP TABLE #my_tab

--===== Build the test table and populate it with data on the fly

SET STATISTICS TIME ON

SELECT TOP 100000

CAST(NULL AS INT) AS ID,

DATEADD(mi,-ABS(CHECKSUM(NEWID())%100000),GETDATE()) AS Any_Date

INTO #My_Tab

FROM Master.dbo.SysColumns t1 CROSS JOIN Master.dbo.SysColumns t2

SET STATISTICS TIME OFF[/font]

I mention this because, frequently, testing against larger numbers of rows is ensential to ensuring performance in the face of scalability. To test against a million rows, the While Loop takes a whopping big 80+ seconds on my box. Worse yet, it also practices you in RBAR thinking instead of set based thinking. The simple set based solution only takes 2.319 seconds to produce the same million row test table. Folks are much more likely to test repeatedly if they can setup test data that quickly.

Heh... yeah... I know... Someone once told me that this was easy for me but not for them. That's my point... if they keep practicing RBAR methods instead of learning how to do the very high speed set based methods, it will never be easy for them. 😛

Anyway... good tip, Flo. Keep 'em coming. 🙂

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Florian Reischl SSC-Dedicated Points: 37301 More actions · Answer 8

Thanks Jeff!

It's not a big deal but if I get one Dollar for every time I showed somebody to remove a cursor I would be a rich man...

Best wishes

Flo

Jeff Moden SSC Guru Points: 1003851 More actions · Answer 9

Florian Reischl (5/17/2009)
Thanks Jeff!
It's not a big deal but if I get one Dollar for every time I showed somebody to remove a cursor I would be a rich man...
Best wishes
Flo

Heh... OK... you owe me a Dollar. 😀

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Florian Reischl SSC-Dedicated Points: 37301 More actions · Answer 10

Why? :unsure:

I use this since about 5 years. Long time before I joined SSC 🙂

Jeff Moden SSC Guru Points: 1003851 More actions · Answer 11

Florian Reischl (5/17/2009)
Why? :unsure:
I use this since about 5 years. Long time before I joined SSC 🙂

Heh... ok... then 5 bucks for posting a cursor solution to a set based problem .:-P

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Paul White SSC Guru Points: 150467 More actions · Answer 12

Hey Jeff & Flo,

Just a couple of things... 🙂

1. This isn't a true 'quirky' update unless there's a clustered index to determine the order of the ID assignment. It is true that the current SQL Server behaviour is always to return rows from heaps in insertion order (at least with a new heap that has never had any update or delete activity on it), but still.

2. This method runs in less than half the time (2005 only):

SET NOCOUNT ON

--===== Conditionally drop the test table for test reruns

IF OBJECT_ID(N'tempdb..#my_tab', N'U') IS NOT NULL DROP TABLE #my_tab;

IF OBJECT_ID(N'tempdb..#my_tab2', N'U') IS NOT NULL DROP TABLE #my_tab2;

--===== Build the test table and populate it with data on the fly

SELECT TOP 100000

ISNULL(CAST(0 AS INT), 0) AS ID,

DATEADD(mi,-ABS(CHECKSUM(NEWID())%100000),GETDATE()) AS Any_Date

INTO #My_Tab

FROM Master.dbo.SysColumns t1 CROSS JOIN Master.dbo.SysColumns t2

DECLARE @now DATETIME;

SET @now = GETDATE();

-- Bulk logged *temporary* IDENTITY addition

SELECT id = IDENTITY(INT, 1000, 1),

any_date

INTO #my_tab2

FROM #my_tab WITH (TABLOCK);

-- Fast clear of the original table

TRUNCATE TABLE #my_tab;

-- Metadata only change

ALTER TABLE #my_tab2 SWITCH TO #my_tab;

PRINT 'Duration for 100000 values: ' + CONVERT(VARCHAR(20), DATEDIFF(MILLISECOND, @now, GETDATE()));

-- Tidy up

DROP TABLE

#my_tab, #my_tab2;

Paul

Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi

Paul White SSC Guru Points: 150467 More actions · Answer 13

In case you guys think I'm cheating by changing the nullability of the id column, this version is a true replacement, and is just as fast:

SET NOCOUNT ON;

-- Conditional drops

IF OBJECT_ID(N'tempdb..#my_tab', N'U') IS NOT NULL DROP TABLE #my_tab;

IF OBJECT_ID(N'tempdb..#my_tab', N'U') IS NOT NULL DROP TABLE #my_tab2;

IF OBJECT_ID(N'tempdb..#my_tab', N'U') IS NOT NULL DROP TABLE #my_tab3;

-- Sample data

SELECT TOP (100000)

CAST(NULL AS INT) AS ID,

DATEADD(mi,-ABS(CHECKSUM(NEWID())%100000),GETDATE()) AS Any_Date

INTO #My_Tab

FROM master.sys.all_columns T1, master.sys.all_columns T2, master.sys.all_columns T3;

-- Start timing

DECLARE @now DATETIME;

SET @now = GETDATE();

-- Bulk logged identity add (id column is NOT NULL)

SELECT id = IDENTITY(INT, 1000, 1),

any_date

INTO #my_tab2

FROM #my_tab WITH (TABLOCK);

-- Intermediate table for nullability change

CREATE TABLE #my_tab3 (id INT NOT NULL, any_date DATETIME NULL);

-- Lose the IDENTITY attribute

ALTER TABLE #my_tab2 SWITCH TO #my_tab3;

-- Metadata change to id column nullability

ALTER TABLE #my_tab3 ALTER COLUMN id INT NULL;

-- Fast clear original table

TRUNCATE TABLE #my_tab;

-- Metadata switch data in

ALTER TABLE #my_tab3 SWITCH TO #my_tab;

-- Show duration

PRINT 'Duration for 100000 values: ' + CONVERT(VARCHAR(20), DATEDIFF(MILLISECOND, @now, GETDATE()));

-- Tidy up

DROP TABLE

#my_tab, #my_tab2, #my_tab3;

Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi

Paul White SSC Guru Points: 150467 More actions · Answer 14

Jeff Moden (5/17/2009)
.. if, for some reason, you cannot use IDENTITY and you're stuck with SQL Server 2000 or less which prevents the use of ROW_NUMBER and the like, then the "quirky" update Flo used is absolutely the fastest method available.

Heh :laugh:

Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi